Mailing List Archive

Would docvalues be loaded into jvm?
hi
I know that data is written to disk in a column-store layout if I
enable doc values for a field.
But I don't understand why sorting with doc values doesn't increase the load
on the JVM. Whatever the sorting algorithm, the data would have to be loaded
into the JVM to be sorted. This should put a high load on the JVM when I sort
the whole index, but in fact I see no change in the JVM. How does Lucene sort
with doc values? Can the sort algorithm work directly on the (mmapped) file?



--
View this message in context: http://lucene.472066.n3.nabble.com/Would-docvalues-be-loaded-into-jvm-tp4340644.html
Sent from the Lucene - General mailing list archive at Nabble.com.
RE: Would docvalues be loaded into jvm? [ In reply to ]
Hi

It works directly off the mmapped files. The data is not fully loaded onto the heap; only some small control structures are allocated there. During sorting, the TopDocsCollector uses the memory-mapped structures to decompress and look up the sort values.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
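
The point about working off the mmapped file with only small structures on the heap can be illustrated with a minimal JDK-only sketch. This is not Lucene's code: the file layout (one 8-byte long per document, with the doc ID as the position), the class name MmapTopK and its methods are assumptions made for illustration. It keeps only k entries on the heap while every value is read straight from the mapping.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical illustration, not part of Lucene.
final class MmapTopK {

    /** Writes one 8-byte long per document; the doc ID is the position in the file. */
    static void writeColumn(Path file, long[] values) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        Files.write(file, buf.array());
    }

    /** Returns the doc IDs of the k smallest values (k >= 1), in ascending value order. */
    static int[] topK(Path file, int k) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to 2 GB, which is enough for this sketch.
            MappedByteBuffer column = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            int docCount = (int) (ch.size() / Long.BYTES);

            // The only heap structure: a max-heap of at most k {doc, value} pairs.
            // Its head is the worst entry among the current best k.
            PriorityQueue<long[]> queue =
                    new PriorityQueue<>(k, (a, b) -> Long.compare(b[1], a[1]));

            for (int doc = 0; doc < docCount; doc++) {
                // Absolute read from the mapping; the column is never copied to an array.
                long value = column.getLong(doc * Long.BYTES);
                if (queue.size() < k) {
                    queue.add(new long[] {doc, value});
                } else if (value < queue.peek()[1]) {
                    queue.poll();
                    queue.add(new long[] {doc, value});
                }
            }

            // Drain from worst to best, filling the result from the back.
            int[] result = new int[queue.size()];
            for (int i = result.length - 1; i >= 0; i--) {
                result[i] = (int) queue.poll()[0];
            }
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("column", ".bin");
        writeColumn(file, new long[] {50, 10, 40, 20, 30});
        System.out.println(Arrays.toString(topK(file, 3)));
    }
}
```

Heap usage in this sketch depends on k, not on the number of documents; the pages of the file that get touched live in the operating system's page cache rather than in the Java heap.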

> -----Original Message-----
> From: wangqinghuan [mailto:1095193290@qq.com]
> Sent: Thursday, June 15, 2017 4:36 AM
> To: general@lucene.apache.org
> Subject: Would docvalues be loaded into jvm?
Re: RE: Would docvalues be loaded into jvm? [ In reply to ]
hi
Is there any design document on this aspect (sorting algorithm off mmap)?



---Original---
From: "Uwe Schindler [via Lucene]"<ml+s472066n4340659h57@n3.nabble.com>
Date: 2017/6/15 14:39:30
To: "wangqinghuan"<1095193290@qq.com>;
Subject: RE: Would docvalues be loaded into jvm?


RE: RE: Would docvalues be loaded into jvm? [ In reply to ]
Hi,

There is no design document about that. Lucene has used mmap for all index files for a long time; DocValues is just another implementation on top of that. Basically, it uses IndexInput's methods to access the underlying data, which is memory-mapped if you are on a 64-bit platform. For DocValues, positional reads are also available. There is not much that is specific to DocValues: it is just a file format that supports column-based access with positional reads. The mmap implementation is separate from this and sits a bit lower in Lucene's I/O layer. Sorting is just one use case of DocValues, and it does not sort directly on the mmapped files; there are several abstractions in between (which the HotSpot optimizer, of course, removes).

Some information (a bit older, but still valid) is here: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
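
The layering described in this reply (an I/O layer offering positional reads, a file format on top of it, and sorting as one consumer) can be sketched with the JDK alone. Everything here is hypothetical and is not Lucene's API or file format: the names PositionalInput, MappedInput, DeltaColumn and ColumnDemo are invented, and the format (an 8-byte minimum followed by one unsigned 16-bit delta per document) is only a stand-in for a compressed column.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical lowest layer: absolute reads at a byte position, no cursor. */
interface PositionalInput {
    long readLong(long pos);
    short readShort(long pos);
}

/** Hypothetical I/O layer: serves the positional reads from a memory-mapped file. */
final class MappedInput implements PositionalInput {
    private final ByteBuffer map;

    MappedInput(Path file) throws IOException {
        this.map = mapFile(file);
    }

    private static ByteBuffer mapFile(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public long readLong(long pos) { return map.getLong((int) pos); }
    public short readShort(long pos) { return map.getShort((int) pos); }
}

/** Hypothetical format layer: [min: 8 bytes][delta per doc: 2 bytes]. Knows nothing about mmap. */
final class DeltaColumn {
    private static final int HEADER = Long.BYTES;
    private final PositionalInput in;
    private final long min;

    DeltaColumn(PositionalInput in) {
        this.in = in;
        this.min = in.readLong(0);
    }

    /** Decodes one value on the fly: the minimum plus an unsigned 16-bit delta. */
    long get(int doc) {
        return min + Short.toUnsignedInt(in.readShort(HEADER + (long) doc * Short.BYTES));
    }

    static void write(Path file, long[] values) throws IOException {
        long min = Long.MAX_VALUE;
        for (long v : values) {
            min = Math.min(min, v);
        }
        ByteBuffer buf = ByteBuffer.allocate(HEADER + values.length * Short.BYTES);
        buf.putLong(min);
        for (long v : values) {
            long delta = v - min;
            if (delta > 0xFFFF) {
                throw new IllegalArgumentException("value range too large for this sketch");
            }
            buf.putShort((short) delta);
        }
        Files.write(file, buf.array());
    }
}

final class ColumnDemo {
    /** Use-case layer: compares doc IDs through the column API only. */
    static Comparator<Integer> byValue(DeltaColumn column) {
        return (a, b) -> Long.compare(column.get(a), column.get(b));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("delta", ".bin");
        DeltaColumn.write(file, new long[] {1_000_300L, 1_000_100L, 1_000_200L});
        DeltaColumn column = new DeltaColumn(new MappedInput(file));
        Integer[] docs = {0, 1, 2};
        Arrays.sort(docs, byValue(column));
        System.out.println(Arrays.toString(docs));
    }
}
```

The comparator never sees a buffer or a file. Swapping MappedInput for another PositionalInput implementation would not change the format or the sorting code, which is the kind of separation the reply describes.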

> -----Original Message-----
> From: wangqinghuan [mailto:1095193290@qq.com]
> Sent: Thursday, June 15, 2017 10:41 AM
> To: general@lucene.apache.org
> Subject: Re: RE: Would docvalues be loaded into jvm?
Re: RE: RE: Would docvalues be loaded into jvm? [ In reply to ]
Does "HotSpot" refer to the Java virtual machine?



---Original---
From: "Uwe Schindler [via Lucene]"<ml+s472066n4340678h46@n3.nabble.com>
Date: 2017/6/15 17:03:46
To: "wangqinghuan"<1095193290@qq.com>;
Subject: RE: RE: Would docvalues be loaded into jvm?


RE: RE: RE: Would docvalues be loaded into jvm? [ In reply to ]
Yes.

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: wangqinghuan [mailto:1095193290@qq.com]
> Sent: Thursday, June 15, 2017 12:21 PM
> To: general@lucene.apache.org
> Subject: Re: RE: RE: Would docvalues be loaded into jvm?