There appears to be a memory leak in add_batch, which I've been unable to
track down. If I comment out the XS code in PostingsWriter that calls:
kino_PostWriter_add_batch(self, batch, &field_name, doc_num, doc_boost, length_norm);
my indexing process stays at around 45M, versus 100M+ with that call enabled,
when indexing 14k documents.
Unfortunately, I can't work out where kino_PostWriter_add_batch() is defined.
I'm not doing anything too special: subclassing ::Schema with 4 fields, one of
which is non-indexed. There doesn't seem to be a way to periodically flush
the index to disk (and out of memory).
I'm using the current svn tree (0.20-4).
-D
--
<dsully> please describe web 2.0 to me in 2 sentences or less.
<jwb> you make all the content. they keep all the revenue.