Mailing List Archive

Seg fault on trunk r3834
What will I need to send to help you debug it?

I have built a small index and written a simple searcher.
It seg faults when calling $hits->next();
$hits->total_hits returns a count of 4 just fine.

I'm currently on CentOS 5 Linux with perl v5.8.8,
playing with trunk r3834.

-Dan

_______________________________________________
KinoSearch mailing list
KinoSearch@rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
Re: Seg fault on trunk r3834
On Sep 6, 2008, at 11:36 PM, Dan wrote:

> What will I need to send to help you debug it?

Ideally, one of these:

* A self-contained script, patch, test case, or small tarball
containing code which isolates and demonstrates the problem.
* The output of "valgrind perl test_search.pl".

If neither of those is easy for you to produce, send some code
fragments to the list and we'll see if anything jumps out.

> I have built a small index and written a simple searcher.
> It seg faults when calling $hits->next();
> $hits->total_hits returns a count of 4 just fine.

OK. Clearly $hits->next is used all over the test suite. So,
whatever is in your script isn't the kind of thing that the test
suite's generic usage of $hits->next picks up. I would appreciate
your help in finding this problem.

> I'm currently on CentOS 5 Linux with perl v5.8.8,
> playing with trunk r3834.

Obligatory question: svn trunk passes all tests on this box, right?

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


Re: Seg fault on trunk r3834
> * A self-contained script, patch, test case, or small tarball
> containing code which isolates and demonstrates the problem.
----------- My Schema ----------
package EVDB::KinoSearch::VenueDeDup;
use base qw( KinoSearch::Schema );

use EVDB::KinoSearch::TEXT;
use EVDB::KinoSearch::SID;
use KinoSearch::Analysis::PolyAnalyzer;

our %fields = (
    city_id     => 'text',
    override_id => 'text',
    svid        => 'text',
    venue_name  => 'text',
    address     => 'text',
    bag         => 'text',
);

sub analyzer {
    return KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
}
----------- Schema end ---------------

------- building the index----------
my $invindexer = KinoSearch::InvIndexer->new(
    invindex => EVDB::KinoSearch::VenueDeDup->clobber('/usr1/tmp/invindex'),
);
# do a few times ...
my $doc = KinoSearch::Doc->new(
    fields => {
        venue_name  => $venue_name,
        city_id     => $city_id,
        svid        => $svid,
        override_id => $override_id,
        address     => $address,
        bag         => $bag,
    }
);
if ($override_id) {
    $doc->set_boost(.9);
}
$invindexer->add_doc($doc);
# ......
$invindexer->finish();

----------- END indexer --------

--------- Searcher ------
my $searcher = KinoSearch::Searcher->new(
    invindex => EVDB::KinoSearch::VenueDeDup->read('/usr1/tmp/invindex'),
);

my $query_parser = KinoSearch::QueryParser->new(
    schema => EVDB::KinoSearch::VenueDeDup->new,
    fields => [ 'bag' ],
);
my $query = $query_parser->parse("A String");
my $hits  = $searcher->search( query => $query, num_wanted => 10 );
#my $hit_count = $hits->total_hits;    ## <<--- works
#print "Hits:$hit_count\n";
while ( my $hit = $hits->next() ) {    # <<-- Segfaults
    #print join("\t",$hit->get_score, $hit->{title},
    #    $hit->{svid},$hit->{override_id},) . "\n";
    #print Dumper($hit);
}



> * The output of "valgrind perl test_search.pl".

[root@newbox test_venue_index]# valgrind perl -I /root/modules/ search_index.pl --venue_name "up"
==414== Memcheck, a memory error detector.
==414== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==414== Using LibVEX rev 1658, a library for dynamic binary translation.
==414== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==414== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==414== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==414== For more details, rerun with: -v
==414==
==414== Invalid read of size 4
==414== at 0x457AB6C: kino_DocReader_fetch_doc (DocReader.c:54)
==414== by 0x4559E5C: kino_SegReader_fetch_doc (DocReader.h:244)
==414== by 0x453F1D3: kino_Searcher_fetch_doc (IndexReader.h:418)
==414== by 0x4564565: kino_Hits_next (Searchable.h:327)
==414== by 0x451EA47: XS_KinoSearch__Search__Hits__next (KinoSearch.xs:19321)
==414== by 0x5E14AC: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==414== by 0x5DA90E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==414== by 0x5800FD: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==414== by 0x80491ED: main (in /usr/bin/perl)
==414== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==414==
==414== Process terminating with default action of signal 11 (SIGSEGV)
==414== Access not within mapped region at address 0x0
==414== at 0x457AB6C: kino_DocReader_fetch_doc (DocReader.c:54)
==414== by 0x4559E5C: kino_SegReader_fetch_doc (DocReader.h:244)
==414== by 0x453F1D3: kino_Searcher_fetch_doc (IndexReader.h:418)
==414== by 0x4564565: kino_Hits_next (Searchable.h:327)
==414== by 0x451EA47: XS_KinoSearch__Search__Hits__next (KinoSearch.xs:19321)
==414== by 0x5E14AC: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==414== by 0x5DA90E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==414== by 0x5800FD: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==414== by 0x80491ED: main (in /usr/bin/perl)
==414==
==414== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 34 from 1)
==414== malloc/free: in use at exit: 2,819,366 bytes in 53,309 blocks.
==414== malloc/free: 111,004 allocs, 57,695 frees, 10,476,317 bytes allocated.
==414== For counts of detected errors, rerun with: -v
==414== searching for pointers to 53,309 not-freed blocks.
==414== checked 3,037,360 bytes.
==414==
==414== LEAK SUMMARY:
==414== definitely lost: 1,185 bytes in 25 blocks.
==414== possibly lost: 158 bytes in 12 blocks.
==414== still reachable: 2,818,023 bytes in 53,272 blocks.
==414== suppressed: 0 bytes in 0 blocks.
==414== Use --leak-check=full to see details of leaked memory.
Segmentation fault


> Obligatory question: svn trunk passes all tests on this box, right?

If you're asking whether "make test" passes all its tests: yes.
If there is something else I can check, I'll be happy to.


Thanks for the help.
I do want to use this for a few tools. I don't want to bother our
Java guy for a Lucene index every time I want to prototype a new
index. Who knows, I may just leave a few in Kino.

-Dan

Re: Seg fault on trunk r3834
On Sep 7, 2008, at 11:39 AM, Dan wrote:

> ==414== Invalid read of size 4
> ==414== at 0x457AB6C: kino_DocReader_fetch_doc (DocReader.c:54)

Thanks to the Valgrind output, this was easy to track down.
DocReader, recently refactored, was crashing when it encountered a
field value of "". Somehow the test suite had managed to avoid
presenting DocReader with such a value up till now.

The segfault occurred at the SvPVX(value_sv) expression in the
following code from xs/KinoSearch/DocReader.c. SvPVX is a macro for
accessing an SV's string pointer directly -- it doesn't check first
whether the SV holds a valid string.

     /* Read the field value. */
     value_len = Kino_InStream_Read_C32(ds_in);
-    value_sv = newSV(value_len);
+    value_sv = newSV((value_len ? value_len : 1));
     Kino_InStream_Read_Bytes(ds_in, SvPVX(value_sv), value_len);

The solution was to guarantee that the SV contains a string by always
providing newSV() with a non-zero length.
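
For readers who don't work with SV internals, the failure mode can be
modeled in plain C. This is only a sketch, not KinoSearch or Perl
source: ToySV, ToyBody, toy_newSV, TOY_SvPVX, toy_new_value_sv and
toy_free are hypothetical stand-ins. It assumes that a zero-length
newSV() leaves the SV without a string body, so that an unchecked
accessor ends up reading through a null pointer. That assumption is
consistent with the "Address 0x0" read in the Valgrind output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an SV head and its string body. */
typedef struct { char *pv; size_t len; } ToyBody;
typedef struct { ToyBody *any; } ToySV;

/* Modeled on newSV(len): the string body exists only when len != 0
 * (an assumption about the real function, see the lead-in). */
static ToySV *toy_newSV(size_t len) {
    ToySV *sv = calloc(1, sizeof *sv);
    if (sv == NULL) abort();
    if (len != 0) {
        sv->any = calloc(1, sizeof *sv->any);
        if (sv->any == NULL) abort();
        sv->any->pv = calloc(len + 1, 1);  /* extra byte for a trailing NUL */
        if (sv->any->pv == NULL) abort();
        sv->any->len = len + 1;
    }
    return sv;
}

/* Modeled on SvPVX: reads through the body without checking that it
 * exists. With a NULL body this is a read at address 0. */
#define TOY_SvPVX(sv) ((sv)->any->pv)

/* The patched allocation: never ask for zero bytes. */
static ToySV *toy_new_value_sv(size_t value_len) {
    return toy_newSV(value_len ? value_len : 1);
}

/* Release a ToySV, with or without a body. */
static void toy_free(ToySV *sv) {
    assert(sv != NULL);
    if (sv->any != NULL) {
        free(sv->any->pv);
        free(sv->any);
    }
    free(sv);
}

/* Returns 1 if an empty field value (length 0) yields a usable string
 * pointer when allocated through the patched path. */
static int toy_empty_value_is_safe(void) {
    ToySV *sv = toy_new_value_sv(0);
    int ok = (sv->any != NULL) && (TOY_SvPVX(sv) != NULL);
    toy_free(sv);
    return ok;
}
```

In this model, toy_newSV(0) returns an SV whose body pointer is NULL,
so TOY_SvPVX on it would fault, while toy_new_value_sv(0) always has a
buffer to hand to the reader even though zero bytes get copied into it.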

Repository revision 3841 should resolve your issue.

Thanks for the report,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


Re: Seg fault on trunk r3834
It works.

Thanks for the help.
-Dan
