Mailing List Archive

test code
> From: Brian Goetz [mailto:brian@quiotix.com]
>
> I'd like to see the existing test programs converted into JUnit test
> cases -- I'm willing to do this if someone will tell me how they work
> and what they're supposed to output and how to invoke them.

These are mostly things that I wrote years ago when first developing Lucene,
e.g., to test the index code before the search code was written, etc. They
were mostly never really standalone test programs, but rather things whose
output I would eyeball for correctness (e.g. DocTest), things that I would
use for benchmarking and profiling when exploring different implementations
of things (e.g. IndexTest, TermEnumTest). Some serve both purposes
(AnalysisTest, PriorityQueueTest). Still others are stress tests
(ThreadSafetyTest). If you have questions about a particular one I'd be
glad to tell you what I can remember about it.

I agree that we should build a test suite. This code could provide a
starting point, but some work will be required. Useful tests might be:
- Storage Tests: check that FSDirectory and RAMDirectory can write, read
and seek files of various sizes.
- Analysis Tests: check that various analyzers generate the expected
tokens.
- Index Tests: build an index based on generated or static data, and check
that it contains the information that it should.
- Search Tests: make sure that searches return all the matching documents.

Perhaps different folks can volunteer to write tests for different areas of
the code? I would be happy to do any one of these, but not all four!

In addition it would be good to have some performance tests. We could
either download a reference collection, or use a synthetic document
generator that could efficiently generate streams of terms distributed
according to Zipf's law. A generator-based approach would let you specify,
e.g. average document length. Then we could time analysis, index and search
tasks over reasonably large indexes. This could be done periodically to
make sure that changes don't make things slower, and would be useful for
benchmarking Lucene performance on various platforms. Does anyone have a
good document generator? Or can anyone suggest a good, preferably
plain-text, static test collection?
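
As a rough illustration of the generator idea, here is a minimal sketch
in Java. The class name ZipfTermGenerator, the synthetic "termN"
vocabulary, and the fixed per-document length are all assumptions made
for the example, not existing Lucene code. It draws term ranks with
probability proportional to 1/rank^exponent by binary search over a
precomputed cumulative table.

```java
import java.util.Random;

/** Hypothetical sketch: emits synthetic terms whose ranks follow Zipf's law. */
final class ZipfTermGenerator {
    private final double[] cumulative; // cumulative probability per rank
    private final Random random;

    ZipfTermGenerator(int vocabularySize, double exponent, long seed) {
        if (vocabularySize < 1) {
            throw new IllegalArgumentException("vocabularySize must be >= 1");
        }
        cumulative = new double[vocabularySize];
        double sum = 0.0;
        for (int rank = 1; rank <= vocabularySize; rank++) {
            sum += 1.0 / Math.pow(rank, exponent); // weight of this rank
            cumulative[rank - 1] = sum;
        }
        for (int i = 0; i < vocabularySize; i++) {
            cumulative[i] /= sum; // normalize so the last entry is 1.0
        }
        random = new Random(seed); // fixed seed gives repeatable collections
    }

    /** Returns a zero-based rank; rank 0 is the most frequent term. */
    int nextRank() {
        double u = random.nextDouble();
        int lo = 0;
        int hi = cumulative.length - 1;
        while (lo < hi) { // find the first entry >= u
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    String nextTerm() {
        return "term" + nextRank();
    }

    /** Builds one space-separated document of exactly the given length. */
    String nextDocument(int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(nextTerm());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ZipfTermGenerator generator = new ZipfTermGenerator(1000, 1.0, 42L);
        System.out.println(generator.nextDocument(20));
    }
}
```

To vary the average document length, a caller could draw the length
argument from whatever distribution it likes before each call to
nextDocument. Seeding the generator keeps the collection reproducible
from run to run, which matters when comparing timings.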

Doug

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: test code [ In reply to ]
>These are mostly things that I wrote years ago when first developing Lucene,
>e.g., to test the index code before the search code was written, etc. They
>were mostly never really standalone test programs, but rather things whose
>output I would eyeball for correctness (e.g. DocTest), things that I would
>use for benchmarking and profiling when exploring different implementations
>of things (e.g. IndexTest, TermEnumTest). Some serve both purposes
>(AnalysisTest, PriorityQueueTest). Still others are stress tests
>(ThreadSafetyTest). If you have questions about a particular one I'd be
>glad to tell you what I can remember about it.

Maybe a good start would be to look at those tests and see which you think
you can recast in terms of tests that produce an "it worked / it didn't"
answer.

>I agree that we should build a test suite. This code could provide a
>starting point, but some work will be required. Useful tests might be:
> - Storage Tests: check that FSDirectory and RAMDirectory can write, read
>and seek files of various sizes.
> - Analysis Tests: check that various analyzers generate the expected
>tokens.
> - Index Tests: build an index based on generated or static data, and check
>that it contains the information that it should.
> - Search Tests: make sure that searches return all the matching documents.

The analysis tests could easily be done with JUnit. I'll take on some of
those. The others are mostly functional tests, although you could probably
build a JUnit case that handles those too.
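
To make the "it worked / it didn't" shape concrete, here is a minimal
sketch in plain Java. The tokenize method is a hypothetical stand-in
(lowercase, split on anything that is not a letter or digit), not one
of Lucene's analyzers, and AnalysisCheck and assertTokens are names
invented for the example. In a real JUnit case the hand-rolled check
would presumably become an assertEquals call inside a test method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a pass/fail analysis check. */
final class AnalysisCheck {

    /** Stand-in analyzer (an assumption): lowercases, splits on non-alphanumerics. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        String lowered = text.toLowerCase(Locale.ROOT);
        for (String piece : lowered.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) { // skip the empty piece a leading separator leaves
                tokens.add(piece);
            }
        }
        return tokens;
    }

    /** Fails loudly on a mismatch instead of printing tokens to eyeball. */
    static void assertTokens(String input, String... expected) {
        List<String> actual = tokenize(input);
        List<String> wanted = Arrays.asList(expected);
        if (!actual.equals(wanted)) {
            throw new AssertionError(
                "input \"" + input + "\": expected " + wanted + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        assertTokens("The quick, brown fox.", "the", "quick", "brown", "fox");
        assertTokens("   "); // whitespace only: no tokens expected
        System.out.println("analysis checks passed");
    }
}
```

The point is only the structure: each case pairs an input with the
exact token list it should yield, so a run either passes silently or
names the case that broke.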

I'd also like to see many more unit tests; the PriorityQueueTest is a good
example of this. JUnit has saved our bacon so many times in other projects
that it's really worth the effort. And anyone who commits without running
the unit test suite gets publicly ridiculed, since it's so easy
(just 'ant test-unit').

>In addition it would be good to have some performance tests. We could
>either download a reference collection, or use a synthetic document
>generator that could efficiently generate streams of terms distributed
>according to Zipf's law.

Of course, the correctness tests and performance tests could both run
against the same reference collections.

>Or can anyone suggest a good, preferably
>plain-text, static test collection?

The HTML output from the JavaDoc for JDK 1.3 isn't a bad candidate.


--
Brian Goetz
Quiotix Corporation
brian@quiotix.com Tel: 650-843-1300 Fax: 650-324-8032

http://www.quiotix.com


Re: test code [ In reply to ]
> >Or can anyone suggest a good, preferably
> >plain-text, static test collection?
>
> The HTML output from the JavaDoc for JDK 1.3 isn't a bad candidate.

Or Lucene's Javadoc, so you kill two birds with a single stone.

Otis


Re: test code [ In reply to ]
>Or Lucene's Javadoc, so you kill two birds with a single stone.

That's not static, though.


--
Brian Goetz
Quiotix Corporation
brian@quiotix.com Tel: 650-843-1300 Fax: 650-324-8032

http://www.quiotix.com


Re: test code [ In reply to ]
> >Or Lucene's Javadoc, so you kill two birds with a single stone.
>
> That's not static, though.

Ah, that would be a requirement, right.
It would be a nice demo, though :)

Otis

