Matias Pelenur wrote:
> Hi all,
> Where are Search stop-words stored? I have the suspicion that the word
> "gay" may be set as a stop-word by default, because on my MediaWiki
> setup searching for it gives no results, even though it appears on at
> least one page, and searching for any other word on that page does
> return results.
There is a stopword list hard-coded into MySQL. (In 4.0 and up this can
be overridden by server-wide configuration.) MediaWiki includes a copy
of the default stopword list (FulltextStoplist.php) so that it can strip
stopwords out of multiple-word searches. If you search for "the united
nations", it searches only for "united" and "nations"; otherwise the
search for "the" would return no results, and the query as a whole would
match nothing, not even pages containing "united" or "nations". I think
this is only used in MySQL 3 mode, so if you configured for MySQL 4 it
won't be used, and it's up to the list actually in MySQL.
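As a rough illustration of the stopword stripping described above, here
is a minimal sketch in Python. It is not MediaWiki's actual PHP code:
the STOPWORDS set is a tiny hypothetical subset, not the real contents
of FulltextStoplist.php, and strip_stopwords is a made-up name.

```python
# Hypothetical subset of a stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and"}


def strip_stopwords(query):
    """Return the lowercased query terms that are not stopwords."""
    return [word for word in query.lower().split() if word not in STOPWORDS]


# "the" is dropped, so only the remaining terms reach the search engine.
print(strip_stopwords("the united nations"))  # ['united', 'nations']
```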
Also, words appearing in over 50% of the search space will not match;
this particularly affects very small databases, where a single page can
push a word over the threshold.
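To see why small databases are hit hardest, here is a sketch of the 50%
rule in Python. It only models the idea as stated above; the function
name, the whitespace tokenizing and the exact boundary handling are
assumptions, not MySQL's implementation.

```python
def too_common(word, pages, threshold=0.5):
    """True if `word` occurs in more than `threshold` of the pages."""
    hits = sum(1 for text in pages if word in text.lower().split())
    return hits / len(pages) > threshold


# With only three pages, a word on two of them is already "too common".
pages = ["tea and cake", "tea time", "coffee break"]
print(too_common("tea", pages))     # True  (2 of 3 pages)
print(too_common("coffee", pages))  # False (1 of 3 pages)
```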
However your problem is likely the minimum word length limit; I believe
the default is four characters, so "gay" would not be found, nor would
"tea" or "gun" or "war" or "hat".
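A small Python sketch of the effect, assuming the four-character
default mentioned above (the names here are hypothetical, and this is
not how MySQL itself is implemented):

```python
MIN_WORD_LEN = 4  # assumed default minimum word length


def searchable_terms(query, min_len=MIN_WORD_LEN):
    """Return only the query terms long enough to be indexed."""
    return [word for word in query.split() if len(word) >= min_len]


print(searchable_terms("gay rights"))       # ['rights']
print(searchable_terms("tea gun war hat"))  # []
```

Lowering min_len to 3 in this sketch lets the three-letter words
through, which is the effect of lowering the real limit.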
For the MySQL 3 mode we again trim out short words before passing them
to the search engine; you can override this by setting the variable
$wgDBminWordLen in LocalSettings.php. I don't think the check is done in
MySQL 4 mode, since that mode works differently and relies on a more
advanced search mode in the MySQL engine, but I'm not sure offhand.
Either way, you may still need to adjust the limit in MySQL itself; see:
http://dev.mysql.com/doc/mysql/en/Fulltext_Fine-tuning.html

-- brion vibber (brion @ pobox.com)