Mailing List Archive

Our search engine syntax/semantics
By parsing the search string, we get the benefit of full boolean
search, which is definitely cool. But our current regime has one
severe downside: any search which has a single short word or stopword
or non-existent word in it will automatically fail since we assume an
implicit "AND" between all search terms. The unadorned MySQL "MATCH"
operator does not have this downside: if you search for "the chinese
wall" for example, "the" will be silently ignored and you get the
expected hit.

I am wondering if we can combine the best of both worlds. This should
decrease the number of complaints about short search terms
dramatically, maybe even to the point that we can keep the current
index size.

How about this: every subsequence of the query string which doesn't
contain any +/- boolean operators (see below) is passed to the MATCH
operator as is, which assumes an implicit "give me the best matches
you can find for these terms, the more matching terms the better".
Then we could have two additional operators: + and -. If a word is
preceded by +, it *must* be present; if a word is preceded by -, it
*cannot* be present. That lets us express any complicated query we
can right now, but should result in far fewer failed searches. Does
that seem feasible?
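
To make the proposal concrete, here is a rough sketch of the splitting
step, written in Python purely for illustration. The names parse_query
and ParsedQuery are made up for this example and are not part of the
wiki software. It assumes whitespace-separated words and only separates
the query into +words, -words and the plain runs in between; how each
plain run is actually handed to MATCH is left open.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParsedQuery:
    """Hypothetical container for the three kinds of search terms."""
    required: List[str] = field(default_factory=list)  # words prefixed with +
    excluded: List[str] = field(default_factory=list)  # words prefixed with -
    ranked: List[str] = field(default_factory=list)    # plain runs, kept as is


def parse_query(query: str) -> ParsedQuery:
    """Split a query into +words, -words and runs of unadorned words."""
    parsed = ParsedQuery()
    run: List[str] = []  # current run of words without a +/- operator

    for token in query.split():
        if len(token) > 1 and token[0] in "+-":
            # An operator ends the current plain run, if there is one.
            if run:
                parsed.ranked.append(" ".join(run))
                run = []
            target = parsed.required if token[0] == "+" else parsed.excluded
            target.append(token[1:])
        else:
            # A plain word (or a lone "+" / "-") extends the current run.
            run.append(token)

    if run:
        parsed.ranked.append(" ".join(run))
    return parsed


if __name__ == "__main__":
    print(parse_query("the chinese wall +great -berlin"))
```

With this split, "the chinese wall" stays one run, so a stopword like
"the" would no longer sink the whole search; only the +/- words are
strict.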

Axel
Re: Our search engine syntax/semantics
Is there a way to create a link directly to a Wikipedia search using
the new software? In the old system, I used to link to

http://www.wikipedia.com/search.fcgi?request=Oregon

but that doesn't work anymore (and it shouldn't, since the old
system isn't used anymore).

Linking to searches is very useful in some cases. Google allows this
(http://google.com/search?q=Oregon) and most Wikis do.


--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik
Teknikringen 1e, SE-583 30 Linuxköping, Sweden
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/
Re: Our search engine syntax/semantics
On Tue, 2002-03-12 at 12:49, Lars Aronsson wrote:
> Is there a way to create a link directly to a Wikipedia search using
> the new software? In the old system, I used to link to
>
> http://www.wikipedia.com/search.fcgi?request=Oregon
>
> but that doesn't work anymore (and it shouldn't, since the old
> system isn't used anymore).

Sure:

http://www.wikipedia.com/wiki.phtml?search=Oregon
or
http://www.wikipedia.com/wiki/&search=Oregon

(Is there any reason we do searches with POST rather than GET? You can
always construct the URL manually, but it's nicer to be able to
cut-n-paste.)
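
For anyone generating such links from a script, a minimal sketch in
Python follows. The base URL and the "search" parameter are the ones
shown above; the helper name search_link is made up for this example,
and it assumes the server accepts standard form encoding (spaces sent
as "+") for multi-word queries.

```python
from urllib.parse import urlencode

# Base URL as given in this thread.
SEARCH_BASE = "http://www.wikipedia.com/wiki.phtml"


def search_link(terms: str) -> str:
    """Build a GET-style search link (hypothetical helper)."""
    # urlencode escapes special characters and encodes spaces as "+".
    return SEARCH_BASE + "?" + urlencode({"search": terms})


if __name__ == "__main__":
    print(search_link("Oregon"))
```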

-- brion vibber (brion @ pobox.com)
Re: Our search engine syntax/semantics
Brion L. VIBBER wrote:
> http://www.wikipedia.com/wiki.phtml?search=Oregon

Thanks, this works great!


--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik
Teknikringen 1e, SE-583 30 Linuxköping, Sweden
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/