-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Jul 15, 2008, at 8:04 PM, Marvin Humphrey wrote:
> Hello, Michael,
>
>> I'm using KinoSearch to develop a search engine for the IRC logs I
>> have browsable on my website. I am currently using the developer
>> release, version 0.20_051 due to a need for non-score based sorting
>> (sort by date). I am very pleased with KinoSearch so far. For IRC
>> logs it makes the most sense to break on line breaks versus periods
>> for excerpts. This is an easy one line patch[1] in Highlight/
>> Highlighter.pm but it seems a bit overkill to subclass Highlighter
>> for a one line patch to _gen_excerpt.
>>
>> Perhaps it may make sense to have an argument that allows you to
>> specify a character/string to prefer breaking on that defaults to
>> '\.'. Allowing regex syntax would be most flexible, and I think
>> most people overriding the default wouldn't mind escaping things,
>> but you are the author ;). I'm really not sure what other than
>> periods and new-lines someone may want to break on, perhaps tabs,
>> so would definitely understand should you decide this is a feature
>> request that wouldn't be used widely enough to merit inclusion.
>
> Sorry for the delayed response.
Not a problem.
> I've been working on Highlighter lately, and I think the answer is
> to define a couple methods that the user can override:
> find_sentence_boundaries() and raw_excerpt(). If you're interested
> in discussing API design for those, we should take up the matter on
> the KinoSearch mailing list:
> <http://www.rectangular.com/mailman/listinfo/kinosearch/>
Indeed, this makes sense and allows for even more specialization.
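To make the idea concrete, here is a rough sketch of boundary-aware excerpting. It is written in Python for illustration only and is not KinoSearch's API: the function names merely echo the two methods mentioned above, and the signatures, the max_len parameter, and the pattern parameter are all assumptions.

```python
import re


def find_boundaries(text, pattern=r"\."):
    """Return offsets where an excerpt may start or end.

    Hypothetical helper: offset 0 plus the end of every match of `pattern`.
    """
    return [0] + [m.end() for m in re.finditer(pattern, text)]


def raw_excerpt(text, hit_pos, max_len=80, pattern=r"\."):
    """Cut an excerpt around hit_pos, preferring to break on boundaries."""
    bounds = find_boundaries(text, pattern)
    window_start = max(0, hit_pos - max_len // 2)

    # Start at the closest boundary before the hit that is still in the window.
    before = [b for b in bounds if window_start <= b <= hit_pos]
    start = before[-1] if before else window_start

    # End at the last boundary after the hit that fits within max_len.
    end = start + max_len
    after = [b for b in bounds if hit_pos < b <= end]
    if after:
        end = after[-1]

    return text[start:end].strip()


# Usage: an IRC log breaks on newlines instead of periods.
log = ("<mike> anyone tried KinoSearch?\n"
       "<marvin> yes. it sorts by date now\n"
       "<mike> nice")
hit = log.index("sorts")
irc_excerpt = raw_excerpt(log, hit, max_len=40, pattern=r"\n")
default_excerpt = raw_excerpt(log, hit, max_len=40)
```

With the newline pattern the excerpt is a whole IRC line, while the default pattern starts mid-line after "yes.", which is the behaviour I was trying to avoid.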
Re the mailing list: yes, my mistake. I'm too used to small modules
that don't have lists. I subscribed a couple of days ago and am CCing
this reply there.
Mike
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
iD8DBQFIffZl0Qbp4bPZvesRAryNAKCFNBWbExBIxMpJc9ZqlIdrbOGgbACeNw69
QFU7BwgJGgoscT6k+7sVH1E=
=DkXV
-----END PGP SIGNATURE-----
_______________________________________________
KinoSearch mailing list
KinoSearch@rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch