
r3749 - in trunk: c_src/KinoSearch/Analysis perl perl/lib
Author: creamyg
Date: 2008-08-23 12:22:27 -0700 (Sat, 23 Aug 2008)
New Revision: 3749

Modified:
trunk/c_src/KinoSearch/Analysis/PolyAnalyzer.bp
trunk/perl/Build.PL
trunk/perl/lib/KinoSearch.pm
Log:
Bump the versions required for Lingua::Stem::Snowball and Lingua::StopWords.
Since L::S::Snowball 0.951 has the DynaLoader symbol exporting code, it's no
longer necessary to grab it from the rectangular.com Subversion repository.
Also, add Hungarian, Romanian, and Turkish to the support list, since
L::S::Snowball 0.951 supports them.


Modified: trunk/c_src/KinoSearch/Analysis/PolyAnalyzer.bp
===================================================================
--- trunk/c_src/KinoSearch/Analysis/PolyAnalyzer.bp 2008-08-23 19:03:24 UTC (rev 3748)
+++ trunk/c_src/KinoSearch/Analysis/PolyAnalyzer.bp 2008-08-23 19:22:27 UTC (rev 3749)
@@ -18,12 +18,15 @@
* es => Spanish,
* fi => Finnish,
* fr => French,
+ * hu => Hungarian,
* it => Italian,
* nl => Dutch,
* no => Norwegian,
* pt => Portuguese,
+ * ro => Romanian,
* ru => Russian,
* sv => Swedish,
+ * tr => Turkish,
*/
class KinoSearch::Analysis::PolyAnalyzer
extends KinoSearch::Analysis::Analyzer {

Modified: trunk/perl/Build.PL
===================================================================
--- trunk/perl/Build.PL 2008-08-23 19:03:24 UTC (rev 3748)
+++ trunk/perl/Build.PL 2008-08-23 19:22:27 UTC (rev 3749)
@@ -40,8 +40,8 @@
dist_version_from => 'lib/KinoSearch.pm',
requires => {
'Compress::Zlib' => 0,
- 'Lingua::Stem::Snowball' => 0.94,
- 'Lingua::StopWords' => 0.06,
+ 'Lingua::Stem::Snowball' => 0.951,
+ 'Lingua::StopWords' => 0.09,
'JSON::XS' => 2.01,
},
build_requires => {

Modified: trunk/perl/lib/KinoSearch.pm
===================================================================
--- trunk/perl/lib/KinoSearch.pm 2008-08-23 19:03:24 UTC (rev 3748)
+++ trunk/perl/lib/KinoSearch.pm 2008-08-23 19:22:27 UTC (rev 3749)
@@ -91,7 +91,7 @@
my ( undef, $language ) = @_;
require Lingua::StopWords;
$language = lc($language);
- if ( $language =~ /^(?:da|de|en|es|fi|fr|it|nl|no|pt|ru|sv)$/ ) {
+ if ( $language =~ /^(?:da|de|en|es|fi|fr|hu|it|nl|no|pt|ru|sv)$/ ) {
my $stoplist
= Lingua::StopWords::getStopWords( $language, 'UTF-8' );
return to_kino($stoplist);
@@ -778,8 +778,8 @@
output strings use Perl's internal Unicode encoding. For use of KinoSearch
with non-Latin-1 material, see L<Encode>.

-KinoSearch provides "native support" for 12 languages, meaning that a stemmer
-and a stoplist are available, and PolyAnalyzer supports them.
+KinoSearch provides "native support" for 15 languages, meaning that
+PolyAnalyzer supports them.

=over

@@ -809,6 +809,10 @@

=item *

+Hungarian
+
+=item *
+
Italian

=item *
@@ -821,6 +825,10 @@

=item *

+Romanian
+
+=item *
+
Russian

=item *
@@ -831,6 +839,10 @@

Swedish

+=item *
+
+Turkish
+
=back

KinoSearch can also be extended to support other languages if you write your

