Sniff the site's IP address, then use the database at the internet
topology website http://netgeo.caida.org/perl/netgeo.cgi to find its
country of origin. (Use the results to populate your own DB, so that
remote lookups decrease as you accumulate IPs.) Note that this will give
you websites hosted in Italy, which is not the same thing as Italian
websites. Unfortunately, unless Italian pages use a different encoding,
picking the language up from the page itself (JavaScript) won't help
much.
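
Roughly, the "populate your own DB" idea could look like the sketch below.
It is only an illustration under assumptions: CountryCache, is_hosted_in
and the lookup callable are hypothetical names, the in-memory dict stands
in for a real DB, and the function that actually queries NetGeo (or any
other IP-to-country source) is left for you to supply, since its response
format is not covered here.

```python
import socket
from typing import Callable, Dict, Optional


class CountryCache:
    """Local IP -> country store that only asks the remote source on a miss.

    `lookup` is a hypothetical callable you provide; it takes an IP string
    and returns a country code such as "IT", or None if unknown.
    """

    def __init__(
        self,
        lookup: Callable[[str], Optional[str]],
        resolve: Callable[[str], str] = socket.gethostbyname,
    ) -> None:
        self._lookup = lookup
        self._resolve = resolve  # may raise socket.gaierror for unknown hosts
        self._by_ip: Dict[str, Optional[str]] = {}  # stand-in for your own DB
        self.remote_calls = 0  # how many times the remote source was queried

    def country_of(self, host: str) -> Optional[str]:
        ip = self._resolve(host)
        if ip not in self._by_ip:
            # Cache miss: query the remote source once and remember the answer.
            self.remote_calls += 1
            self._by_ip[ip] = self._lookup(ip)
        return self._by_ip[ip]


def is_hosted_in(cache: CountryCache, host: str, country_code: str = "IT") -> bool:
    """True if the host's IP maps to the given country (hosting, not language)."""
    return cache.country_of(host) == country_code
```

The spider would call is_hosted_in(cache, host) before fetching a site and
skip it on False. Hosts that share an IP cost only one remote lookup, which
is where the savings come from as the cache fills up.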
-----Original Message-----
From: lucene@libero.it [mailto:lucene@libero.it]
Sent: Wednesday, April 24, 2002 1:03 PM
To: lucene-user@jakarta.apache.org
Subject: Italian web sites
Hi all,
I'm using Jobo for spidering web sites and Lucene for indexing. The
problem is that I'd like to spider only Italian web sites.
How can I discover the country of a web site?
Do you know of a method that you can suggest?
Thanks
Laura