Mailing List Archive

Zope.org currently unusable
Hi,

I am trying to make a new Zope release, but this site is completely
unresponsive. Copying and renaming folders and files takes more than 10
minutes or fails completely... Is this a hardware problem or just a result of
this Plone */(&§(&(& crap?

Andreas
Re: Zope.org currently unusable
On Mar 9, 2005, at 20:10, Andreas Jung wrote:

> Hi,
>
> I am trying to make a new Zope release, but this site is completely
> unresponsive. Copying and renaming folders and files takes more than
> 10 minutes or fails completely... Is this a hardware problem or just a
> result of this Plone */(&§(&(& crap?

I've gotten spammed far more often than normal today by the email
messages Nagios sends out when the ZEO clients restart. There does
seem to be some problem today; you're right.

I would suggest you try again in the morning when there is less
US-based traffic. I know, that's not a solution, that's just avoiding
the symptoms.

jens

_______________________________________________
Zope-web maillist - Zope-web@zope.org
http://mail.zope.org/mailman/listinfo/zope-web
RE: Zope.org currently unusable
It's a little of both; there's a group of people working on this - we hope
to have something real soon now :) as a fix. Jens, do you have the
time to check the zope.org robots.txt? A lot of the problems I've seen
recently were due to several robots spidering zope.org at the same time. I'm
working on additional hardware, and we should see more traction on the
project sooner rather than later.
Andrew

--
Zope Managed Hosting
Software Engineer
Zope Corporation
(540) 361-1700
> -----Original Message-----
> From: zope-web-bounces@zope.org [mailto:zope-web-bounces@zope.org] On
> Behalf Of Jens Vagelpohl
> Sent: Wednesday, March 09, 2005 6:39 PM
> To: zope-web@zope.org
> Subject: Re: [ZWeb] Zope.org currently unusable
>
>
> On Mar 9, 2005, at 20:10, Andreas Jung wrote:
>
> > Hi,
> >
> > I am trying to make a new Zope release but this site is completely
> > unresponsive. Copying and renaming folders and files makes more than
> > 10 minutes or fails totally....is this a hardware problem or just a
> > result of this Plone */(&§(&(& crap?
>
> I've gotten spammed way more often than normal today by the email
> messages sent out by nagios when the ZEO clients restart. There seems
> to be some problem today, you're right.
>
> I would suggest you try again in the morning when there is less
> US-based traffic. I know, that's not a solution, that's just avoiding
> the symptoms.
>
> jens

Re: Zope.org currently unusable
On Mar 10, 2005, at 2:18, Andrew Sawyers wrote:

> It's a little of both; there's a group of people working on this - we hope
> to have something real soon now :) as a fix. Jens, do you have the
> time to check the zope.org robots.txt? A lot of the problems I've seen
> recently were due to several robots spidering zope.org at the same time. I'm
> working on additional hardware and we should see more traction on the
> project sooner rather than later.

I don't put much faith in robots.txt. The nasty bots ignore it completely
anyway. The only way to deal with them is to block them with
e.g. iptables.

What's currently there looks odd:

"""
User-agent: wget
Disallow: /

User-agent: Wget
Disallow: /

# Ask Google to skip search queries and the like.
User-agent: Googlebot
Disallow: /*?
"""

Looking at the spec, case-insensitive matching of the User-agent value is
only "recommended", but you could shorten that to the following,
because multiple User-agent lines are allowed per rule set:

"""
User-agent: wget
User-agent: Wget
Disallow: /
"""

Otherwise there really isn't much in there, and having watched googlebots
often enough myself, I have my doubts whether the line "Disallow: /*?"
works at all.
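
For what it's worth, this is easy to probe with Python's standard-library
robots.txt parser, which implements only the original exclusion draft and has
no wildcard support. A quick sketch (the search URL is a made-up example):

```python
# Sketch: see how a parser that follows only the original robots.txt
# draft treats the wildcard rule from zope.org's file.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Googlebot
Disallow: /*?
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A spec-only parser does plain prefix matching, so "/*?" is treated as
# a literal path and never matches an actual query URL: the fetch below
# comes back as allowed.
print(rp.can_fetch("Googlebot", "http://zope.org/search?query=zeo"))  # True
```

Google's crawler honors "*" as its own extension, so whether the line does
anything depends entirely on which bot is reading it.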

jens

Re: Zope.org currently unusable
Hi,

I recommend adding crawl delays for all the big bots except Google, something like:

User-agent: Slurp
Crawl-delay: 120

This directive is for the Yahoo bot (Slurp) but should also be applied to msnbot.

It's crazy how some of these bots love to hit your site at the same
time. A 120-second delay should leave more than enough time between
hits, even if they all show up at once.
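
Folded into what's already in the file, the whole thing might then look
roughly like this (a sketch; the msnbot entry mirrors the Slurp one on the
assumption that msnbot honors the same Crawl-delay extension):

"""
User-agent: wget
User-agent: Wget
Disallow: /

# Yahoo's crawler: wait 120 seconds between fetches.
User-agent: Slurp
Crawl-delay: 120

User-agent: msnbot
Crawl-delay: 120

# Ask Google to skip search queries and the like.
User-agent: Googlebot
Disallow: /*?
"""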

Cheers,

Mark


On Mar 10, 2005, at 10:33 AM, Jens Vagelpohl wrote:

>
> On Mar 10, 2005, at 2:18, Andrew Sawyers wrote:
>
>> It's a little of both; there's a group of people working on this - we
>> hope
>> to have something real soon now :) as a fix. Jens, could do you have
>> the
>> time to check the zope.org robots.txt? A lot of the problems I've
>> seen
>> recently were due to several robots spidering zope.org at a time. I'm
>> working on additional hardware and we should see more traction on the
>> project sooner then later.
>
> I don't believe all that much in robots.txt. The nasty bots completely
> ignore it, anyway. The only way to deal with them is to block them
> with e.g. iptables.
>
> What's currently there looks odd:
>
> """
> User-agent: wget
> Disallow: /
>
> User-agent: Wget
> Disallow: /
>
> # Ask Google to skip search queries and the like.
> User-agent: Googlebot
> Disallow: /*?
> """
>
> Looking at the spec the case sensitivity of the User-agent value is
> only "recommended", but you could shorten that into the following,
> because multiple User-agent values are allowed per rule set:
>
> """
> User-agent: wget
> User-agent: Wget
> Disallow: /
> """
>
> Otherwise there really isn't much in there, and from seeing googlebots
> myself often enough I have my doubts whether the line "Disallow: /*?"
> works at all.
>
> jens

Re: Zope.org currently unusable
--On Thursday, 10 March 2005 0:39 +0100 Jens Vagelpohl
<jens@dataflake.org> wrote:

>
> I would suggest you try again in the morning when there is less US-based
> traffic. I know, that's not a solution, that's just avoiding the symptoms.
>

But this advice actually worked :-/ ...hoping to see new.new.zope.org soon
:-)

Andreas
RE: Zope.org currently unusable
I need to read up on the robots.txt spec. Excellent, Mark - thanks.
Andrew

--
Zope Managed Hosting
Software Engineer
Zope Corporation
(540) 361-1700

> -----Original Message-----
> From: zope-web-bounces@zope.org [mailto:zope-web-bounces@zope.org] On
> Behalf Of Mark Pratt
> Sent: Thursday, March 10, 2005 6:16 AM
> To: Jens Vagelpohl
> Cc: zope-web@zope.org
> Subject: Re: [ZWeb] Zope.org currently unusable
>
> Hi,
>
> I recommend adding crawl delays for all but google to something like:
>
> User-agent: Slurp
> Crawl-delay: 120
>
> This is for the yahoo bot but should also be applied to msnbot.
>
> It's crazy how some of these bots love to hit your site at the same
> time. A 120 second delay should be more than enough time between
> hits even if they all come at the same time.
>
> Cheers,
>
> Mark
>
>
> On Mar 10, 2005, at 10:33 AM, Jens Vagelpohl wrote:
>
> >
> > On Mar 10, 2005, at 2:18, Andrew Sawyers wrote:
> >
> >> It's a little of both; there's a group of people working on this - we
> >> hope
> >> to have something real soon now :) as a fix. Jens, could do you have
> >> the
> >> time to check the zope.org robots.txt? A lot of the problems I've
> >> seen
> >> recently were due to several robots spidering zope.org at a time. I'm
> >> working on additional hardware and we should see more traction on the
> >> project sooner then later.
> >
> > I don't believe all that much in robots.txt. The nasty bots completely
> > ignore it, anyway. The only way to deal with them is to block them
> > with e.g. iptables.
> >
> > What's currently there looks odd:
> >
> > """
> > User-agent: wget
> > Disallow: /
> >
> > User-agent: Wget
> > Disallow: /
> >
> > # Ask Google to skip search queries and the like.
> > User-agent: Googlebot
> > Disallow: /*?
> > """
> >
> > Looking at the spec the case sensitivity of the User-agent value is
> > only "recommended", but you could shorten that into the following,
> > because multiple User-agent values are allowed per rule set:
> >
> > """
> > User-agent: wget
> > User-agent: Wget
> > Disallow: /
> > """
> >
> > Otherwise there really isn't much in there, and from seeing googlebots
> > myself often enough I have my doubts whether the line "Disallow: /*?"
> > works at all.
> >
> > jens

Re: Zope.org currently unusable
On Mar 10, 2005, at 15:27, Andrew Sawyers wrote:

> I need to read up on the robots.txt spec. Excellent Mark, thanks.
> Andrew

That piece is not part of the spec - just like the wildcards Google
claims to use (and I still don't believe those work as advertised).
This is the spec:

http://www.robotstxt.org/wc/norobots.html

Here is a robots.txt validator:

http://www.searchengineworld.com/cgi-bin/robotcheck.cgi

Here's a funny one: someone collected all the reckless/useless user agents
for exclusion:

http://www.searchenginegenie.com/Dangerous-user-agents.htm

This one explains Slurp-specific extensions:

http://help.yahoo.com/help/us/ysearch/slurp/slurp-03.html

jens

Re: Zope.org currently unusable
Jens,

You are correct that the crawl-delay parameter is not part of the spec.
There are plenty of examples where specs don't keep up with the times,
and there is no harm in using that tag.
Worrying about a robots.txt file is a bit over the top :-)

Thanks for the link to the reckless/useless user agents page.

Cheers,

Mark

On Mar 10, 2005, at 3:39 PM, Jens Vagelpohl wrote:

>
> On Mar 10, 2005, at 15:27, Andrew Sawyers wrote:
>
>> I need to read up on the robots.txt spec. Excellent Mark, thanks.
>> Andrew
>
> That piece is not part of the spec. Just like the wildcards that
> Google claims they use (and I still don't believe that works as
> advertised). This is the spec:
>
> http://www.robotstxt.org/wc/norobots.html
>
> Here is a robots.txt validator:
>
> http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
>
> Here's a funny one: someone collected all the reckless/useless user
> agents for exclusion:
>
> http://www.searchenginegenie.com/Dangerous-user-agents.htm
>
> This one explains Slurp-specific extensions:
>
> http://help.yahoo.com/help/us/ysearch/slurp/slurp-03.html
>
> jens

Re: Zope.org currently unusable
One brief correction - I meant to say:

"worrying about validating a robots.txt file is a bit over the top" :-)

Cheers,

Mark

On Mar 10, 2005, at 4:38 PM, Mark Pratt wrote:

> Jens,
>
> You are correct that the crawl-delay parameter is not part of the spec.
> There are plenty of examples where specs don't keep up with the times,
> and there is no harm in using that tag.
> Worrying about a robots.txt file is a bit over the top :-)
>
> Thanks for the link to the reckless/useless user agents page.
>
> Cheers,
>
> Mark
>
> On Mar 10, 2005, at 3:39 PM, Jens Vagelpohl wrote:
>
>>
>> On Mar 10, 2005, at 15:27, Andrew Sawyers wrote:
>>
>>> I need to read up on the robots.txt spec. Excellent Mark, thanks.
>>> Andrew
>>
>> That piece is not part of the spec. Just like the wildcards that
>> Google claims they use (and I still don't believe that works as
>> advertised). This is the spec:
>>
>> http://www.robotstxt.org/wc/norobots.html
>>
>> Here is a robots.txt validator:
>>
>> http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
>>
>> Here's a funny one: Some collected all the reckless/useless user
>> agents for exclusion:
>>
>> http://www.searchenginegenie.com/Dangerous-user-agents.htm
>>
>> This one explains Slurp-specific extensions:
>>
>> http://help.yahoo.com/help/us/ysearch/slurp/slurp-03.html
>>
>> jens
