Mailing List Archive

False unreachable?
Hi all,
I have rancid running on 300+ devices. A few of them, however, are reported as unreachable. I get the:

"The following routers have not been successfully contacted for more
than 4 hours"

email on every polling cycle. The logs state:

"my.device clogin error: Error: Connection Refused (telnet)"

However, I can clogin just fine from the command line.
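
To see how consistently a device fails, the clogin errors can be tallied per device across the run logs. This is only a minimal sketch: it assumes each log line begins with the device name followed by "clogin error:", as in the line quoted above, and the function name is made up for illustration.

```sh
# Count "clogin error:" lines per device in the given log files.
# Assumption: the device name is the first whitespace-separated field.
# Prints "<count> <device>", highest count first.
count_clogin_errors() {
    awk '/clogin error:/ { n[$1]++ }
         END { for (d in n) print n[d], d }' "$@" |
        sort -rn
}
```

It would be called with whichever log files your install writes, for example `count_clogin_errors logs/*` (the path is hypothetical).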

Any ideas where I can begin investigating this further?

Thanks,
~~ted
False unreachable? [ In reply to ]
On Sat, 31 May 2003, Ted Bedwell wrote:

> Hi all,
> I have rancid running on 300+ devices. A few of them, however, are reported as unreachable. I get the:
>
> "The following routers have not been successfully contacted for more
> than 4 hours"
>
> email on every polling cycle. The logs state:
>
> "my.device clogin error: Error: Connection Refused (telnet)"

Is it possible at the instant rancid failed, the device was out of vty
interfaces?

----------------------------------------------------------------------
 Jon Lewis *jlewis at lewis.org*|  I route
 System Administrator           |  therefore you are
 Atlantic Net                   |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
False unreachable? [ In reply to ]
I'm fairly sure this is not the problem. It happens at every polling
interval (2 hours) without exception, and always works from the CLI. If it
were a vty issue, you would expect it to occasionally go through. Any other
ideas?

Thanks,
~~ted


----- Original Message -----
From: <jlewis@lewis.org>
To: "Ted Bedwell" <ted at cw.net>
Cc: <rancid-discuss at shrubbery.net>
Sent: Saturday, May 31, 2003 10:07 PM
Subject: Re: False unreachable?


> Is it possible at the instant rancid failed, the device was out of vty
> interfaces?
False unreachable? [ In reply to ]
Hi Ted,

Can you replicate with, or show us the contents of...

export NOPIPE=1
rancid -d my.device

You'll have a file called 'my.device.new' in the cwd afterwards, which may shed some more light.
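
The two commands above could be wrapped so the debug run happens in a scratch directory and the leftover file is easy to find. A rough sketch only: the function name and the debug.log file are made up here, and it relies on nothing beyond what is described above (NOPIPE set, `rancid -d`, and a `<device>.new` file left in the cwd).

```sh
# Run rancid in debug mode for one device inside a scratch directory.
# Prints the path of the resulting <device>.new file if it is non-empty.
debug_one_device() {
    device=$1
    workdir=$(mktemp -d) || return 1
    (
        cd "$workdir" || exit 1
        NOPIPE=1
        export NOPIPE
        # Keep rancid's own debug chatter out of this function's output.
        rancid -d "$device" > debug.log 2>&1
    )
    if [ -s "$workdir/$device.new" ]; then
        echo "$workdir/$device.new"
    else
        echo "no output collected for $device (see $workdir)" >&2
        return 1
    fi
}
```

Usage would be along the lines of `debug_one_device my.device`, after which the printed file can be inspected by hand.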

-afort

>Subject: Re: False unreachable?
> From: "Ted Bedwell" <ted at cw.net>
> Date: Mon, 2 Jun 2003 12:05:49 -0400
> To: <rancid-discuss at shrubbery.net>
>
>I'm fairly sure this is not the problem. It happens at every polling
>interval (2 hours) without exception, and always works from the CLI. If it
>were a vty issue, you would expect it to occasionally go through. Any other
>ideas?
>
>Thanks,
>~~ted
False unreachable? [ In reply to ]
That worked beautifully. my.device.new has the config in it. However, it
still is not working when do-diffs runs.

~~ted


----- Original Message -----
From: "Andrew Fort" <afort@choqolat.org>
To: <ted at cw.net>; <rancid-discuss at shrubbery.net>
Sent: Monday, June 02, 2003 8:29 PM
Subject: Re: False unreachable?


> Hi Ted,
>
> Can you replicate with, or show us the contents of...
>
> export NOPIPE=1
> rancid -d my.device
>
> You'll have a file called 'my.device.new' in the cwd afterwards, which may
> shed some more light.
>
> -afort
False unreachable? [ In reply to ]
doesn't make sense. just to review the obvious...
- you tried it manually as the user who runs rancid
- your manual attempt used telnet and not ssh
- your path and/or environment matches that of the rancid user
(no goofy telnet executable or kerberos ...)
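
one way to compare the environments in the checklist above is to print the relevant bits from both places and diff them. this is a sketch, not part of rancid; the function name is made up, and it only uses standard shell tools.

```sh
# Print the parts of the environment that decide which telnet/ssh runs.
# Run it once from an interactive shell and once from the rancid user's
# cron job, then diff the two outputs.
snapshot_env() {
    echo "user=$(id -un)"
    echo "path=$PATH"
    echo "telnet=$(command -v telnet || echo 'not found')"
    echo "ssh=$(command -v ssh || echo 'not found')"
}
```

for example, redirect the output to one file interactively and to another from cron (file names are up to you) and compare them with diff. a different user, PATH or telnet path between the two would be the first thing to chase.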

otherwise, i would disable the do-diffs cron job, edit bin/rancid-fe
and place -d after the device's rancid script, eg:

< elsif ($vendor =~ /^cisco$/i) { exec('rancid', $router); }
> elsif ($vendor =~ /^cisco$/i) { exec('rancid', '-d', $router); }

and set NOPIPE=YES in bin/env, then run bin/do-diffs <group name>

this way, the .raw file will remain behind and can be further examined.

warning: this will cause zero-ing/truncation of any configs that fail to
be collected. of course, it will be corrected on the next successful run.
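
to spot configs hit by the truncation mentioned in the warning, the empty files can be listed after the debug run. a small sketch; the function name is made up, and the directory argument is an assumption: pass whichever directory holds the collected configs for that group on your install.

```sh
# List zero-length regular files under the given directory.
list_empty_configs() {
    find "$1" -type f -size 0
}
```

anything it prints was zeroed (or was already empty) and should fill in again on the next successful collection.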

Tue, Jun 03, 2003 at 04:01:13PM -0400, Ted Bedwell:
> That worked beautifully. my.device.new has the config in it. However, it
> still is not working when the do-diffs runs.
>
> ~~ted
False unreachable? [ In reply to ]
That did it.

One KEY piece of info I neglected to include in all these discussions is
that I am running v 2.1. I apologize for not providing this info. I will
endeavor to upgrade in the near future. For the time being, running in debug
mode will have to suffice. I'll post a follow-up if the upgrade fixes the
problem.

Thanks again to everyone who contributed.

~~ted

----- Original Message -----
From: "john heasley" <heas@shrubbery.net>
To: "Ted Bedwell" <ted at cw.net>
Cc: <rancid-discuss at shrubbery.net>
Sent: Tuesday, June 03, 2003 4:43 PM
Subject: Re: False unreachable?


> doesn't make sense. just to review the obvious...
> - you tried it manually as the user who runs rancid
> - your manual attempt used telnet and not ssh
> - your path and/or environment matches that of the rancid user
> (no goofy telnet executable or kerberos ...)
>
> otherwise, i would disable the do-diffs cron job, edit bin/rancid-fe
> and place -d after the device's rancid script, eg:
>
> < elsif ($vendor =~ /^cisco$/i) { exec('rancid', $router); }
> > elsif ($vendor =~ /^cisco$/i) { exec('rancid', '-d', $router); }
>
> and set NOPIPE=YES in bin/env, then run bin/do-diffs <group name>
>
> this way, the .raw file will remain behind and can be further examined.
>
> warning: this will cause zero-ing/truncation of any configs that fail to
> be collected. of course, it will be corrected on the next successful run.
>