Mailing List Archive

JunOS on EX-4200 stack corruption
Has anyone seen an issue where, once a stack loses power (all at once,
sadly), some members of the stack come up with corrupt JunOS on them?

We have a stack of 6 EX4200s. Due to a power issue, all 6 lost power at
the same time. When power was restored, the Master, the Backup and one
linecard mode switch came up fine. The 3 other linecard mode switches
didn't. On powerup they all reported shared libraries not being found and
similar "file not found" type errors.

Reinstalling JunOS from the boot prompt has fixed the problem.

Has anyone else encountered corrupt JunOS problems on EX4200s,
specifically in a stack configuration, before?

Thanks!

Tim

_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: JunOS on EX-4200 stack corruption
On Thu, Oct 07, 2010 at 12:09:16PM +1300, TiM wrote:
> Has anyone seen an issue where, once a stack loses power (all at once,
> sadly), some members of the stack come up with corrupt JunOS on them?

Yes, I've seen this happen. The moral of the story is make sure you
have good UPS battery backup and redundant power supplies.
Re: JunOS on EX-4200 stack corruption
On Thu, October 7, 2010 12:50 pm, Chuck Anderson wrote:
> On Thu, Oct 07, 2010 at 12:09:16PM +1300, TiM wrote:
>> Has anyone seen an issue where, once a stack loses power (all at once,
>> sadly), some members of the stack come up with corrupt JunOS on them?
>
> Yes, I've seen this happen. The moral of the story is make sure you
> have good UPS battery backup and redundant power supplies.

Yes, we had all that. Don't ask :)

Do Juniper know about this? Do you think it's worth logging a TAC case?

Tim


Re: JunOS on EX-4200 stack corruption
I've personally never seen this. Maybe the power issues affected the flash
inside the switch? I doubt the same would happen if you pulled the power
cords from all switches at once for example.

Re: JunOS on EX-4200 stack corruption
Hi Tim,

On Thu, Oct 7, 2010 at 10:09 AM, TiM <tim@muppetz.com> wrote:
> Has anyone seen an issue where, once a stack loses power (all at once,
> sadly), some members of the stack come up with corrupt JunOS on them?
>
> We have a stack of 6 EX4200s.  Due to a power issue, all 6 lost power at
> the same time.  When power was restored, the Master, the Backup and one
> linecard mode switch came up fine.  The 3 other linecard mode switches
> didn't.  On powerup they all reported shared libraries not being found and
> similar "file not found" type errors.
>
> Reinstalling JunOS from the boot prompt has fixed the problem.
>
> Has anyone else encountered corrupt JunOS problems on EX4200's,
> specifically in a stack configuration, before?

Yes, it's been a serious, ongoing problem for us. We have ~1200
switches across ~360 VCs. We're running JUNOS 10.0S1.

There is a PR for the problem - it's PR/543776. There is no permanent
fix as yet but there is a workaround available in 10.0R4, 10.1R4,
10.2R3, 10.3R2, 10.4R1 or later. It's called 'Automatic File System
Mirroring and Automatic Recovery' and at this stage you'll need to
talk to JTAC about it.
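
For anyone auditing a large fleet against that release list, a rough sketch in Python of the check might look like the following. It assumes that "or later" means "that R release or a later R release in the same train, or any later train", which is only one reading of the list above. The function name `has_workaround` and the version-string pattern are made up for illustration, and non-R builds (such as 10.0S1) are deliberately reported as unknown rather than guessed at.

```python
import re

# Minimum R release per train carrying the workaround, as listed above.
WORKAROUND_MIN_R = {(10, 0): 4, (10, 1): 4, (10, 2): 3, (10, 3): 2, (10, 4): 1}

# Matches the leading "<major>.<minor><letter><build>" part, e.g. "10.0R4" or "10.4R1.9".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)([A-Z])(\d+)")


def has_workaround(version):
    """Return True or False for R releases, None when this sketch cannot tell."""
    m = _VERSION_RE.match(version.strip())
    if m is None:
        return None  # unrecognised version string
    train = (int(m.group(1)), int(m.group(2)))
    kind, build = m.group(3), int(m.group(4))
    if train > max(WORKAROUND_MIN_R):
        return True  # assumption: every train after 10.4 includes it
    if train < min(WORKAROUND_MIN_R):
        return False  # trains before 10.0 are not in the list
    if kind != "R":
        return None  # S/B/X builds: no ordering against R releases is assumed
    return build >= WORKAROUND_MIN_R[train]
```

Feeding it the version string from each VC member would at least separate "definitely too old" from "needs a closer look"; whether the feature is actually enabled is still a question for JTAC.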

cheers,
Dale

Re: JunOS on EX-4200 stack corruption
Thanks very much Dale and thanks everyone else who replied.

Looks like we are, sadly, not alone.

Regards,

Tim

Re: JunOS on EX-4200 stack corruption
Experienced this issue with a customer as well. JTAC recommended the workaround that Dale described in his earlier post.

Ralph

Re: JunOS on EX-4200 stack corruption
On Thu, Oct 07, 2010 at 12:55:17PM +1300, TiM wrote:
>
> On Thu, October 7, 2010 12:50 pm, Chuck Anderson wrote:
> > On Thu, Oct 07, 2010 at 12:09:16PM +1300, TiM wrote:
> >> Has anyone seen an issue where, once a stack loses power (all at once,
> >> sadly), some members of the stack come up with corrupt JunOS on them?
> >
> > Yes, I've seen this happen. The moral of the story is make sure you
> > have good UPS battery backup and redundant power supplies.
>
> Yes, we had all that. Don't ask :)
>
> Do Juniper know about this? Do you think it's worth logging a TAC case?

Yes, please do ask them why the BSD filesystem JunOS uses doesn't have
soft updates/softdep or journaling enabled. Oh look here, Juniper was
one of the sponsors to write journaling support for FreeBSD's UFS:

http://www.osnews.com/story/23205/FreeBSD_UFS_with_Softupdates_Journaling

So maybe we'll see it in JunOS eventually.