Mailing List Archive

Failed Aggregate 0
I have a FAS3140 running ONTAP 8.2.5. We have had a double parity disk
failure in one of our RAID groups, and I am looking for help recovering. The unit
is out of warranty and support. Any help is appreciated.



-----
Christopher Chandler
CEO
DH Innovations, LLC
--
Sent from: http://network-appliance-toasters.10978.n7.nabble.com/
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: Failed Aggregate 0 [ In reply to ]
Christopher,

Is it 7-mode? Can you send the output of 'aggr status -r'?

Thank you,
Tim

From: chris@dhinnovations.com
Sent: August 19, 2018 3:14 PM
To: toasters@teaparty.net
Subject: Failed Aggregate 0


I have a FAS3140 running ONTAP 8.2.5. We have had a double parity disk
failure in one of our RAID groups, and I am looking for help recovering. The unit
is out of warranty and support. Any help is appreciated.



Re: Failed Aggregate 0 [ In reply to ]
You did not include many details.
RAID 4 or RAID-DP? Sounds like you are using RAID4 if you have a double
disk failure and you lost an aggregate.
--tmac

*Tim McCarthy, **Principal Consultant*

*Proud Member of the #NetAppATeam <https://twitter.com/NetAppATeam>*

*I Blog at TMACsRack <https://tmacsrack.wordpress.com/>*



On Sun, Aug 19, 2018 at 6:21 PM dhichandler <chris@dhinnovations.com> wrote:

> I have a FAS3140 running ONTAP 8.2.5. We have had a double parity disk
> failure in one of our RAID groups, and I am looking for help recovering. The unit
> is out of warranty and support. Any help is appreciated.
Re: Failed Aggregate 0 [ In reply to ]
Aggr   State   Status         Options
aggr0  online  raid_dp, aggr  root, diskroot, nosnap=off, raidtype=raid_dp,
               degraded       raidsize=16, raid_lost_write=on,
               64-bit         ignore_inconsistent=off, snapmirrored=off,
               rlw_on         resyncsnaptime=60, fs_size_fixed=off,
                              lost_write_protect=on, no_delete_log=off,
                              ha_policy=cfo, hybrid_enabled=off,
                              percent_snapshot_space=0%,
                              free_space_realloc=off, raid_cv=on,
                              thorough_scrub=off

Volumes: dcs_ds_01, dcs_iso, vol0

Plex /aggr0/plex0: online, normal, active
    RAID group /aggr0/plex0/rg0: double degraded, block checksums
    RAID group /aggr0/plex0/rg1: double degraded, block checksums
    RAID group /aggr0/plex0/rg2: normal, block checksums
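
For reading the summary above: RAID-DP tolerates two failed disks per RAID group, so a group reported as "double degraded" has no redundancy left, while a "normal" group can still lose two. A minimal Python sketch of that reading follows. The function name, the lookup tables, and the regular expression are illustrative assumptions, not part of any NetApp tool, and the sketch only understands the three state strings that appear in this thread.

```python
import re

# Total disk failures a RAID group can absorb, by RAID type.
TOLERATED = {"raid4": 1, "raid_dp": 2}

# Failed-disk count implied by the state text of a RAID group line.
# Only the states seen in this thread are covered; anything else raises KeyError.
STATE_FAILURES = {"normal": 0, "degraded": 1, "double degraded": 2}

# Matches e.g. "RAID group /aggr0/plex0/rg0: double degraded, block checksums".
GROUP_LINE = re.compile(r"RAID group (\S+): ([a-z ]+), block\s+checksums")


def remaining_redundancy(status_text, raidtype="raid_dp"):
    """Map each RAID group path to how many more disk failures it can survive."""
    # Collapse whitespace so lines wrapped by a mail client still match.
    flat = " ".join(status_text.split())
    result = {}
    for path, state in GROUP_LINE.findall(flat):
        result[path] = TOLERATED[raidtype] - STATE_FAILURES[state.strip()]
    return result
```

Run against the output above, this reports 0 for rg0 and rg1 and 2 for rg2, which is why one more failed disk in rg0 or rg1 would take the whole aggregate down.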



-----
Christopher Chandler
CEO
DH Innovations, LLC
Re: Failed Aggregate 0 [ In reply to ]
Christopher,

aggr0 is still online, just double degraded. It looks like you just need to put some spares in there and let it rebuild.

Does the partner have any spares? (Run 'aggr status -s' on the partner)

Or perhaps do you have any spares around or unassigned disks?

Tim

From: chris@dhinnovations.com
Sent: August 19, 2018 3:23 PM
To: tmacmd@gmail.com
Cc: toasters@teaparty.net
Subject: Re: Failed Aggregate 0



Aggr   State   Status         Options
aggr0  online  raid_dp, aggr  root, diskroot, nosnap=off, raidtype=raid_dp,
               degraded       raidsize=16, raid_lost_write=on,
               64-bit         ignore_inconsistent=off, snapmirrored=off,
               rlw_on         resyncsnaptime=60, fs_size_fixed=off,
                              lost_write_protect=on, no_delete_log=off,
                              ha_policy=cfo, hybrid_enabled=off,
                              percent_snapshot_space=0%,
                              free_space_realloc=off, raid_cv=on,
                              thorough_scrub=off

Volumes: dcs_ds_01, dcs_iso, vol0

Plex /aggr0/plex0: online, normal, active
    RAID group /aggr0/plex0/rg0: double degraded, block checksums
    RAID group /aggr0/plex0/rg1: double degraded, block checksums
    RAID group /aggr0/plex0/rg2: normal, block checksums


Re: Failed Aggregate 0 [ In reply to ]
Here is the latest Autosupport before the unit faulted.


Chris

919-274-7684

________________________________
From: tmac <tmacmd@gmail.com>
Sent: Sunday, August 19, 2018 6:23:54 PM
To: Christopher D. Chandler
Cc: Toasters
Subject: Re: Failed Aggregate 0

You did not include many details.
RAID 4 or RAID-DP? Sounds like you are using RAID4 if you have a double disk failure and you lost an aggregate.
--tmac

Re: Failed Aggregate 0 [ In reply to ]
That aggr is online and degraded. You need to get yourself a replacement drive.
--tmac




On Sun, Aug 19, 2018 at 6:26 PM Christopher D. Chandler <
chris@dhinnovations.com> wrote:

> Aggr   State   Status         Options
> aggr0  online  raid_dp, aggr  root, diskroot, nosnap=off, raidtype=raid_dp,
>                degraded       raidsize=16, raid_lost_write=on,
>                64-bit         ignore_inconsistent=off, snapmirrored=off,
>                rlw_on         resyncsnaptime=60, fs_size_fixed=off,
>                               lost_write_protect=on, no_delete_log=off,
>                               ha_policy=cfo, hybrid_enabled=off,
>                               percent_snapshot_space=0%,
>                               free_space_realloc=off, raid_cv=on,
>                               thorough_scrub=off
>
> Volumes: dcs_ds_01, dcs_iso, vol0
>
> Plex /aggr0/plex0: online, normal, active
>     RAID group /aggr0/plex0/rg0: double degraded, block checksums
>     RAID group /aggr0/plex0/rg1: double degraded, block checksums
>     RAID group /aggr0/plex0/rg2: normal, block checksums
Re: Failed Aggregate 0 [ In reply to ]
Chris,

It may have shut down due to the RAID timeout if you did not replace the disks.

Just boot it back up and reassign the spares.

Tim

From: chris@dhinnovations.com
Sent: August 19, 2018 3:27 PM
To: tnaple@BERKCOM.com
Cc: tmacmd@gmail.com; toasters@teaparty.net
Subject: Re: Failed Aggregate 0


The partner has six spares but the cluster is offline and the unit is sitting at the CFE prompt.

Chris Chandler
919-274-7684

Re: Failed Aggregate 0 [ In reply to ]
Ah, maybe another disk failed after the aggr output you sent earlier. Boot, press Ctrl-C to get the boot menu, choose option 5 (maintenance mode), and then run 'aggr status -r'.

Tim

From: chris@dhinnovations.com
Sent: August 19, 2018 3:30 PM
To: tnaple@BERKCOM.com
Cc: tmacmd@gmail.com; toasters@teaparty.net
Subject: Re: Failed Aggregate 0


When I try to boot it panics and says it has no root volume.

Chris Chandler
919-274-7684

Re: Failed Aggregate 0 [ In reply to ]
The partner has six spares but the cluster is offline and the unit is sitting at the CFE prompt.

Chris Chandler
919-274-7684

On Aug 19, 2018, at 18:31, Timothy Naple <tnaple@BERKCOM.com<mailto:tnaple@BERKCOM.com>> wrote:

Christopher,

aggr0 is still online, just double degraded. It looks like you just need to put some spares in there and let it rebuild.

Does the partner have any spares? (Run 'aggr status -s' on the partner)

Or perhaps do you have any spares around or unassigned disks?

Tim

Re: Failed Aggregate 0 [ In reply to ]
Try sending us output of

"disk show -a"

--tmac



On Sun, Aug 19, 2018 at 6:32 PM Christopher D. Chandler <
chris@dhinnovations.com> wrote:

> The partner has six spares but the cluster is offline and the unit is
> sitting at the CFE prompt.
>
> Chris Chandler
> 919-274-7684
Re: Failed Aggregate 0 [ In reply to ]
When I try to boot it panics and says it has no root volume.

Chris Chandler
919-274-7684

On Aug 19, 2018, at 18:34, Timothy Naple <tnaple@BERKCOM.com<mailto:tnaple@BERKCOM.com>> wrote:

Chris,

It may have shut down due to the RAID timeout if you did not replace the disks.

Just boot it back up and reassign the spares.

Tim

Re: Failed Aggregate 0 [ In reply to ]
Boot to maintenance mode.

Then try "disk show -a" and then "sysconfig -r" or "aggr status -r" and
"aggr status"

--tmac





On Sun, Aug 19, 2018 at 6:35 PM Christopher D. Chandler <
chris@dhinnovations.com> wrote:

> When I try to boot it panics and says it has no root volume.
>
> Chris Chandler
> 919-274-7684
Re: Failed Aggregate 0 [ In reply to ]
Ok, I have seen the last ASUP via this EMAIL and the current disk layout.
Here they are for everyone:
Last known ASUP:

Aggregate aggr0 (online, raid_dp, degraded) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (double degraded, block checksums)

RAID Disk  Device  Model Number      Serial Number         VBN Start   VBN End
---------  ------  ------------      -------------         ---------   -------
dparity    0a.44   X279_HVPBP288F15  JLVBYWVC              -           -
parity     0a.48   X279_S15K5288F15  3LM42GHN00009839N66A  -           -
data       0a.49   X279_S15K5288F15  3LM1TA6C000098080J6N  0           69626751
data       0a.16   X279_HVPBP288F15  JLVJV9TC              69626752    139253503
data       0a.50   X279_HVIPB288F15  J8YMUD4C              139253504   208880255
data       0a.45   X279_S15K7288F15  6SJ5HVVK0000B24505C9  208880256   278507007
data       0a.51   X279_S15K5288F15  3LM1T7DZ00009808S8XF  348133760   417760511
data       0a.18   X279_HVPBP288F15  JLVJVENC              417760512   487387263
data       0a.52   X279_HVPBP288F15  JLVHYUNC              487387264   557014015
data       0a.19   X279_HVPBP288F15  JLVJWUUC              557014016   626640767
data       0a.20   X279_HVPBP288F15  JLVJWJ3C              626640768   696267519
data       0a.22   X279_HVPBP288F15  JLVK7NJC              765894272   835521023
data       0a.23   X279_HVPBP288F15  JLVJUE9C              835521024   905147775
data       0a.25   X279_HVPBP288F15  JLVJWKRC              905147776   974774527

    RAID group /aggr0/plex0/rg1 (double degraded, block checksums)

RAID Disk  Device  Model Number      Serial Number         VBN Start   VBN End
---------  ------  ------------      -------------         ---------   -------
data       0a.33   X279_HVPBP288F15  JLVJUENC              974774528   1044401279
data       0a.26   X279_HVPBP288F15  JLVKVL7C              1044401280  1114028031
data       0a.34   X279_HVPBP288F15  JLVBYUWC              1114028032  1183654783
data       0a.27   X279_HVPBP288F15  JLVBZL3C              1183654784  1253281535
data       0a.35   X279_HVPBP288F15  JLVBYVYC              1253281536  1322908287
data       0a.28   X279_HVPBP288F15  JLVJV9RC              1322908288  1392535039
data       0a.36   X279_HVPBP288F15  JLVBYSBC              1392535040  1462161791
data       0a.29   X279_HVPBP288F15  JLVJWP5C              1462161792  1531788543
data       0a.37   X279_HVPBP288F15  JLVBYS1C              1531788544  1601415295
data       0a.38   X279_HVPBP288F15  JLVBYW7C              1601415296  1671042047
data       0a.39   X279_HVPBP288F15  JLVBYWWC              1671042048  1740668799
data       0a.40   X279_HVPBP288F15  JLVBYRRC              1740668800  1810295551
data       0a.41   X279_HVPBP288F15  JLVBYPWC              1810295552  1879922303
data       0a.60   X279_HVPBP288F15  JLVJ6YZC              1879922304  1949549055


RAID group /aggr0/plex0/rg2 (normal, block checksums)


RAID Disk Device  Model Number      Serial Number         VBN Start   VBN End
--------- ------  ------------      -------------         ---------   -------
dparity   0a.54   X279_S15K5288F15  3LM42HL100009839NXS7  -           -
parity    0a.55   X279_S15K5288F15  3LM45JAD00009839N5JZ  -           -
data      0a.56   X279_S15K5288F15  3LM47SPT00009840R2EL  1949549056  2019175807
data      0a.57   X279_HVIPB288F15  J8YT82MC              2019175808  2088802559
data      0a.58   X279_HVPBP288F15  JLVGNHKC              2088802560  2158429311
data      0a.59   X279_HVPBP288F15  JLVJGJVC              2158429312  2228056063
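
A side note on reading the VBN columns: in the rg0 listing above, each data disk covers a contiguous block range, so a hole in the sequence hints at a data disk that is missing from the listing. The sketch below is only an illustration of that check, assuming a hole really does correspond to an absent data disk. The tuples are hand-copied from the rg0 rows, and the helper name `find_vbn_gaps` is made up for this example.

```python
# VBN ranges hand-copied from the rg0 data rows: (device, vbn_start, vbn_end).
RG0_DATA = [
    ("0a.49", 0, 69626751),
    ("0a.16", 69626752, 139253503),
    ("0a.50", 139253504, 208880255),
    ("0a.45", 208880256, 278507007),
    ("0a.51", 348133760, 417760511),
    ("0a.18", 417760512, 487387263),
    ("0a.52", 487387264, 557014015),
    ("0a.19", 557014016, 626640767),
    ("0a.20", 626640768, 696267519),
    ("0a.22", 765894272, 835521023),
    ("0a.23", 835521024, 905147775),
    ("0a.25", 905147776, 974774527),
]


def find_vbn_gaps(rows):
    """Return (prev_device, next_device, gap_start, gap_end) for each hole.

    Hypothetical helper: assumes every data disk owns one contiguous VBN
    range and that a hole means a disk is absent from the listing.
    """
    ordered = sorted(rows, key=lambda row: row[1])
    gaps = []
    for (prev_dev, _, prev_end), (next_dev, next_start, _) in zip(ordered, ordered[1:]):
        if next_start != prev_end + 1:
            gaps.append((prev_dev, next_dev, prev_end + 1, next_start - 1))
    return gaps


gaps = find_vbn_gaps(RG0_DATA)
for prev_dev, next_dev, start, end in gaps:
    print(f"hole between {prev_dev} and {next_dev}: VBN {start}-{end}")
```

On the rows above this reports two holes, each one disk wide, which lines up with rg0 being shown as double degraded.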



And here is the "aggr status -r" output from MAINT mode:


Aug 19 22:47:17 [localhost:disk.failmsg:error]: Disk 0a.21 (JLVKV9SC):
non-persistent message received. 0 [NETAPP X279_HVPBP288F15 NA02] S/N
[JLVKV9SC]

Aug 19 22:47:17 [localhost:disk.failmsg:error]: Disk 0a.32 (J8VYJBHC):
non-persistent message received. 0 [NETAPP X279_HVIPB288F15 NA01] S/N
[J8VYJBHC]

Aug 19 22:47:17 [localhost:raid.fdr.failed.ok:info]: Disk 0a.21 Shelf 1 Bay
5 [NETAPP X279_HVPBP288F15 NA02] S/N [JLVKV9SC] successfully deleted from
spare pool

Aug 19 22:47:17 [localhost:raid.fdr.failed.ok:info]: Disk 0a.32 Shelf 2 Bay
0 [NETAPP X279_HVIPB288F15 NA01] S/N [J8VYJBHC] successfully deleted from
spare pool


Aggregate aggr0 (failed, raid_dp, partial) (block checksums)

Plex /aggr0/plex0 (offline, failed, inactive)

RAID group /aggr0/plex0/rg0 (partial, block checksums)



RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   FAILED          N/A                        272000/ -
parity    0a.48   0a    3   0   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.49   0a    3   1   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.16   0a    1   0   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.50   0a    3   2   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.45   0a    2   13  FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      FAILED          N/A                        272000/ -
data      0a.51   0a    3   3   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.18   0a    1   2   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.52   0a    3   4   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.19   0a    1   3   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.20   0a    1   4   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      FAILED          N/A                        272000/ -
data      0a.22   0a    1   6   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.23   0a    1   7   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.25   0a    1   9   FC:A   0  FCAL 15000 272000/557056000  274845/562884296

Raid group is missing 3 disks.


RAID group /aggr0/plex0/rg1 (double degraded, block checksums)


RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   FAILED          N/A                        272000/ -
parity    FAILED          N/A                        272000/ -
data      0a.33   0a    2   1   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.26   0a    1   10  FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.34   0a    2   2   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.27   0a    1   11  FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.35   0a    2   3   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.28   0a    1   12  FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.36   0a    2   4   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.29   0a    1   13  FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.37   0a    2   5   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.38   0a    2   6   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.39   0a    2   7   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.40   0a    2   8   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.41   0a    2   9   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.60   0a    3   12  FC:A   0  FCAL 15000 272000/557056000  274845/562884296


RAID group /aggr0/plex0/rg2 (degraded, block checksums)


RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   0a.54   0a    3   6   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
parity    FAILED          N/A                        272000/ -
data      0a.56   0a    3   8   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.57   0a    3   9   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.58   0a    3   10  FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.59   0a    3   11  FC:A   0  FCAL 15000 272000/557056000  274845/562884296


Unassimilated aggr0 disks


RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
orphaned  0a.72   0a    4   8   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
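
To make the maintenance-mode picture easier to scan, here is a rough tally of FAILED slots per RAID group. It is only a sketch: the slot lists are hand-copied from the listing above (`None` stands for a row shown as FAILED), the name `summarize` is invented for this example, and the only rule it applies is that RAID-DP tolerates at most two failed disks in one RAID group.

```python
# Slots hand-copied from the maintenance-mode listing, in listing order
# (dparity, parity, then data). None marks a row displayed as FAILED.
RAID_GROUPS = {
    "rg0": [None, "0a.48", "0a.49", "0a.16", "0a.50", "0a.45", None, "0a.51",
            "0a.18", "0a.52", "0a.19", "0a.20", None, "0a.22", "0a.23", "0a.25"],
    "rg1": [None, None, "0a.33", "0a.26", "0a.34", "0a.27", "0a.35", "0a.28",
            "0a.36", "0a.29", "0a.37", "0a.38", "0a.39", "0a.40", "0a.41", "0a.60"],
    "rg2": ["0a.54", None, "0a.56", "0a.57", "0a.58", "0a.59"],
}

# RAID-DP can lose at most two disks in a single RAID group.
RAID_DP_TOLERANCE = 2


def summarize(groups):
    """Map each RAID group to (failed_slot_count, still_within_tolerance)."""
    summary = {}
    for name, slots in groups.items():
        failed = slots.count(None)
        summary[name] = (failed, failed <= RAID_DP_TOLERANCE)
    return summary


summary = summarize(RAID_GROUPS)
for name, (failed, ok) in summary.items():
    state = "within RAID-DP tolerance" if ok else "beyond RAID-DP tolerance"
    print(f"{name}: {failed} failed slot(s), {state}")
```

Only rg0 comes out beyond tolerance, which matches the "Raid group is missing 3 disks." line and the partial state shown for it.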




What seems ODD to me is the way the FAILED disks are displayed. See the
highlighted entries above.
From the ASUP message we know that 0a.33 and 0a.26 are dparity and parity,
but they are shifted and showing as data.
In the last RAID group, 0a.55 is showing as failed. Weird.

My GUT is telling me that some sort of disk enumeration bug is being hit
on this version of ONTAP. Unless someone has actually been through this
before, this is going to require opening a case with NetApp in some form or
fashion. Some lower-level disk label corrections may be needed, but I am
not 100% sure about that.

Did the drives all fail at once? Or did they fail one at a time without
being replaced? Just curious.

You *might* be able to get a one-time support call. If you are willing to
pony up the $$ to re-enable support, they might help you out now for free.
OR you can purchase one-time support. I think I have heard it is something
like $5k-$10k.

--tmac
Re: Failed Aggregate 0 [ In reply to ]
That might just be a bug with the "aggr status -r" output, not sure, I'd
have to research it more. However, the "sysconfig -r" shows rg1 correctly.

RAID group /aggr0/plex0/rg1 (double degraded, block checksums)

RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   FAILED          N/A                        272000/ -
parity    FAILED          N/A                        272000/ -
data      0a.33   0a    2   1   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
data      0a.26   0a    1   10  FC:A   0  FCAL 15000 272000/557056000  274845/562884296

Anyway, comparing the last ASUP to the output from maintenance mode, it
looks like the last disk to fail was 0a.44. When that disk failed it took
the aggr offline due to 3 failed disks in a single RAID group.

LAST ASUP:
Aggregate aggr0 (online, raid_dp, degraded) (block checksums)
Plex /aggr0/plex0 (online, normal, active)
RAID group /aggr0/plex0/rg0 (double degraded, block checksums)

RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   0a.44   0a    2   12  FC:A   0  FCAL 15000 272000/557056000  274845/562884296
parity    0a.48   0a    3   0   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.49   0a    3   1   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.16   0a    1   0   FC:A   0  FCAL 15000 272000/557056000  274845/562884296

MAINT MODE:

RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   FAILED          N/A                        272000/ -
parity    0a.48   0a    3   0   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.49   0a    3   1   FC:A   0  FCAL 15000 272000/557056000  280104/573653840
data      0a.16   0a    1   0   FC:A   0  FCAL 15000 272000/557056000  274845/562884296
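
The ASUP vs. maintenance mode comparison above can be sketched in a few lines of Python. This is just an illustration of the slot-by-slot comparison, not a tool: the rows are hand-copied from the two rg0 excerpts, and the helper name `newly_failed` is made up for this example. It assumes both listings show the slots in the same order.

```python
# (role, device) rows hand-copied from the two rg0 excerpts above.
ASUP_RG0 = [
    ("dparity", "0a.44"),
    ("parity", "0a.48"),
    ("data", "0a.49"),
    ("data", "0a.16"),
]
MAINT_RG0 = [
    ("dparity", "FAILED"),
    ("parity", "0a.48"),
    ("data", "0a.49"),
    ("data", "0a.16"),
]


def newly_failed(before, after):
    """Return (role, device) for slots that held a disk before and show FAILED now.

    Hypothetical helper: compares the listings position by position, so it
    only works if both list the slots in the same order.
    """
    return [
        (role, device)
        for (role, device), (_, current) in zip(before, after)
        if device != "FAILED" and current == "FAILED"
    ]


changed = newly_failed(ASUP_RG0, MAINT_RG0)
for role, device in changed:
    print(f"{device} ({role}) was present in the last ASUP and is FAILED in maintenance mode")
```

For these four rows the only slot that changed is the dparity disk 0a.44.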

IF, and that's a big IF...you can unfail disk 0a.44 you might be able to
get the aggr back online. Now, once the aggr is online and the controller
boots up you're gonna want to have some spares in there so some
reconstructs can start. I would expect disk 0a.44 to fail again at some
point in the near future. Hopefully you can get it to stay online long
enough for some recons to finish in rg0. Otherwise, you're looking at a
panic and the controller going down again.

What's your spare situation on the partner controller? Can you assign a
few to this controller?
Also, do you have backups? (just in case)