Mailing List Archive

SOLVED: snapmirror source and target aggregate usage don't match?
Hello All,

This provided an opportunity to find and implement some improvements in our dedupe/snapmirror configuration but that wasn't the real problem.

We disabled dedupe on all but the two SVMs that were showing space savings of >10% or a couple of hundred GB, and changed the dedupe schedule to run hourly and with ample time to finish before the hourly snapmirror schedule.
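
For reference, a minimal sketch of the schedule change, with hypothetical SVM/volume names (svm1/vol1) and assuming clustered DOT 8.3 syntax:

    ::> volume efficiency show -vserver svm1 -volume vol1 -fields schedule
    ::> volume efficiency modify -vserver svm1 -volume vol1 -schedule sun-sat@0-23

The sun-sat@0-23 schedule starts a dedupe pass at the top of every hour, every day, leaving it most of the hour to finish before the hourly snapmirror transfer.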

The real problem turned out to be that while we configure all our data volumes on the source cluster as thin provisioned, DOT creates the target volumes on the secondary as "thick." And as the Storage Efficiency settings aren't exposed anywhere in the GUI on the secondary cluster for those volumes, it took some digging and five people staring at CLI output before we figured this out.

We changed all the target volumes to thin and will check/change new ones as they're created.
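
For anyone hitting the same thing, a minimal sketch of the check and the fix, with hypothetical names (svm2/vol1_dst) and assuming clustered DOT 8.3 syntax:

    ::> volume show -vserver svm2 -volume vol1_dst -fields space-guarantee
    ::> volume modify -vserver svm2 -volume vol1_dst -space-guarantee none

A space-guarantee of "volume" means thick provisioned; setting it to "none" makes the volume thin and gives the reserved-but-unwritten space back to the aggregate.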

Life is good.

Randy


From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Rue, Randy
Sent: Saturday, December 17, 2016 11:14 AM
To: toasters@teaparty.net
Subject: RE: snapmirror source and target aggregate usage don't match?

Thank you for the NetApp docs pointer from a list member and the opportunity for a little RTFM before I reply to the list!

Looks like my colleague's recollection of earlier versions still applies to 8.3. Essentially, we've been snapmirroring un-deduped data.

My next question is, is it realistic to run dedupes hourly on 30-40 volumes totalling ~100TB? Because that's a much easier proposition than amending our SLAs to lengthen our snapshot cycles.

And to get both on the same cycle, is it possible to make snapshots dependent on dedupe finishing, or do we just assume dedupe will complete? And if it doesn't, what are the consequences? For example, if a dedupe that usually finishes at 5 minutes after the hour isn't done and snapmirror runs then, will snapmirror be syncing a full hour of full-sized changes?
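
If there's no way to chain them, I assume the fallback is staggering the schedules so dedupe gets a head start, something like this sketch (hypothetical schedule name and destination path):

    ::> job schedule cron create -name hourly-at-45 -minute 45
    ::> snapmirror modify -destination-path svm2:vol1_dst -schedule hourly-at-45

That would give a dedupe pass started at the top of the hour about 45 minutes to finish before the transfer runs.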

Last question for now: assuming both are on the same schedule, how do I get the current lost space back? Will it be reclaimed when the schedules are synced and the snapshots have rolled off? Or do I need to destroy and recreate the target volumes?

Hope to hear from you, especially from any other shop running both snapmirror and dedupe.

Randy

(and if a solution requires DOT9 we do have an upgrade on our roadmap)

Replying to just you for now. Hopefully this will help and you can report back.
Look at this...
https://library.netapp.com/ecm/ecm_download_file/ECMLP2348026

Page 142.

I think you may need to work out some scheduling and that may help. Going to look a bit more... I have another idea, but that may only be ONTAP 9 related.



_____________________________
From: Rue, Randy <rrue@fredhutch.org>
Sent: Saturday, December 17, 2016 11:14 AM
Subject: snapmirror source and target aggregate usage don't match?
To: <toasters@teaparty.net>

Hello All,

We run two 8.3 filers with a list of vservers and their associated volumes, with each volume snapmirrored (volume level) from the active primary cluster to matching vserver/volumes on the passive secondary.

Both clusters have a similar set of aggregates of just about equal size. Both clusters' aggregates contain the same list of volumes of the same size, with the same space total/used/available on both sets.

But on the target cluster the same aggregates are reporting 30% more used space.

This is about on par with the dedupe savings we're getting on the primary, so when I first noticed it my thought was to check that dedupe was OK on the target. But the webUI reports that no "storage efficiency" is available on a replication target, and I ended up thinking this meant the secondary data would have to be full-sized. I even recall asking someone and having this confirmed, but can't recall whether that came from the vendor SE, our VAR SE, or a support tech.

Now we're approaching the space limit of the secondary cluster and I'm looking deeper. At this point, as it appears that for each volume the total/used/free space matches after dedupe on the source, I'm thinking that dedupe properties aren't exposed on the target but the data is still a true copy of the deduped original. This is supported by being able to view dedupe stats on the target via the CLI that show the same savings as on the source.
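
For the record, something like the following shows those stats on the target (hypothetical names svm2/vol1_dst; the exact field names are my best guess for 8.3):

    ::> volume efficiency show -vserver svm2 -volume vol1_dst
    ::> volume show -vserver svm2 -volume vol1_dst -fields sis-space-saved,sis-space-saved-percent

Both report the same savings we see on the source side.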

Note that we're also snapshotting these volumes, and while we're deduping daily, we're snapshotting hourly. A colleague mentioned remembering that this could mean mirrored data that's not deduped yet is being replicated full-size. But if so, wouldn't this be reflected in the dedupe stats on the target?

OK, just found that "storage aggregate show -fields usedsize,physical-used" on the primary/source cluster shows that used and physical-used are about identical for all aggrs. On the secondary/target, used is consistently larger than physical-used and the total difference makes up the 30% I'm "missing."
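
For anyone wanting to reproduce that, the comparison plus a per-aggregate breakdown (assuming clustered DOT 8.3 syntax):

    ::> storage aggregate show -fields usedsize,physical-used
    ::> storage aggregate show-space

The show-space output breaks used space down into volume footprints and aggregate metadata, which should show where the extra "used" space is going.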

Is this a problem with my reporting? Are we actually OK and I need to look at physical-used instead of used? Or if we're not OK, where is the space being used and can I get it back?

Thanks in advance for your guidance...

Randy
Re: SOLVED: snapmirror source and target aggregate usage don't match?
Interesting. I would have thought the destination would have been thin. Do you really need to run dedupe every day? I used to run it only once per week on my non-VDI VMware server vols. It was mostly a waste of CPU and the savings weren't enough to justify it.

Thanks for the update.

