Mailing List Archive

Snapshot storage
Hi,

For about a year we have had an F760, which now has 3 TB of raw storage.
During that time a snap reserve of 10% was sufficient (4 snapshots a day,
7 per week, 4 per month).
But now I have found that on one volume the snapshot demand has suddenly grown.

After looking around I found some users using their quota for backups
to our filer.
Those backups were deleted after 2 weeks and replaced by new backups.
This led to a large amount of data that is invisible to the quota system but
wastes a lot of snapshot storage.

How can I deal with that situation, and especially: how can I find users/qtrees
with a lot of changed data?

Regards
Stefan Holzwarth


------------------------------------------------------------------------------
Stefan Holzwarth
ADAC e.V. (Informationsverarbeitung - Systemtechnik - Basisdienste)
Am Westpark 8, 81373 München, Tel.: (089) 7676-5212, Fax: (089) 76768924
mailto:stefan.holzwarth@zentrale.adac.de
RE: Snapshot storage
Starting with the 6.2 release, we added a feature to help with this problem, at least for CIFS clients. If you turn on the cifs.snapshot_file_folding.enable option, the filer will compare newly written files with the same file in the last snapshot, and any identical blocks will not result in new allocation of space. The effect of this is to only allocate storage for blocks that have changed. This is also useful when you are doing SnapMirror, since only the changed blocks have to be transmitted.

Naturally, there is some performance impact to this, but it seems to be a good tradeoff for many customer environments.
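
As a rough illustration of the idea described above (not NetApp's implementation), the following sketch compares a rewritten file with its snapshot copy block by block and counts how many blocks would need new space. The 4096-byte block size and the function names are assumptions made for this example only.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def blocks_needing_allocation(snapshot_data: bytes, new_data: bytes,
                              block_size: int = BLOCK_SIZE) -> int:
    """Count blocks of new_data that differ from the block at the same
    position in snapshot_data. Identical blocks could keep sharing the
    snapshot's storage; only the differing ones would need new space."""
    old_blocks = split_blocks(snapshot_data, block_size)
    new_blocks = split_blocks(new_data, block_size)
    changed = 0
    for index, block in enumerate(new_blocks):
        if index >= len(old_blocks) or old_blocks[index] != block:
            changed += 1
    return changed


if __name__ == "__main__":
    old = b"a" * BLOCK_SIZE * 3
    new = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"a" * BLOCK_SIZE
    # Without folding, rewriting the file would consume three new blocks.
    print(blocks_needing_allocation(old, new))
```

In this toy model a backup that is rewritten with mostly unchanged content costs only its changed blocks, which is the saving the option is aiming for.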

Mark Muhlestein -- Network Appliance Engineering

Re: Snapshot storage
On Tue, Aug 06, 2002 at 11:14:29AM -0700, Muhlestein, Mark wrote:
> Starting with the 6.2 release, we added a feature to help with this problem,
> at least for CIFS clients. If you turn on the
> cifs.snapshot_file_folding.enable option, the filer will compare newly written
> files with the same file in the last snapshot, and any identical blocks will
> not result in new allocation of space. The effect of this is to only allocate
> storage for blocks that have changed. This is also useful when you are doing
> SnapMirror, since only the changed blocks have to be transmitted.

Is there any chance of adding this feature for NFS clients? It might
help us with an issue we have on one of our filers.

--
Deron Johnson
djohnson@amgen.com
Re: Snapshot storage
stefan.holzwarth@zentrale.adac.de wrote:

>...
>
>After looking around I found some users using their quota for backups
>to our filer.
>Those backups were deleted after 2 weeks and replaced by new backups.
>This led to a large amount of data that is invisible to the quota system but
>wastes a lot of snapshot storage.
>
>How can I deal with that situation, and especially: how can I find users/qtrees
>with a lot of changed data?
>
Hello Stefan

1) As the others already stated: for (not just ... :-) ) CIFS there is
this new file folding option. If you take a closer look at the "options"
man page explaining this new ONTAP 6.2 feature, you will see that it
works on any file being closed, regardless of whether it is a CIFS or
an NFS file close => it works for NFS, too. It was only named a CIFS
option because it is needed there for the "document.xxx gets renamed to
document.bak and then document.xxx is written again" behaviour of
MS applications... You usually don't need it if you have NFS clients
.... and ... if you don't have users who explicitly rewrite their data
over and over again .. like your users do. :-(

Maybe NetApp can/will? change this option to a wafl-option? cifs ->
wafl.snapshot_file_folding.enable


2) The command "filestats" should help you find those "bad" users.
It can also produce tables and HTML output. :-)
If you adapt example 3 to the limits you want (a limit on modification
age), you should find the "bad" users.
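
If the volume is also mounted on an admin host, a similar per-user view can be approximated from the client side. The sketch below is only an illustration under that assumption: it walks a directory tree and sums the bytes of recently modified files per numeric owner. The function name and the 14-day default are hypothetical choices for this example, not part of filestats.

```python
import os
import time
from collections import defaultdict


def recent_bytes_by_uid(root, max_age_days=14, now=None):
    """Return (uid, bytes) pairs for files under root whose modification
    time is within max_age_days, largest total first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_mtime >= cutoff:
                totals[st.st_uid] += st.st_size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Example: report on the current directory.
    for uid, size in recent_bytes_by_uid("."):
        print(f"uid {uid}: {size} bytes modified in the last 14 days")
```

Owners with large totals are candidates for the "rewrite the backup every two weeks" pattern, although file size on a client only approximates the space actually held in snapshots.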

Best regards from Munich/Germany.
Dirk Schmiedt
... can be found as NetApp-Trainer on www.qskills.de



NAME

filestats - collect file usage statistics

SYNOPSIS

filestats [ ages ages ] [ timetype {a,m,c,cr} ] [ sizes sizes ] snapshot snapshot_name [ style style ] [ volume volume_name ]

DESCRIPTION

The filestats utility provides a summary of file usage within a
volume. It must be used on a snapshot, and the only required argument is
the snapshot name. The volume name defaults to "vol0" if not specified.
If the volume you are examining is named otherwise, specify the name
explicitly.


....

EXAMPLES

1. Produce default file usage breakdowns for snapshot hourly.1 of
volume vol0.

filestats volume vol0 snapshot hourly.1

2. Produce file usage breakdowns by monthly age values:

filestats volume vol0 snapshot hourly.1 ages "30D,60D,90D,120D,150D,180D"

3. Produce file usage breakdowns for inodes whose size is less than
100000 bytes and whose access time is less than a day old:

filestats volume vol0 snapshot hourly.1 expr "{size}<100000&&{atimeage}<86400"

4. Produce a breakdown of the total number of files and their total
size. You can control the set of ages and sizes used for this breakdown
with the "ages" and "sizes" arguments. The output also contains a
breakdown of file usage by user-id and group-id.

filestats snapshot hourly.1 volume vol0





A very simple output example without any search limits, from a nearly
unused filer that currently has only one user: root (UID 0)


filer7*> filestats snapshot 1
VOL=vol0 SNAPSHOT=1
INODES=543328 COUNTED_INODES=906 TOTAL_BYTES=317956057 TOTAL_KB=42184

FILE SIZE     CUMULATIVE COUNT     CUMULATIVE TOTAL KB
1K                         465                    1580
10K                        853                    3464
100K                       882                    4500
1M                         896                    8696
10M                        901                   24164
100M                       906                   42184
1G                         906                   42184
MAX                        906                   42184

AGE(ATIME)    CUMULATIVE COUNT     CUMULATIVE TOTAL KB
0                            0                       0
30D                        906                   42184
60D                        906                   42184
90D                        906                   42184
120D                       906                   42184
MAX                        906                   42184

UID           COUNT     TOTAL KB
#0              906        42184

GID           COUNT     TOTAL KB
#0              906        42184
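
To rank users from a report like the one above, the UID table can be pulled out with a few lines of script. This is a minimal sketch that assumes the layout shown in this message (a header line starting with "UID", then rows of id, count and KB); the function name and the `#1001` row in the sample are made up for the example.

```python
def parse_uid_table(report: str) -> list[tuple[str, int, int]]:
    """Extract (uid, file_count, total_kb) rows from the UID table of a
    filestats-style report, sorted by total KB, largest first."""
    rows = []
    in_uid_table = False
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            in_uid_table = False  # a blank line ends the current table
            continue
        is_data_row = (len(fields) == 3
                       and fields[1].isdigit() and fields[2].isdigit())
        if is_data_row:
            if in_uid_table:
                rows.append((fields[0], int(fields[1]), int(fields[2])))
        else:
            # Any other line is a header; only "UID ..." opens the table.
            in_uid_table = fields[0] == "UID"
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    sample = """UID COUNT TOTAL KB
#0 906 42184
#1001 12 990000

GID COUNT TOTAL KB
#0 918 1032184
"""
    for uid, count, total_kb in parse_uid_table(sample):
        print(uid, count, total_kb)
```

Running filestats on two snapshots and comparing the per-UID totals would then hint at which users changed the most data between them.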
Re: Snapshot storage
> From owner-toasters@mathworks.com Wed Aug 7 12:54 MDT 2002
> From: Dirk Schmiedt <Dirk.Schmiedt@munich.netsurf.de>
>
>
> 2) The command "filestats" should help you find those "bad" users.
> It can also produce tables and HTML output. :-)
> If you adapt example 3 to the limits you want (a limit on modification
> age), you should find the "bad" users.
>
> Best regards from Munich/Germany.
> Dirk Schmiedt
> ... can be found as NetApp-Trainer on www.qskills.de
>
>
>
> NAME
>
> filestats - collect file usage statistics
>
> SYNOPSIS
>
> filestats [ ages ages] [ timetype {a,m,c,cr}] [ sizes sizes] snapshot
> snapshot_name [ style style] [ volume volume_name]
>
> DESCRIPTION
>
> The filestats utility provides a summary of file usage within a
> volume. It must be used on a snapshot, and the only required argument is
> the snapshot name. The volume name defaults to "vol0" if not specified.
> If the volume you are examining is named otherwise, specify the name
> explicitly.


This looks cool ... and awfully familiar! ;-)

I wrote a Perl program that walks one or more filesystems and generates
similar types of reports on pretty much everything reported by stat(). So if
you want another "flavor" of this, check out yadu, available at:
http://www.komar.org/
-> Misc. Tech Stuff
-> yadu

Some sample output can also be seen there - almost eerie how similar! ;-)

alek
RE: Snapshot storage
The snapshot folding happens in WAFL of course, but the trigger to cause a file to be scanned happens based on CIFS requests. If an NFS client overwrites a file with identical data it will result in new allocations. If a CIFS client does that when the folding option is "on" it will not result in new allocations if the old blocks are in the last snapshot. This works even if the data is written to a temp file which is subsequently renamed to the original file, as Windows applications typically do.

At some point we plan to add support for NFS, and/or to add a way to manually force a folding scan on a directory, etc. If people are interested in this let me know so I can add your comments to the feature request.

Mark Muhlestein -- Network Appliance Engineering



Re: Snapshot storage
Muhlestein, Mark wrote:

>The snapshot folding happens in WAFL of course, but the trigger to cause a file to be scanned happens based on CIFS requests. If an NFS client overwrites a file with identical data it will result in new allocations. If a CIFS client does that when the folding option is "on" it will not result in new allocations if the old blocks are in the last snapshot. This works even if the data is written to a temp file which is subsequently renamed to the original file, as Windows applications typically do.
>
>At some point we plan to add support for NFS, and/or to add a way to manually force a folding scan on a directory, etc. If people are interested in this let me know so I can add your comments to the feature request.
>
>Mark Muhlestein -- Network Appliance Engineering
>
Hello Mark

You are right. File folding currently really is CIFS only, not NFS. I read
the man page too fast. Sorry.

I think this would be a cool feature for NFS files, too. Especially for
those "I back up my local PC's files to the filer" guys ...
One more request for enhancement, given the performance impact of file
folding (ff): how about changing this ff option from a system-wide
setting to a per-volume option?
That way I could create one volume for NFS backups and CIFS users with
the low(er) ff performance and one without ff for the other
NFS clients...

Just my 2 cents.

Best regards!
Dirk
Re: Snapshot storage
On Thu, Aug 08, 2002 at 12:20:20AM +0200, Dirk Schmiedt wrote:
> >At some point we plan to add support for NFS, and/or to add a way to
> >manually force a folding scan on a directory, etc. If people are
> >interested in this let me know so I can add your comments to the feature
> >request.
> How about moving this ff option from a system wide
> range to a volume based option?

Hey, don't shorten their vision, they wanted to do it per directory ;)

(I'd like to have it per qtree)

p.
RE: Snapshot storage
At the moment the option is per-vfiler, which means if you are using MultiStore you can have different policies for each vfiler.

Thanks again for the comments and suggestions. I am keeping track of them.

Mark
