Mailing List Archive: file folding impact

file folding impact

Dec 11, 2002, 6:20 AM

Post #1 of 12 (4144 views)

Hi, I'm thinking about enabling file folding on one of our filers.
Has anybody tried this? If so, what kind of performance decrease/size gain
can be expected from enabling it?

---- Mats

AW: file folding impact [ In reply to ]

stefan.holzwarth at adac

Dec 11, 2002, 6:37 AM

Post #2 of 12 (4104 views)

We have been using file folding on all volumes for two months (F760, 2 TB,
3000 users, CIFS only). No noticeable impact on CPU usage. No trouble.
Snapshot size is decreasing a little.
Regards, Stefan Holzwarth

-----Original Message-----
From: Öberg Mats [mailto:mats.oberg@tietoenator.com]
Sent: Wednesday, December 11, 2002 14:20
To: toasters@mathworks.com
Subject: file folding impact

Hi, I'm thinking about enabling file folding on one of our filers.
Have anybody tried this? If so, what kind of performance decrease/size gain
can be expected from enabling it?

---- Mats

RE: file folding impact [ In reply to ]

rahul.kumar at eds

Dec 11, 2002, 9:19 PM

Post #3 of 12 (4107 views)

BTW, what does this option do?

Rahul

RE: file folding impact [ In reply to ]

Mitko.Blazeski at proact

Dec 12, 2002, 2:10 AM

Post #4 of 12 (4106 views)

File folding describes the process of checking the data in the most recent
snapshot, and, if it is identical to the snapshot currently being created,
just referencing the previous snapshot instead of taking up disk space
writing the same data in the new snapshot. Disk space is saved by sharing
unchanged file blocks between the active version of the file and the version
of the file in the latest snapshot, if any.

/Mitko

RE: file folding impact [ In reply to ]

Chuck.Tomasi at plexus

Dec 12, 2002, 7:32 AM

Post #5 of 12 (4109 views)

My question about file folding (after reading about it here and on NOW)...

Why would a block end up in a snapshot if it is the same as the previous
snapshot? Isn't the point of snapshots only to save the 'old version' of
blocks that have been changed?

OK, second question, does it only check the most recent snapshot or does it
drill through all existing snaps?

Now that I think more on the subject, third question... what happens to the
snapshot "n" that references blocks in snap "n+1" when "n+1" rolls off the
end and expires?

--Chuck

RE: file folding impact [ In reply to ]

brilong at cisco

Dec 12, 2002, 8:16 AM

Post #6 of 12 (4121 views)

> Why would a block end up in a snapshot if it is the same as the previous
> snapshot? Isn't the point of snapshots only to save the 'old version' of
> blocks that have been changed?

Chuck,

Microsoft designed CIFS, in their infinite wisdom, such that it rewrites
the entire file instead of updating changed blocks. NetApp's Data ONTAP
detects this and updates the changed blocks only. This feature is
proprietary to NetApp and is meant to save a lot of snapshot space.

Sorry I can't answer your other questions.

/Brian/
--
Brian Long
Americas IT Hosting Sys Admin
Phone: (919) 392-7363
Pager: (888) 651-2015
Cisco Systems

Re: file folding impact [ In reply to ]

scl at sasha

Dec 12, 2002, 2:37 PM

Post #7 of 12 (4114 views)

> My question about file folding (after reading about it here and on NOW)...
>
> Why would a block end up in a snapshot if it is the same as the previous
> snapshot? Isn't the point of snapshots only to save the 'old version' of
> blocks that have been changed?

Snapshots do not copy any blocks, so including the same block in
multiple snapshots incurs no overhead. It works like this. Each
data block in a volume has associated with it a bit map of length 32
(20 on older releases). These bits correspond to snapshots. For example,
if bit #3 is set, then the block is part of snapshot #3. The filer
keeps track of which logical name (e.g. "hourly.0") is associated with each
bit. The size of the bit map limits the number of snapshots that can
exist in a volume.

Here's what happens when you create a snapshot. The filer picks an
unused snapshot number and sets the corresponding bit for all blocks
currently in use by the live filesystem. Note that many of these
blocks may have had other bits set when other snapshots were
created previously. So a block can be a member of multiple snapshots.
It may even be a member of all snapshots. This is not unusual at all.
If a file is older than your oldest snapshot, then the blocks that
comprise the file are members of all snapshots.

When a snapshot is deleted, the bit corresponding to the snapshot
is cleared for each block that has the bit set. If a block is left
with no bits set and if the block is not part of the live filesystem,
then it is freed for reuse.
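
A minimal Python sketch of the bookkeeping described above may help. It is
a toy model, not NetApp code: the Volume class, its method names, and the
use of a plain integer as the per-block bit map are all assumptions made
for illustration.

```python
class Volume:
    """Toy model of per-block snapshot bit maps (hypothetical, not ONTAP code)."""

    MAX_SNAPSHOTS = 32  # bit map length mentioned in the post

    def __init__(self):
        self.snap_bits = {}  # block id -> integer bit map of snapshots holding it
        self.live = set()    # block ids used by the live filesystem
        self.names = {}      # snapshot name -> bit number

    def write_block(self, block_id):
        """Add a block to the live filesystem with no snapshot bits set."""
        self.live.add(block_id)
        self.snap_bits.setdefault(block_id, 0)

    def unlink_block(self, block_id):
        """Drop a block from the live filesystem; free it if no snapshot holds it."""
        self.live.discard(block_id)
        if self.snap_bits.get(block_id) == 0:
            del self.snap_bits[block_id]

    def create_snapshot(self, name):
        """Pick an unused bit and set it on every block in the live filesystem."""
        used = set(self.names.values())
        bit = next((b for b in range(self.MAX_SNAPSHOTS) if b not in used), None)
        if bit is None:
            raise RuntimeError("no free snapshot slot")
        self.names[name] = bit
        for block_id in self.live:
            self.snap_bits[block_id] |= 1 << bit
        return bit

    def delete_snapshot(self, name):
        """Clear the snapshot's bit everywhere; return the blocks that became free."""
        bit = self.names.pop(name)
        freed = []
        for block_id in list(self.snap_bits):
            self.snap_bits[block_id] &= ~(1 << bit)
            if self.snap_bits[block_id] == 0 and block_id not in self.live:
                del self.snap_bits[block_id]
                freed.append(block_id)
        return sorted(freed)


vol = Volume()
vol.write_block(1)
vol.write_block(2)
vol.create_snapshot("hourly.1")
vol.create_snapshot("hourly.0")
vol.unlink_block(1)                     # block 1 now exists only in snapshots
print(vol.delete_snapshot("hourly.0"))  # prints [] (hourly.1 still holds block 1)
print(vol.delete_snapshot("hourly.1"))  # prints [1] (last holder is gone)
```

Block 2 is never freed in this run because it is still part of the live
filesystem, even after both of its snapshot bits are cleared.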

So long as a block is a member of any snapshot, it cannot be modified,
so it cannot be freed for reuse. The only way to get that block back
is to delete all the snapshots that it belongs to.

You may wonder how you can possibly modify a file once it has been
snapshotted because you can't modify any of its blocks. The filer
simply makes changes in new data blocks and links them into the file
in place of the snapshotted blocks. The snapshotted blocks are left
untouched. However, they are no longer part of the live filesystem.
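
The relinking step can be sketched in a few lines of Python. This is an
illustration under stated assumptions: a file is modelled as a list of
block ids, a snapshot as a copy of that list (pointers only, no data), and
BlockStore and modify are hypothetical names, not anything from the filer.

```python
import itertools


class BlockStore:
    """Hypothetical block allocator: every write lands in a fresh block."""

    def __init__(self):
        self.data = {}
        self._ids = itertools.count()

    def alloc(self, payload):
        block_id = next(self._ids)
        self.data[block_id] = payload
        return block_id


def modify(store, file_blocks, index, payload):
    """Write the change to a new block and link it in place of the old one."""
    file_blocks[index] = store.alloc(payload)


store = BlockStore()
live_file = [store.alloc(b"AAAA"), store.alloc(b"BBBB")]
snapshot = list(live_file)  # the snapshot records block pointers, copies no data

modify(store, live_file, 1, b"bbbb")

print(store.data[snapshot[1]])   # old contents, still reachable via the snapshot
print(store.data[live_file[1]])  # new contents, in a freshly allocated block
```

The old block is never overwritten; it simply stops being referenced by
the live file while the snapshot keeps pointing at it.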

So when you look at a snapshot of a file, you are not looking at a copy.
You are looking at the actual data blocks that comprised the file when
the snapshot was made. In fact, the entire volume is treated this way
including the directories, inodes, etc. So when you look at a snapshot,
you are looking at the actual blocks that comprised the volume at the
moment the snapshot was taken. File permissions, owner, group, timestamps,
etc., are all frozen in time. (This can be a security issue. If a
sensitive file has been left world readable and you change the permissions
to protect it, you have to remember that any snapshotted copy of the file
still has the wrong permissions, so anyone can still read the file
in the snapshot. Your only option is to delete all snapshots where the
file is world readable.)

>
> OK, second question, does it only check the most recent snapshot or does it
> drill through all existing snaps?

Don't know.

>
> Now that I think more on the subject, third question... what happens to the
> snapshot "n" that references blocks in snap "n+1" when "n+1" rolls off the
> end and expires?

Don't think about what happens to the snapshot, think about what happens
to the blocks. If a block is a member of snapshots n and n+1, then the
block has at least two bits set (n and n+1). When n+1 expires, bit
n+1 is cleared for all blocks where it is set. Since bit n is still set,
the block is not freed, leaving snapshot n still intact.
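
As a small worked example in Python, treating a block's bit map as a plain
integer (an assumption for illustration; clear_snapshot_bit is a
hypothetical helper name):

```python
def clear_snapshot_bit(bitmap, snap):
    """Return the bit map with the given snapshot's bit cleared."""
    return bitmap & ~(1 << snap)


n = 4
bitmap = (1 << n) | (1 << (n + 1))          # block belongs to snapshots n and n+1
after_expiry = clear_snapshot_bit(bitmap, n + 1)

still_in_n = bool(after_expiry & (1 << n))  # bit n survives the expiry of n+1
can_be_freed = after_expiry == 0            # only true once no bits remain
print(still_in_n, can_be_freed)
```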

Steve Losen scl@virginia.edu phone: 434-924-0640

University of Virginia ITC Unix Support

RE: file folding impact [ In reply to ]

iso9 at jwiz

Dec 12, 2002, 3:59 PM

Post #8 of 12 (4109 views)

> -----Original Message-----
> From: owner-toasters@mathworks.com
> [mailto:owner-toasters@mathworks.com]On Behalf Of Steve Losen
> Sent: Thursday, December 12, 2002 1:37 PM
> To: Chuck Tomasi
> Cc: 'Mitko Blazeski'; 'Kumar, Rahul'; 'toasters@mathworks.com'
> Subject: Re: file folding impact
>
>
> > My question about file folding (after reading about it here and
> on NOW)...
> >
> > Why would a block end up in a snapshot if it is the same as the previous
> > snapshot? Isn't the point of snapshots only to save the 'old
> version' of
> > blocks that have been changed?
>
> Snapshots do not copy any blocks, so including the same block in
> multiple snapshots incurs no overhead. It works like this. Each
> data block in a volume has associated with it a bit map of length 32
> (20 on older releases) These bits correspond to snapshots. For example,
> if bit #3 is set, then the block is part of snapshot #3. The filer
> keeps track of which logical name (eg "hourly.0") is associated with each
> bit. The size of the bit map limits the number of snapshots that can
> exist in a volume.

[snip...]

Ok, all of that makes sense to me. But I am still unclear on what
filefolding does exactly.

This explanation fits in well with how I think, so if someone could present
the basic concept of filefolding in a similar manner, I would very much
appreciate it. :)

Thanks,
Jordan

Re: file folding impact [ In reply to ]

TCOOK at joy

Dec 12, 2002, 4:47 PM

Post #9 of 12 (4105 views)

As I understand it, with windoze systems, when you modify a file, you
completely rewrite it as a new file. The blocks that belonged to the "old"
file are maintained in the snapshot inode listing. This is not an issue with
Unix (as I understand it). So with file folding, the contents of the active
file system are compared against the most recent snapshot (only) to
determine whether blocks are duplicates and can be freed. Also, small files
(less than 64K, which really are written to the inode), NT streams, and
directories are not folded.

Re: file folding impact [ In reply to ]

scl at sasha

Dec 12, 2002, 4:51 PM

Post #10 of 12 (4103 views)

> Ok, all of that makes sense to me. But I am still unclear on what
> filefolding does exactly.
>
> This explanation fits in well with how I think, so if someone could present
> the basic concept of filefolding in a similar matter, I would very much
> appreciate it. :)

You had to ask ... :-)

When a client modifies a file, it often does it in a way that replaces
the entire file, by writing a whole new file from start to finish. This
is particularly true of editors and word processors. Let us suppose that
the old file is in a snapshot. When the client writes the new file, it
cannot overwrite the original file because its blocks are in a snapshot.
So the new file is written to freshly allocated blocks and the old blocks
are removed from the live filesystem, but remain in the snapshot of course.
If the modification to the file is very small, such as adding a few
sentences to the end of a large document, you end up duplicating a
lot of data. Up to the point where you added to the end of the document,
the blocks in the snapshot and the blocks in the live file contain
duplicate data. File folding detects this and "stitches" the old
snapshotted blocks back into the live file, and frees the freshly
allocated blocks. That way small changes to large snapshotted files
don't consume so much disk space.
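
Here is a rough Python sketch of that "stitching" step. It is only a guess
at the idea, not the filer's algorithm: it assumes blocks are compared at
the same offset against a single snapshot, that each freshly written block
is referenced only by the live file, and fold and the dict-based block
store are hypothetical names.

```python
def fold(live_blocks, snap_blocks, store):
    """Relink live blocks to identical snapshot blocks at the same offset.

    live_blocks and snap_blocks are lists of block ids; store maps a block
    id to its data. Returns the ids of the freshly written blocks that were
    freed.
    """
    freed = []
    for i, (live_id, snap_id) in enumerate(zip(live_blocks, snap_blocks)):
        if live_id != snap_id and store[live_id] == store[snap_id]:
            live_blocks[i] = snap_id  # stitch the snapshotted block back in
            del store[live_id]        # release the duplicate block
            freed.append(live_id)
    return freed


# The old document sits in blocks 0 and 1 (held by a snapshot). The client
# rewrote the whole file into blocks 2, 3, 4, adding text only at the end.
store = {0: b"page one", 1: b"page two",
         2: b"page one", 3: b"page two", 4: b"a few new sentences"}
snapshot = [0, 1]
live = [2, 3, 4]

print(fold(live, snapshot, store))  # prints [2, 3]
print(live)                         # prints [0, 1, 4]
```

Only block 4, which holds genuinely new data, still costs extra space
after folding.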

I don't know how clever or aggressive file folding is -- the more
thoroughly it looks for duplicated blocks, the more space it will
recover, but the more CPU it will consume. Because snapshots must
be preserved intact, you can only fold two blocks that have identical
data. If you add a single byte to the beginning of a text file, you
completely throw off the original block boundaries, making folding
impossible.
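
The block boundary effect is easy to demonstrate in Python. The sketch
below assumes a 4096-byte block size and same-offset comparison purely for
illustration; blocks and foldable_count are hypothetical helper names.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks(data, size=BLOCK_SIZE):
    """Split data into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def foldable_count(old, new, size=BLOCK_SIZE):
    """Count same-offset blocks that hold identical data in both versions."""
    return sum(a == b for a, b in zip(blocks(old, size), blocks(new, size)))


original = bytes(range(256)) * 64                # 16384 bytes, exactly 4 blocks
appended = original + b" a few more sentences"   # change at the end
prepended = b"X" + original                      # one byte added at the front

print(foldable_count(original, appended))   # prints 4
print(foldable_count(original, prepended))  # prints 0
```

Appending leaves every original block intact, while the single prepended
byte shifts all the data by one position so that no block matches.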

Steve Losen scl@virginia.edu phone: 434-924-0640

University of Virginia ITC Unix Support

RE: file folding impact [ In reply to ]

iso9 at jwiz

Dec 12, 2002, 5:13 PM

Post #11 of 12 (4104 views)

> -----Original Message-----
> From: owner-toasters@mathworks.com
> [mailto:owner-toasters@mathworks.com]On Behalf Of Timothy Cook
> Sent: Thursday, December 12, 2002 3:47 PM
> To: toasters@mathworks.com
> Subject: Re: file folding impact
>
>
> As I understand it, with windoze systems, when you modify a file,
> you completely rewrite it as a new file. Those blocks that
> belonged to the "old" file are maintained in the snapshot inode
> listing. This is not an issue with Unix (As I understand it). So
> with file folding, the contents of the active file system are
> compared against the most recent snapshot (only) to determine if
> blocks are duplicates and can be freed. Also, small files (less
> than 64K, which really are written to the inode) NT streams and
> directories are not folded.

It must be dependent on the application doing the writing. We are primarily
using our filer to hold Domino databases. Many of these are over half a
gig. They definitely get changed all the time, and if it were rewriting the
entire file, it seems like we would be using much more snapshot than we are,
not to mention the time it would take to rewrite the entire file.

Jordan

Re: file folding impact [ In reply to ]

scl at sasha

Dec 13, 2002, 8:19 AM

Post #12 of 12 (4114 views)

> It must be dependent on the application doing the writing. We are primarily
> using our filer to hold Domino databases. Many of these are over half a
> gig. They definitely get changed all the time, and if it were rewriting the
> entire file, it seems like we would be using much more snapshot than we are,
> not to mention the time it would take to rewrite the entire file.

Very true. Database applications tend to update individual file blocks
rather than rewrite the entire file. File folding isn't necessary in
that case.

For file folding to be useful, two things must be true:

1) The application rewrites the original file rather than updating
blocks "in place".

2) The original file and the new file have blocks with identical data
so that they can be folded.
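
A back-of-the-envelope Python sketch of why the two conditions matter. The
4096-byte block size, the function name new_blocks_consumed, and the "best
case" assumption that every unchanged block stays aligned and gets folded
are all mine, not anything measured on a filer.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def new_blocks_consumed(file_bytes, changed_blocks, rewrites_whole_file, folding):
    """Estimate how many new blocks one update costs while a snapshot holds the old file."""
    total_blocks = -(-file_bytes // BLOCK_SIZE)  # ceiling division
    if not rewrites_whole_file:
        return changed_blocks   # in-place update: only changed blocks are reallocated
    if folding:
        return changed_blocks   # best case: every unchanged block is folded back
    return total_blocks         # whole file rewritten, nothing recovered


half_gig = 512 * 1024 * 1024    # roughly the size of the databases mentioned

print(new_blocks_consumed(half_gig, 10, rewrites_whole_file=False, folding=False))
print(new_blocks_consumed(half_gig, 10, rewrites_whole_file=True, folding=False))
print(new_blocks_consumed(half_gig, 10, rewrites_whole_file=True, folding=True))
```

Under these assumptions folding only changes the outcome in the
whole-file-rewrite case, which matches the point that block-updating
database applications gain nothing from it.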

Steve Losen scl@virginia.edu phone: 434-924-0640

University of Virginia ITC Unix Support