Mailing List Archive

tar exclude files question
Every so often I tar up my home directory on my main machine, and push
it over to my "hot backup" machine, and then do a tap-dance with the
.ssh directory. I notice oodles of cache files being tarred. Do I
understand the man page correctly about the CACHEDIR.TAG "magic file"?
Assume I have a .cache directory like so...

[x8940][waltdnes][~] ll .cache
total 64
drwx------ 12 waltdnes users 4096 Sep 5 10:50 .
drwxr-xr-x 141 waltdnes users 20480 Sep 5 10:58 ..
-rw-r--r-- 1 waltdnes users 0 Sep 5 10:50 CACHEDIR.TAG
drwx------ 2 waltdnes users 4096 Aug 29 15:34 babl
drwxr-xr-x 2 waltdnes users 4096 Sep 5 08:52 fontconfig
drwxr-xr-x 3 waltdnes users 4096 Jun 11 2021 geeqie
drwx------ 3 waltdnes users 4096 May 29 2021 gegl-0.4
drwxr-xr-x 3 waltdnes users 4096 May 29 2021 gimp
drwx------ 3 waltdnes users 4096 May 27 2021 google-chrome
drwx------ 2 waltdnes users 4096 Sep 5 10:23 mc
drwxr-xr-x 258 waltdnes users 4096 Mar 24 2022 mesa_shader_cache
drwx------ 3 waltdnes users 4096 May 25 2021 'moonchild productions'
drwx------ 3 waltdnes users 4096 Nov 16 2021 thumbnails

Would a script in /home like...

#!/bin/bash
tar --exclude-caches-under cvzf wdexport.tgz waltdnes

...skip files in that directory? I don't mind a few empty directories.

--
I've seen things, you people wouldn't believe; Gopher, Netscape with
frames, the first Browser Wars. Searching for pages with AltaVista,
pop-up windows self-replicating, trying to uninstall RealPlayer. All
those moments, will be lost in time like tears in rain... time to die.

Re: tar exclude files question
On Tue, Sep 5, 2023 at 5:28 PM Walter Dnes <waltdnes@waltdnes.org> wrote:

> Would a script in /home like...
>
> #!/bin/bash
> tar --exclude-caches-under cvzf wdexport.tgz waltdnes
>
> ...skip files in that directory? I don't mind a few empty directories.
>

Try it and see. My concern is that the man page implies that with
--exclude-caches-under the subdirectories are excluded recursively, but the
directory with the file called CACHEDIR.TAG is not.
I'm sure that's wrong but the man page says what it says.

--
Alan McKinnon
alan dot mckinnon at gmail dot com

Re: tar exclude files question
On Tuesday, 5 September 2023 16:28:31 BST Walter Dnes wrote:
> Would a script in /home like...
>
> #!/bin/bash
> tar --exclude-caches-under cvzf wdexport.tgz waltdnes
>
> ...skip files in that directory? I don't mind a few empty directories.

Have a look at this page which explains what you need to do:

https://bford.info/cachedir/

Alternatively, use something like:

--exclude=".cache/*"

for top level .cache directories.

Re: tar exclude files question
Michael wrote:
> Have a look at this page which explains what you need to do:
>
> https://bford.info/cachedir/
>
> Alternatively, use something like:
>
> --exclude=".cache/*"
>
> for top level .cache directories.


If it helps any, that looks like the way rsync does that as well. It
could be that they use the same coding even though they do different
things. It would be funny if the same person helped code both. lol

Just thought it worth a mention. 

Dale

:-)  :-)

Re: tar exclude files question
On Tue, Sep 05, 2023 at 07:38:54PM +0100, Michael wrote
>
> Have a look at this page which explains what you need to do:
>
> https://bford.info/cachedir/

Thank you! Thank you! Thank you! That page explains that any random
CACHEDIR.TAG file won't suffice, and why my attempts were all failing.

This file must be an ordinary file, not for example a symbolic link.
Additionally, the first 43 octets of this file *MUST* *MUST* *MUST*
consist of the following ASCII header string:

Signature: 8a477f597d28d172789f06886806bc55

Otherwise it will *NOT* work. This is *NOT* mentioned in "man tar".
https://www.gnu.org/software/tar/manual/tar.html#index-cachedir mentions
http://www.brynosaurus.com/cachedir/spec.html in passing, which
redirects to https://bford.info/cachedir/
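
For anyone else who hits this, here is a minimal sketch of writing a tag file that satisfies the signature rule quoted above. The helper name tag_cache_dir and the scratch directory are illustrative assumptions, not part of tar or of the spec.

```shell
#!/bin/bash
# Sketch: write a CACHEDIR.TAG whose first 43 octets are the required
# signature. tag_cache_dir is a hypothetical helper name.
tag_cache_dir() {
    local dir=$1
    # printf adds one trailing newline, giving a 44-byte regular file.
    printf 'Signature: 8a477f597d28d172789f06886806bc55\n' > "$dir/CACHEDIR.TAG"
}

# Demonstrate on a scratch directory rather than a real cache.
demo=$(mktemp -d)
tag_cache_dir "$demo"

# Show the first 43 octets that tar will compare against.
head -c 43 "$demo/CACHEDIR.TAG"; echo
```

To tag the real cache, the call would be tag_cache_dir "$HOME/.cache". An empty file created with touch fails the signature check, which is why the 0-byte CACHEDIR.TAG in the original listing had no effect.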

Re: tar exclude files question
On Tue, Sep 05, 2023 at 08:36:00PM +0200, Alan McKinnon wrote
>
> Try it and see. My concern is that the man page implies that with
> --exclude-caches-under the subdirectories are excluded recursively,
> but the directory with the file called CACHEDIR.TAG is not. I'm sure
> that's wrong but the man page says what it says.

Starting off with...

[x8940][waltdnes][~] ll .cache
total 68
drwx------ 12 waltdnes users 4096 Sep 5 17:45 .
drwxr-xr-x 141 waltdnes users 20480 Sep 5 18:38 ..
-rw-r--r-- 1 waltdnes users 44 Sep 5 17:42 CACHEDIR.TAG
drwx------ 2 waltdnes users 4096 Aug 29 15:34 babl
drwxr-xr-x 2 waltdnes users 4096 Sep 5 08:52 fontconfig
drwxr-xr-x 3 waltdnes users 4096 Jun 11 2021 geeqie
drwx------ 3 waltdnes users 4096 May 29 2021 gegl-0.4
drwxr-xr-x 3 waltdnes users 4096 May 29 2021 gimp
drwx------ 3 waltdnes users 4096 May 27 2021 google-chrome
drwx------ 2 waltdnes users 4096 Sep 5 10:23 mc
drwxr-xr-x 258 waltdnes users 4096 Mar 24 2022 mesa_shader_cache
drwx------ 3 waltdnes users 4096 Sep 5 17:12 'moonchild productions'
drwx------ 3 waltdnes users 4096 Nov 16 2021 thumbnails

tar --exclude-caches -cvzf x1.tgz .cache

...archives the following...

drwx------ waltdnes/users 0 2023-09-05 17:45 .cache/
-rw-r--r-- waltdnes/users 44 2023-09-05 17:42 .cache/CACHEDIR.TAG

tar --exclude-caches-all -cvzf x2.tgz .cache

...archives absolutely nothing; zip... zilch... nada

tar --exclude-caches-under -cvzf x3.tgz .cache

...archives...

drwx------ waltdnes/users 0 2023-09-05 17:45 .cache/

I'll gladly take any of them. One more thing; in this mode, you
*MUST* use a leading minus for "-cvzf". The lazy "cvzf" will *NOT*
work, and throws a misleading error message. See also my post to
Michael about the contents of the CACHEDIR.TAG file.
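
The comparison above can be repeated on a throwaway tree without touching a real home directory. This is only a sketch: it assumes GNU tar, and the directory and file names under the scratch area are made up for the demo.

```shell
#!/bin/bash
# Sketch: compare the three GNU tar cache-exclusion modes on a scratch tree.
# All paths below are invented for the demo.
work=$(mktemp -d)
mkdir -p "$work/home/.cache/thumbnails"
echo data > "$work/home/.cache/thumbnails/a.png"
echo keep > "$work/home/notes.txt"

# A valid tag: the file must start with this exact signature line.
printf 'Signature: 8a477f597d28d172789f06886806bc55\n' \
    > "$work/home/.cache/CACHEDIR.TAG"

for mode in --exclude-caches --exclude-caches-under --exclude-caches-all; do
    # Note the leading minus on -czf, since it follows a long option.
    tar "$mode" -czf "$work/out$mode.tgz" -C "$work" home
    echo "== $mode"
    tar -tzf "$work/out$mode.tgz"
done
```

tar prints a note on stderr for each tagged directory it skips. The three listings should differ only in whether .cache/ and its CACHEDIR.TAG appear, in line with the results reported above.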

Re: tar exclude files question
On Tuesday, 5 September 2023 23:32:04 BST Walter Dnes wrote:
> On Tue, Sep 05, 2023 at 07:38:54PM +0100, Michael wrote
>
> > Have a look at this page which explains what you need to do:
> >
> > https://bford.info/cachedir/
>
> Thank you! Thank you! Thank you! That page explains that any random
> CACHEDIR.TAG file won't suffice, and why my attempts were all failing.
>
> This file must be an ordinary file, not for example a symbolic link.
> Additionally, the first 43 octets of this file *MUST* *MUST* *MUST*
> consist of the following ASCII header string:
>
> Signature: 8a477f597d28d172789f06886806bc55
>
> Otherwise it will *NOT* work. This is *NOT* mentioned in "man tar".
> https://www.gnu.org/software/tar/manual/tar.html#index-cachedir mentions
> http://www.brynosaurus.com/cachedir/spec.html in passing, which
> redirects to https://bford.info/cachedir/

IKR, it's as if the usage of this mechanism is meant to remain secret and a
test of the patience and detective skills of the user. This is why I
suggested using --exclude=".cache/*" which works the same way - as long as the
cache directory you want to exclude is named ".cache".

The use of a CACHEDIR.TAG file works for any directory you want to exclude
from the backup, no matter what it is named. If you have a lot of directories
you always want to exclude, then adding the CACHEDIR.TAG file in each of them
is a one time action. Better than having to type all the exclude directives
in the CLI invocation of tar.

On the other hand, using --exclude=".cache/*" will catch any and all ".cache"
directories, wherever they happen to be in the tree.
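
A quick way to check the pattern behaviour described above is to run it against a scratch tree with a nested ".cache" directory. This is a sketch assuming GNU tar with its default (unanchored) exclude matching; the tree layout is invented for the demo.

```shell
#!/bin/bash
# Sketch: see which members --exclude='.cache/*' drops.
# The directory layout is made up for illustration.
work=$(mktemp -d)
mkdir -p "$work/home/.cache/gimp" "$work/home/project/.cache"
echo x > "$work/home/.cache/gimp/tmpfile"
echo y > "$work/home/project/.cache/obj"
echo z > "$work/home/project/main.c"

# Quote the pattern so the shell does not expand the asterisk.
tar --exclude='.cache/*' -czf "$work/home.tgz" -C "$work" home

# List what actually went into the archive.
tar -tzf "$work/home.tgz"
```

The ".cache" directories themselves still appear in the listing as empty entries, since the pattern only matches what is below them.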

Re: tar exclude files question
On Wednesday, 6 September 2023 00:10:14 BST Walter Dnes wrote:

> I'll gladly take any of them. One more thing; in this mode, you
> *MUST* use a leading minus for "-cvzf". The lazy "cvzf" will *NOT*
> work, and throws a misleading error message. See also my post to
> Michael about the contents of the CACHEDIR.TAG file.

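Ah, yes, the traditional vs UNIX vs GNU syntax of tar options. I tend to use
the "-" prefix on options, because the traditional dash-less style can be
quite nuanced: the bundled letters are only recognised as options when they
are the very first argument, which is why "cvzf" after a long option failed.
Search the man page for "Option styles" to get an idea.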
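
To make the difference concrete, here is a small sketch (assuming GNU tar; the scratch file names are invented) that creates the same archive in each option style and then tries the dash-less cluster after a long option.

```shell
#!/bin/bash
# Sketch: the three tar option styles, plus the form that fails.
work=$(mktemp -d)
echo hello > "$work/file.txt"

# Traditional style: a dash-less letter cluster as the FIRST argument.
tar czf "$work/a.tgz" -C "$work" file.txt

# UNIX style: the same letters with a leading minus, usable anywhere.
tar -czf "$work/b.tgz" -C "$work" file.txt

# GNU style: long options.
tar --create --gzip --file="$work/c.tgz" -C "$work" file.txt

# A dash-less cluster after another option is treated as a file name,
# so tar sees no operation mode and exits with an error.
if ! tar --exclude-caches-under czf "$work/d.tgz" -C "$work" file.txt 2>/dev/null; then
    echo "dash-less cluster after a long option: rejected"
fi
```

Sticking to the leading minus everywhere avoids having to remember which position the cluster is in.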

Re: tar exclude files question
On Wed, Sep 06, 2023 at 10:48:29AM +0100, Michael wrote
>
> IKR, it's as if the usage of this mechanism is meant to remain secret
> and a test of the patience and detective skills of the user.

I've filed a documentation bug with bug-tar@gnu.org. We'll see what
happens.