Mailing List Archive

Re: Re: Suggestions for backup scheme?
On Tuesday, February 6, 2024 6:22:34 PM CET Grant Edwards wrote:
> On 2024-02-06, J. Roeleveld <joost@antarean.org> wrote:
> > On Tuesday, February 6, 2024 4:38:11 PM CET Grant Edwards wrote:

> >> I presume that boot/root on ext4 and home on ZFS would not require an
> >> initrd?
> >
> > Yes, that wouldn't require an initrd. But why would you limit this?
>
> Because I really, really dislike having to use an initrd. That's
> probably just an irrational 30-year-old prejudice, but over the
> decades I've found life to be far simpler and more pleasant without
> initrds. Maybe things have improved over the years, but way back when
> I did use distros that required initrds, they seemed to be a constant,
> nagging source of headaches.

In the past, initrds were a nightmare. Even the current tools (dracut,
genkernel) are a pain and force the user to do it their way.
The only initramfs generator I use is "bliss-initramfs", because it actually
works and doesn't get in the way.
And I don't build a new kernel for the server.

For my desktops and laptops, I embed the initramfs into the kernel using a
very simple set of files: a script with the commands and a config detailing
which files to include. The total size of both files is about 8K; they were
mostly grabbed from a howto page about 10 years ago and have stayed unchanged
since then. (I added a little script to update the config when library
versions change, but that is it.)
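
One way to do this with the kernel's stock mechanism (rather than my exact
files) is to point CONFIG_INITRAMFS_SOURCE at a file list in gen_init_cpio
syntax. A rough sketch with illustrative paths, not my actual config:

# in the kernel .config
CONFIG_INITRAMFS_SOURCE="/etc/kernel/initramfs.list"

# /etc/kernel/initramfs.list (gen_init_cpio syntax)
dir  /dev                               0755 0 0
nod  /dev/console                       0600 0 0 c 5 1
dir  /proc                              0755 0 0
dir  /sys                               0755 0 0
file /init        /etc/kernel/init.sh   0755 0 0
file /bin/busybox /bin/busybox          0755 0 0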

> > ZFS works best when given the FULL drive.
>
> Where do you put swap?

My swap is a ZFS volume. Using the recommended method of configuring it is
safe, and I have not seen any kind of lockup due to swap.
I did have some lockups due to a bug in the HBA driver, after some deranged
dev decided to change sensible defaults, but those would freeze the machine
before swap was even enabled.
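
The recipe I mean is roughly the one from the OpenZFS documentation (pool and
volume names are illustrative):

zfs create -V 8G -b $(getconf PAGESIZE) \
    -o compression=zle -o logbias=throughput -o sync=always \
    -o primarycache=metadata -o secondarycache=none \
    -o com.sun:auto-snapshot=false rpool/swap
mkswap -f /dev/zvol/rpool/swap
swapon /dev/zvol/rpool/swap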

> > For my server, I use "bliss-initramfs" to generate the initramfs and
> > have not had any issues with this since I started using ZFS.
> >
> > Especially the ease of generating snapshots also make it really easy
> > to roll back an update if anything went wrong. If your
> > root-partition isn't on ZFS, you can't easily roll back.
>
> True. However, I've never adopted the practice of backing up my root
> fs (except for a few specific directories like /etc), and haven't ever
> really run into situations where I wished I had. It's all stuff that
> can easily be reinstalled.

I did start backing up the full system, as restoring from backup (especially
rolling back a snapshot, but the same is true when grabbing the backup from
tape) is a lot faster than reinstalling all the software and making sure the
config (which these days isn't just in /etc anymore) is still the same.

--
Joost
Re: Re: Suggestions for backup scheme?
On 07/02/2024 11:07, J. Roeleveld wrote:
>> Because snapshotting uses so much less space?
>>
>> So much so that, for normal usage, I probably have no need to delete any
>> snapshots, for YEARS?
> My comment was based on using rsync to copy from the source to the backup
> filesystem.

Well, that's EXACTLY what I'm doing too. NO DIFFERENCE. Actually, there
is a minor difference - because I'm using lvm, I'm also using rsync's
"overwrite in place" switch. In other words, it compares source and
destination *in*place*, and if any block has changed, it overwrites the
change, rather than creating a complete new copy.

Because lvm is COW, that means I have two copies of the file, in two
different snapshots, but inasmuch as the files are identical, there's
only one copy of the identical bits.
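
The switch in question is rsync's --inplace. Very roughly, and with made-up
volume names, a backup cycle looks something like:

# refresh the backup copy in place (only changed blocks get rewritten)
rsync -a --inplace --delete /home/ /mnt/backup/home/
# then freeze that state as a new snapshot of the backup LV
lvcreate -s -L 2G -n backup_$(date +%F) vg_backup/lv_backup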
>
>> Okay, space is not an expensive commodity, and you don't want too many
>> snapshots, simply because digging through all those snapshots would be a
>> nightmare, but personally I wouldn't use a crude rsync simply because I
>> prefer to be frugal in my use of resources.

> What is "too many"?
> I currently have about 1800 snapshots on my server. Do have a tool that
> ensures it doesn't get out of hand and will remove several over time.
>
"Too many" is whatever you define it to be. I'm likely to hang on to my
/home snapshots for yonks. My / snapshots, on the other hand, I delete
anything more than a couple of months old.

If I can store several years of /home snapshots without running out of
space, why shouldn't I? The problem is, if I *am* running out of space, I'm
going to have to delete a *lot* of snapshots to make much difference...

Cheers,
Wol
Re: Re: Suggestions for backup scheme?
On 07/02/2024 11:11, J. Roeleveld wrote:
> On Tuesday, February 6, 2024 9:27:35 PM CET Wols Lists wrote:
>> On 06/02/2024 13:12, J. Roeleveld wrote:
>>>> Clearly Oracle likes this state of affairs. Either that, or they are
>>>> encumbered in some way from just GPLing the ZFS code. Since they on
>>>> paper own the code for both projects it seems crazy to me that this
>>>> situation persists.
>>>
>>> GPL is not necessarily the best license for releasing code. I've got some
>>> private projects that I could publish. But before I do that, I'd have to
>>> decide on a License. I would prefer something other than GPL.
>>
>> Okay. What do you want to achieve. Let's just lump licences into two
>> categories to start with and ask the question "Who do you want to free?"
>
> I want my code to be usable by anyone, but don't want anyone to fork it and
> start making money off of it without giving me a fair share.

Okay, that instantly says you want a copyleft licence. So you're stuck
with a GPL-style licence, and if they want to include it in a commercial
closed source product, they need to come back to you and dual licence it.

Personally, I'd go the MPL2 route, but that's my choice. It might not
suit you. But to achieve what you want, you need a copyleft, GPL-style
licence.
>
>> If that sounds weird, it's because both Copyleft and Permissive claim to
>> be free, but have completely different target audiences. Once you've
>> answered that question, it'll make choosing a licence so much easier.
>>
>> GPL gives freedom to the END USER. It's intended to protect the users of
>> your program from being held to ransom.
>
> That's not how the kernel devs handle the GPL. They use it to remove choice
> from the end user (me) to use what I want (ZFS).
> And it's that which I don't like about the GPL.
>
No. That's Oracle's fault. The kernel devs can't include ZFS in linux,
because Oracle (or rather Sun, at the time, I believe) deliberately
*designed* the ZFS licence to be incompatible with the GPL.

After all, there's nothing stopping *you* from combining Linux and ZFS,
it's just that somebody else can't do that for you, and then give you
the resulting binary.

At the end of the day, if someone wants to be an arsehole, there's not a
lot you can do to stop them, and with ZFS that honour apparently goes to
Sun.

Cheers,
Wol
Re: Suggestions for backup scheme?
On Tue, Jan 30, 2024 at 06:15:09PM -0000, Grant Edwards wrote:
> I need to set up some sort of automated backup on a couple Gentoo
> machines (typical desktop software development and home use). One of
> them used rsnapshot in the past but the crontab entries that drove
> that have vanished :/ (presumably during a reinstall or upgrade --
> IIRC, it took a fair bit of trial and error to get the crontab entries
> figured out).
>
> I believe rsnapshot ran nightly and kept daily snapshots for a week,
> weekly snapshots for a month, and monthly snapshots for a couple
> years.
>
> Are there other backup solutions that people would like to suggest I
> look at to replace rsnapshot? I was happy enough with rsnapshot (when
> it was running), but perhaps there's something else I should consider?

In my early backup times I, too, used rsnapshot to back up my ~ and rsync
for my big media files. But that only included my PC. My laptop was wholly
un-backed-up. I only synchronised much of my home and my audio collection
between the two with unison. At some point my external 3 TB drive became
free and then I started using borg to finally do proper backups.

Borg is very similar to restic; I actually used the two in parallel for a
while to compare them, but stayed with borg. One pain point was that I
couldn’t switch off restic’s own password protection. Since all my backup
disks are LUKSed anyway, I don’t need that.

Since borg works block-based, it does deduplication without extra cost and
it is suitable for big image files which don’t change much. I do full
filesystem backups of /, ~ and my media partition of my main PC and my
laptop. I have one repository for each of those three filesystems, and each
repo receives the data from both machines, so they are deduped. Since both
machines run Arch, their roots are binary identical. The same goes for my
unison-synced homes.
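
As a rough sketch (repo path and encryption choice are illustrative), both
machines simply push into the same repo and dedup against each other:

# one-time, on the backup disk
borg init --encryption=none /mnt/backup/arch-root

# on the PC (the laptop does the same with a tp_ prefix)
borg create --compression zstd,3 --one-file-system \
    /mnt/backup/arch-root::kern_{now:%Y-%m-%d} /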

Borg has retention logic built-in. You can say I want to keep the latest
archive of each of the last 6 days/weeks/months/years, and it even goes down
to seconds. And of course you can combine those rules. The only thing is
they don’t overlap, meaning if you want to keep the last 14 days and the
last four weeks, those weekly retentions start after the last daily
snapshots.
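
The pruning side then looks roughly like this (numbers for illustration only):

borg prune --prefix kern_ \
    --keep-daily 14 --keep-weekly 4 --keep-monthly 6 --keep-yearly 2 \
    /mnt/backup/arch-root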

In summary, advantages:
+ fast dedup, built-in compression (different algos and levels configurable)
+ big data files allow for quick mirroring of repositories.
I simply rsync my primary backup disk to two other external HDDs.
+ incremental backups are quite fast because borg uses a cache to detect
changed files quickly.
Disadvantages:
- you need borg to mount the backups
- it is not as fast as native disk access, especially during restore and
when getting a total file listing, due to lots of random I/O on the HDD.
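
For the mounting part, it's a FUSE mount, roughly:

borg mount /mnt/backup/data::tp_2024-02-04 /mnt/restore
# ... browse and copy files out ...
borg umount /mnt/restore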


As an example, I currently have 63 snapshots in my data partition repository:

# borg list data/
tp_2021-06-07 Mon, 2021-06-07 16:27:44 [5f9ebd9f24353c340691b2a71f5228985a41699d2e23473ae4e9e795669c8440]
kern_2021-06-07 Mon, 2021-06-07 23:58:56 [19c76211a9c35432e6a66ac1892ee19a08368af28d2d621f509af3d45f203d43]
[... 55 more lines ...]
kern_2024-01-14 Sun, 2024-01-14 20:53:23 [499ce7629e64cffb7ec6ec9ffbf0c595e4ede3d93f131a9a4b424b165647f645]
tp_2024-01-14 Sun, 2024-01-14 20:57:42 [ea2baef3e4bb49c5aec7cf8536f7b00b55fb27ecae3a80ef9f5a5686a1da30d5]
kern_2024-01-21 Sun, 2024-01-21 23:42:46 [71aa2ce6cf4021712f949af068498bfda7797b5d1c5ddc0f0ce8862b89e48961]
tp_2024-01-21 Sun, 2024-01-21 23:48:24 [45e35ed9206078667fa62d0e4a1ac213e77f52415f196101d14ee21e79fc393d]
kern_2024-02-04 Sun, 2024-02-04 23:16:43 [e1b015117143fad6b89cea66329faa888cffc990644e157b1d25846220c62448]
tp_2024-02-04 Sun, 2024-02-04 23:23:15 [e9b167ceec1ab9a80cbdb1acf4ff31cd3935fc23e81674cad1b8694d98547aeb]

The last “tp” (Thinkpad) snapshot contains 1 TB, “kern” (my PC) 809 GB.
And here you see how much space this actually takes on disk:

# borg info data/
[ ... ]
                     Original size      Compressed size    Deduplicated size
All archives:             56.16 TB             54.69 TB              1.35 TB

Obviously, compression doesn’t do much for media files. But it is very
effective in the repository for the root partitions:

# borg info arch-root/
[ ... ]
                     Original size      Compressed size    Deduplicated size
All archives:              1.38 TB            577.58 GB             79.41 GB

--
Grüße | Greetings | Salut | Qapla’
Please do not share anything from, with or about me on any social network.

“She understands. She doesn’t comprehend.” – River Tam, Firefly
Re: Suggestions for backup scheme?
On 8/2/24 06:36, Frank Steinmetzger wrote:
> On Tue, Jan 30, 2024 at 06:15:09PM -0000, Grant Edwards wrote:
>> I need to set up some sort of automated backup on a couple Gentoo
>> machines (typical desktop software development and home use). One of
>> them used rsnapshot in the past but the crontab entries that drove
>> that have vanished :/ (presumably during a reinstall or upgrade --
>> IIRC, it took a fair bit of trial and error to get the crontab entries
>> figured out).
>>
>> I believe rsnapshot ran nightly and kept daily snapshots for a week,
>> weekly snapshots for a month, and monthly snapshots for a couple
>> years.
>>
>> Are there other backup solutions that people would like to suggest I
>> look at to replace rsnapshot? I was happy enough with rsnapshot (when
>> it was running), but perhaps there's something else I should consider?
> In my early backup times I, too, used rsnapshot to back up my ~ and rsync
> for my big media files. But that only included my PC. My laptop was wholly
> un-backed-up. I only synchronised much of my home and my audio collection
> between the two with unison. At some point my external 3 TB drive became
> free and then I started using borg to finally do proper backups.
>
> Borg is very similar to restic, I actually used the two in parallel for a
> while to compare them, but stayed with borg. One pain point was that I
> couldn’t switch off restic’s own password protection. Since all my backup
> disks are LUKSed anyway, I don’t need that.
>
> Since borg works block-based, it does deduplication without extra cost and
> it is suitable for big image files which don’t change much. I do full
> filesystem backups of /, ~ and my media partition of my main PC and my
> laptop. I have one repository for each of those three filesystems, and each
> repo receives the data from both machines, so they are deduped. Since both
> machines run Arch, their roots are binary identical. The same goes for my
> unison-synced homes.
>
> Borg has retention logic built-in. You can say I want to keep the latest
> archive of each of the last 6 days/weeks/months/years, and it even goes down
> to seconds. And of course you can combine those rules. The only thing is
> they don’t overlap, meaning if you want to keep the last 14 days and the
> last four weeks, those weekly retentions start after the last daily
> snapshots.
>
> In summary, advantages:
> + fast dedup, built-in compression (different algos and levels configurable)
> + big data files allow for quick mirroring of repositories.
> I simply rsync my primary backup disk to two other external HDDs.
> + Incremental backups are quite fast because borg uses a cache to detect
> changed files quickly.
> Disadvantages:
> - you need borg to mount the backups
> - it is not as fast as native disk access, especially during restore and
> when getting a total file listing due to lots of random I/O on the HDD.
>
>
> As example, I currently have 63 snapshots in my data partition repository:
>
> # borg list data/
> tp_2021-06-07 Mon, 2021-06-07 16:27:44 [5f9ebd9f24353c340691b2a71f5228985a41699d2e23473ae4e9e795669c8440]
> kern_2021-06-07 Mon, 2021-06-07 23:58:56 [19c76211a9c35432e6a66ac1892ee19a08368af28d2d621f509af3d45f203d43]
> [... 55 more lines ...]
> kern_2024-01-14 Sun, 2024-01-14 20:53:23 [499ce7629e64cffb7ec6ec9ffbf0c595e4ede3d93f131a9a4b424b165647f645]
> tp_2024-01-14 Sun, 2024-01-14 20:57:42 [ea2baef3e4bb49c5aec7cf8536f7b00b55fb27ecae3a80ef9f5a5686a1da30d5]
> kern_2024-01-21 Sun, 2024-01-21 23:42:46 [71aa2ce6cf4021712f949af068498bfda7797b5d1c5ddc0f0ce8862b89e48961]
> tp_2024-01-21 Sun, 2024-01-21 23:48:24 [45e35ed9206078667fa62d0e4a1ac213e77f52415f196101d14ee21e79fc393d]
> kern_2024-02-04 Sun, 2024-02-04 23:16:43 [e1b015117143fad6b89cea66329faa888cffc990644e157b1d25846220c62448]
> tp_2024-02-04 Sun, 2024-02-04 23:23:15 [e9b167ceec1ab9a80cbdb1acf4ff31cd3935fc23e81674cad1b8694d98547aeb]
>
> The last “tp” (Thinkpad) snapshot contains 1 TB, “kern” (my PC) 809 GB.
> And here you see how much space this actually takes on disk:
>
> # borg info data/
> [ ... ]
> Original size Compressed size Deduplicated size
> All archives: 56.16 TB 54.69 TB 1.35 TB
>
> Obviously, compression doesn’t do much for media files. But it is very
> effective in the repository for the root partitions:
>
> # borg info arch-root/
> [ ... ]
> Original size Compressed size Deduplicated size
> All archives: 1.38 TB 577.58 GB 79.41 GB
>
I would also like to add my +1 to borgbackup ... I long ago lost the
ability to use snapshots and full-size backups due to the sheer amount
of data involved. Currently I use borg to back up multiple hosts to
individual backups on a dedicated machine (low-power ARM based, 6TB
drive). I also back up from the top level of the directory all those
repos are stored in to another ARM system (2TB drive), again using borg.
As each 1st-level backup only adds/changes a few chunks on each
iteration, the second level only takes minutes to run, as against
30 minutes or so for some of the individual hosts. The second level adds
redundancy if I lose the 1st-level backups, and the second can be
recreated at any time from the 1st level. This is working for ~15 hosts
and VMs of various types involving hundreds of terabytes of original data.
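
Very roughly, with made-up host names and paths, the two levels look like this:

# 1st level: run on each host, pushing to its own repo on the backup box
borg create ssh://backupbox/srv/borg/host1::{hostname}-{now} / /home

# 2nd level: run on the backup box, borg'ing the directory holding all repos
borg create ssh://backupbox2/srv/borg-mirror::repos-{now} /srv/borg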

The downside for VMs is that even a slight change to the image requires the
whole image to be read and checksummed to identify the changes to be
stored. For images hundreds of gigabytes in size (on my
hardware/network) it's actually quicker to mount and back up the internal
files (camera images in my case) than the VM image.

It is more complex than simple schemes, but I regularly restore from
both the first and second level backups for disaster
recovery/testing/rollbacks etc. There is a management package
(borgmatic) but I have not tried it, as I use my own scripts.

BillK
Re: Re: Suggestions for backup scheme?
On Wednesday, February 7, 2024 10:59:38 PM CET Wols Lists wrote:
> On 07/02/2024 11:11, J. Roeleveld wrote:
> > On Tuesday, February 6, 2024 9:27:35 PM CET Wols Lists wrote:
> >> On 06/02/2024 13:12, J. Roeleveld wrote:
> >>>> Clearly Oracle likes this state of affairs. Either that, or they are
> >>>> encumbered in some way from just GPLing the ZFS code. Since they on
> >>>> paper own the code for both projects it seems crazy to me that this
> >>>> situation persists.
> >>>
> >>> GPL is not necessarily the best license for releasing code. I've got
> >>> some
> >>> private projects that I could publish. But before I do that, I'd have to
> >>> decide on a License. I would prefer something other than GPL.
> >>
> >> Okay. What do you want to achieve. Let's just lump licences into two
> >> categories to start with and ask the question "Who do you want to free?"
> >
> > I want my code to be usable by anyone, but don't want anyone to fork it
> > and
> > start making money off of it without giving me a fair share.
>
> Okay, that instantly says you want a copyleft licence. So you're stuck
> with a GPL-style licence, and if they want to include it in a commercial
> closed source product, they need to come back to you and dual licence it.
>
> Personally, I'd go the MPL2 route, but that's my choice. It might not
> suit you. But to achieve what you want, you need a copyleft, GPL-style
> licence.

I'll have a look at that one.

> >> If that sounds weird, it's because both Copyleft and Permissive claim to
> >> be free, but have completely different target audiences. Once you've
> >> answered that question, it'll make choosing a licence so much easier.
> >>
> >> GPL gives freedom to the END USER. It's intended to protect the users of
> >> your program from being held to ransom.
> >
> > That's not how the kernel devs handle the GPL. They use it to remove
> > choice
> > from the end user (me) to use what I want (ZFS).
> > And it's that which I don't like about the GPL.
>
> No. That's Oracle's fault. The kernel devs can't include ZFS in linux,
> because Oracle (or rather Sun, at the time, I believe) deliberately
> *designed* the ZFS licence to be incompatible with the GPL.

Maybe it can't be included fully in the kernel, but there is nothing preventing
it from being packaged with a Linux distribution.
It's just the hostility from Linus Torvalds and Greg Kroah-Hartman towards ZFS
that causes the issues.

See the following post for a clear description (much better written than I
can):
https://eerielinux.wordpress.com/2019/01/28/zfs-and-gpl-terror-how-much-freedom-is-there-in-linux/

Especially the lkml thread linked from there:
https://lore.kernel.org/lkml/20190110182413.GA6932@kroah.com/

> After all, there's nothing stopping *you* from combining Linux and ZFS,
> it's just that somebody else can't do that for you, and then give you
> the resulting binary.

Linux (kernel) and ZFS can't be merged. Fine.
But, Linux (the OS, as in, kernel + userspace) and ZFS can be merged legally.

> At the end of the day, if someone wants to be an arsehole, there's not a
> lot you can do to stop them, and with ZFS that honour apparently goes to
> Sun.

See what I put above.

--
Joost
Re: Re: Suggestions for backup scheme?
On Wednesday, February 7, 2024 10:50:07 PM CET Wols Lists wrote:
> On 07/02/2024 11:07, J. Roeleveld wrote:
> >> Because snapshotting uses so much less space?
> >>
> >> So much so that, for normal usage, I probably have no need to delete any
> >> snapshots, for YEARS?
> >
> > My comment was based on using rsync to copy from the source to the backup
> > filesystem.
>
> Well, that's EXACTLY what I'm doing too. NO DIFFERENCE. Actually, there
> is a minor difference - because I'm using lvm, I'm also using rsync's
> "overwrite in place" switch. In other words, it compares source and
> destination *in*place*, and if any block has changed, it overwrites the
> change, rather than creating a complete new copy.

I must have missed that in the man-page last time I used rsync. Will have to
recheck and update my notes just in case I need to use rsync again in the
future.

> Because lvm is COW, that means I have two copies of the file, in two
> different snapshots, but inasmuch as the files are identical, there's
> only one copy of the identical bits.
>
> >> Okay, space is not an expensive commodity, and you don't want too many
> >> snapshots, simply because digging through all those snapshots would be a
> >> nightmare, but personally I wouldn't use a crude rsync simply because I
> >> prefer to be frugal in my use of resources.
> >
> > What is "too many"?
> > I currently have about 1800 snapshots on my server. Do have a tool that
> > ensures it doesn't get out of hand and will remove several over time.
>
> "Too many" is whatever you define it to be. I'm likely to hang on to my
> /home snapshots for yonks. My / snapshots, on the other hand, I delete
> anything more than a couple of months old.
>
> If I can store several years of /home snapshots without running out of
> space, why shouldn't I? The problem is, if I *am* running out of space, I'm
> going to have to delete a *lot* of snapshots to make much difference...

One of the things I didn't like about LVM was that it would have trouble
dealing with a lot (100+, due to a bug in my script at the time) of snapshots.
Another was having to manually (or via a script) increase the size given to
these snapshots when a lot of changes were occurring.

ZFS doesn't have this "max amount of changes", but will happily fill up the
entire pool keeping all versions available.
But it was easier to add zpool monitoring for this on ZFS than it was to add
snapshot monitoring to LVM.
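
For reference, the LVM side can at least partly auto-grow via lvm.conf, and
the zpool side is just a capacity check (thresholds made up):

# /etc/lvm/lvm.conf, activation section: grow a snapshot by 20% once it is 70% full
activation {
    snapshot_autoextend_threshold = 70
    snapshot_autoextend_percent = 20
}

# simple pool-capacity check to feed into monitoring
zpool list -H -o name,capacity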

I wonder, how do you deal with snapshots getting "full" on your system?

--
Joost
Re: Re: Suggestions for backup scheme?
On 08/02/2024 06:32, J. Roeleveld wrote:
>> Personally, I'd go the MPL2 route, but that's my choice. It might not
>> suit you. But to achieve what you want, you need a copyleft, GPL-style
>> licence.

> I'll have a look at that one.

Basically, each individual source file is copyleft, but not the work as
a whole. So if anybody copies/modifies YOUR work, they have to
distribute your work with their binary, but this requirement does not
extend to everyone else's work.
>
> Maybe it can't be included fully in the kernel, but there is nothing preventing
> it from being packaged with a Linux distribution.
> It's just the hostility from Linus Torvalds and Greg Kroah-Hartman towards ZFS
> that causes the issues.
>
> See the following post for a clear description (much better written than I
> can):
> https://eerielinux.wordpress.com/2019/01/28/zfs-and-gpl-terror-how-much-freedom-is-there-in-linux/
>
> Especially the lkml thread linked from there:
> https://lore.kernel.org/lkml/20190110182413.GA6932@kroah.com/
>
>> After all, there's nothing stopping*you* from combining Linux and ZFS,
>> it's just that somebody else can't do that for you, and then give you
>> the resulting binary.

> Linux (kernel) and ZFS can't be merged. Fine.

But they can.

> But, Linux (the OS, as in, kernel + userspace) and ZFS can be merged legally.
>
Likewise here, they can.

The problem is, the BINARY can NOT be distributed. And the problem is
the ZFS licence, not Linux.

What Linus, and the kernel devs, and that crowd *think* is irrelevant.
What matters is what SUSE, and Red Hat, and Canonical et al think. And
if they're not prepared to take the risk of distributing the kernel with
ZFS built in, because they think it's a legal minefield, then that's
THEIR decision.

That problem doesn't apply to gentoo, because it distributes the linux
kernel and ZFS separately, and combines them ON THE USER'S MACHINE. But
the big distros are not prepared to take the risk of combining linux and
ZFS, and distributing the resulting *derived* *work*.

Cheers,
Wol
Re: Re: Suggestions for backup scheme?
On 08/02/2024 06:38, J. Roeleveld wrote:
> ZFS doesn't have this "max amount of changes", but will happily fill up the
> entire pool keeping all versions available.
> But it was easier to add zpool monitoring for this on ZFS than it was to add
> snapshot monitoring to LVM.
>
> I wonder, how do you deal with snapshots getting "full" on your system?

As far as I'm concerned, snapshots are read-only once they're created.
But there is a "grow the snapshot as required" option.

I don't understand it exactly, but what I think happens is when I create
the snapshot it allocates, let's say, 1GB. As I write to the master
copy, it fills up that 1GB with CoW blocks, and the original blocks are
handed over to the backup snapshot. And when that backup snapshot is
full of blocks that have been "overwritten" (or in reality replaced),
lvm just adds another 1GB or whatever I told it to.

So when I delete a snapshot, it just goes through those few blocks,
decrements their use count (if they've been used in multiple snapshots),
and if the use count goes to zero they're handed back to the "empty" pool.

All I have to do is make sure that the sum of my snapshots does not fill
the lv (logical volume). Which in my case is a raid-5.
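
In practice that just means keeping an eye on something like this (VG name
made up):

# Data% shows how full each snapshot's CoW area is
lvs -o lv_name,lv_size,data_percent,origin vg_backup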

Cheers,
Wol
Re: Re: Suggestions for backup scheme?
On Thursday, February 8, 2024 6:36:56 PM CET Wols Lists wrote:
> On 08/02/2024 06:32, J. Roeleveld wrote:
> >> After all, there's nothing stopping*you* from combining Linux and ZFS,
> >> it's just that somebody else can't do that for you, and then give you
> >> the resulting binary.
> >
> > Linux (kernel) and ZFS can't be merged. Fine.
>
> But they can.

Not if you want to release it

> > But, Linux (the OS, as in, kernel + userspace) and ZFS can be merged
> > legally.
> Likewise here, they can.
>
> The problem is, the BINARY can NOT be distributed. And the problem is
> the ZFS licence, not Linux.

You can distribute the binary of both, just not embedded into a single binary.

> What Linus, and the kernel devs, and that crowd *think* is irrelevant.

It does matter, as they are actively working on removing API calls that
filesystems like ZFS actually need and hiding them behind a GPL-only wall.

> What matters is what SUSE, and Red Hat, and Canonical et al think. And
> if they're not prepared to take the risk of distributing the kernel with
> ZFS built in, because they think it's a legal minefield, then that's
> THEIR decision.

I'm not talking about distributing ZFS embedded into the kernel. It's
perfectly fine to ship a distribution with ZFS as a kernel module. The
issue is caused by the linux kernel devs blocking access to (previously
existing and open) API calls and limiting them to GPL-only code.

> That problem doesn't apply to gentoo, because it distributes the linux
> kernel and ZFS separately, and combines them ON THE USER'S MACHINE. But
> the big distros are not prepared to take the risk of combining linux and
> ZFS, and distributing the resulting *derived* *work*.

I would class Ubuntu as a big distribution, and Proxmox is also used a lot.
Both have ZFS support.

--
Joost
Re: Re: Suggestions for backup scheme?
On Thursday, February 8, 2024 6:44:50 PM CET Wols Lists wrote:
> On 08/02/2024 06:38, J. Roeleveld wrote:
> > ZFS doesn't have this "max amount of changes", but will happily fill up
> > the
> > entire pool keeping all versions available.
> > But it was easier to add zpool monitoring for this on ZFS than it was to
> > add snapshot monitoring to LVM.
> >
> > I wonder, how do you deal with snapshots getting "full" on your system?
>
> As far as I'm concerned, snapshots are read-only once they're created.
> But there is a "grow the snapshot as required" option.
>
> I don't understand it exactly, but what I think happens is when I create
> the snapshot it allocates, let's say, 1GB. As I write to the master
> copy, it fills up that 1GB with CoW blocks, and the original blocks are
> handed over to the backup snapshot. And when that backup snapshot is
> full of blocks that have been "overwritten" (or in reality replaced),
> lvm just adds another 1GB or whatever I told it to.

That works with a single snapshot.
But, when I last used LVM like this, I had multiple snapshots. When I
changed something on the LV, the original data would be copied to the snapshot.
If I had 2 snapshots for that LV, both would grow at the same time.

Or has that changed in recent versions?

> So when I delete a snapshot, it just goes through those few blocks,
> decrements their use count (if they've been used in multiple snapshots),
> and if the use count goes to zero they're handed back to the "empty" pool.

I know this is how ZFS snapshots work. But I'm not convinced LVM snapshots work
the same way.

> All I have to do is make sure that the sum of my snapshots does not fill
> the lv (logical volume). Which in my case is a raid-5.

I assume you mean PV (Physical Volume)?

I actually ditched the whole idea of raid-5 when drives got bigger than 1TB. I
currently use Raid-6 (or specifically RaidZ2, which is the ZFS "equivalent")

--
Joost
Re: Re: Suggestions for backup scheme?
On 09/02/2024 12:57, J. Roeleveld wrote:
>> I don't understand it exactly, but what I think happens is when I create
>> the snapshot it allocates, let's say, 1GB. As I write to the master
>> copy, it fills up that 1GB with CoW blocks, and the original blocks are
>> handed over to the backup snapshot. And when that backup snapshot is
>> full of blocks that have been "overwritten" (or in reality replaced),
>> lvm just adds another 1GB or whatever I told it to.

> That works with a single snapshot.
> But, when I last used LVM like this, I had multiple snapshots. When I
> changed something on the LV, the original data would be copied to the snapshot.
> If I had 2 snapshots for that LV, both would grow at the same time.
>
> Or has that changed in recent versions?

Has what changed? As I understand it, the whole point of LVM is that
everything is COW. So any individual block can belong to multiple snapshots.

When you write a block, the original block is not changed. A new block
is linked in to the current snapshot to replace the original. The
original block remains linked in to any other snapshots.

So disk usage basically grows by the number of blocks you write. Taking
a snapshot will use just a couple of blocks, no matter how large your LV is.
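
For what it's worth, that behaviour matches LVM thin provisioning, where
snapshots share blocks in a common pool. A rough sketch with made-up names:

# thin pool plus a thin volume inside it
lvcreate --type thin-pool -L 100G -n pool vg0
lvcreate --type thin -V 50G --thinpool vg0/pool -n home vg0
# thin snapshots need no pre-allocated size; blocks are shared via the pool
lvcreate -s -n home_snap1 vg0/home
lvcreate -s -n home_snap2 vg0/home
# what needs watching is the pool usage, not each snapshot's own size
lvs -o lv_name,data_percent vg0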
>
>> So when I delete a snapshot, it just goes through those few blocks,
>> decrements their use count (if they've been used in multiple snapshots),
>> and if the use count goes to zero they're handed back to the "empty" pool.
> I know this is how ZFS snapshots work. But I'm not convinced LVM snapshots work
> the same way.
>
>> All I have to do is make sure that the sum of my snapshots does not fill
>> the lv (logical volume). Which in my case is a raid-5.
> I assume you mean PV (Physical Volume)?

Quite possibly. VG, PV, LV. I know which one I need (by reading the
docs); I don't particularly remember which is which off the top of my head.
>
> I actually ditched the whole idea of raid-5 when drives got bigger than 1TB. I
> currently use Raid-6 (or specifically RaidZ2, which is the ZFS "equivalent")
>
Well, I run my raid over dm-integrity so, allegedly, I can't suffer disk
corruption. My only fear is a disk loss, which raid-5 will happily
recover from. And I'm not worried about a double failure - yes it could
happen, but ...
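
For anyone wanting to try that: the dm-integrity layer comes from cryptsetup's
integritysetup, with the raid built on top (device names made up; whether the
raid is md or LVM raid is a detail):

# give each member its own checksumming integrity layer
integritysetup format /dev/sdb
integritysetup open /dev/sdb int_sdb
# ... repeat for the other members, then assemble the array on the mappings
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
    /dev/mapper/int_sdb /dev/mapper/int_sdc /dev/mapper/int_sdd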

Given that my brother's ex-employer was quite happily running a raid-6
with maybe petabytes of data, over a double disk failure (until an
employee went into the data centre and said "what are those red
lights"), I don't think my 20TB of raid-5 is much :-)

Cheers,
Wol
Re: Re: Suggestions for backup scheme?
On Friday, 9 February 2024 15:48:45 GMT Wols Lists wrote:

> ... And I'm not worried about a double failure - yes it could happen,
> but ...
>
> Given that my brother's ex-employer was quite happily running a raid-6
> with maybe petabytes of data, over a double disk failure (until an
> employee went into the data centre and said "what are those red
> lights"), I don't think my 20TB of raid-5 is much :-)

[OT - anecdote]

I used to work in power generation and transmission (CEGB, for those with long
memories), in which every system was required to be fault tolerant - one fault
at a time. As Wol says, that's fine until your one fault has appeared and not
been noticed. Then another fault appears - and the reactor shuts down!
Carpeting comes next...

Oh, frabjous day!

[/OT]

--
Regards,
Peter.
