Up until a recent overhaul, I was using btrfs in raid1 to manage the 4 drives I had in my NAS. However, it’s been clear for a while that the momentum is behind zfs. It has more features, better stability, and generally inspires much more confidence when things go wrong. btrfs still has its place in managing single-device boot volumes, but for multiple physical devices, I would definitely recommend zfs over btrfs.
When I added a couple of new 16TB disks, I opted to create a new pool with a single mirror vdev. If I need to expand it in future, I’ll add another mirrored vdev to the pool.
Incidentally, this is one of the strengths of btrfs – you can add single disks to a raid1 volume, and btrfs will happily expand to use the storage. But it’s worth noting that “raid1” with btrfs is not raid at all in the traditional sense. It simply means “keep two copies of every block on separate devices”. The tradeoff, though, is that you can never survive more than a single disk failure: any two failed disks are almost certain to hold at least one block and its only mirror, so data loss is practically guaranteed, even if you have 10 disks.
With a 10-disk zfs pool made up of 5 mirror vdevs (which is functionally the same as traditional raid10), you could theoretically withstand up to 5 failed disks, so long as no two are from the same vdev. But of course you can only expand the pool two drives at a time (well, technically you could add a single drive as its own vdev, but the loss of any vdev means the pool is toast, so that would be very risky for your data).
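For the record, that kind of expansion is a single command – you add a whole mirror vdev to the pool at once. A rough sketch, with placeholder device paths standing in for whatever disks you’d actually be adding:
# Add a second mirror vdev to the existing pool (the wwn paths here are placeholders)
zpool add pool1 mirror /dev/disk/by-id/wwn-yyy /dev/disk/by-id/wwn-zzz
# Check that the new vdev shows up alongside the original mirror
zpool status pool1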
Enabling encryption and compression
These are two things I got wrong when setting up my zfs pool – I didn’t know about encryption or compression!
When I created the pool I did:
zpool create -o ashift=12 pool1 mirror /dev/disk/by-id/wwn-xxx /dev/disk/by-id/wwn-xxx
zfs create pool1/photos
zfs create pool1/videos
zfs create pool1/backup
This worked fine, but the resulting volumes were unencrypted and uncompressed. That’s particularly undesirable for the backup volume, which holds Time Machine backups from my laptop – data that should be fairly compressible, and that definitely should be encrypted.
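For anyone creating a pool from scratch, both can be enabled at creation time so that every dataset inherits them. Something like the following is what I could have run instead – the keyfile path matches what I use later, but the exact options are just an illustration (encryption=on picks zfs’s default cipher):
# Create the pool with compression and encryption set on the root dataset,
# so pool1/photos, pool1/videos and pool1/backup all inherit them
# (the keyfile must exist before running this)
zpool create -o ashift=12 \
  -O compression=zstd \
  -O encryption=on \
  -O keyformat=raw \
  -O keylocation=file:///etc/zfs/pool1.keyfile \
  pool1 mirror /dev/disk/by-id/wwn-xxx /dev/disk/by-id/wwn-xxx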
Much later I did:
zfs set compression=zstd-7 pool1/backup
But that only affects files written in the future; it doesn’t compress data that already exists.
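If you want to check what a dataset is actually doing, the relevant properties can be queried directly (compressratio covers everything in the dataset, so old uncompressed blocks drag it towards 1.00x):
# Show the configured algorithm and the overall compression ratio for the dataset
zfs get compression,compressratio pool1/backup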
Encryption is similar, in that there is no way to encrypt existing files without rewriting the volume. I found Jim Salter’s quick-start guide on Ars Technica an invaluable resource, but the short of it is that to compress and encrypt my backups I ran a zfs send | zfs receive job against a snapshot to migrate the data:
# Create the keyfile:
dd if=/dev/urandom bs=32 count=1 of=/etc/zfs/pool1.keyfile
# Create a snapshot:
zfs snapshot -r pool1/backup@20231008-encrypt
# Send the snapshot to a new volume with encryption and compression set:
zfs send pool1/backup@20231008-encrypt |
zfs receive \
-o encryption=on \
-o compress=zstd-7 \
-o keyformat=raw \
-o keylocation=file:///etc/zfs/pool1.keyfile \
pool1/backup-encrypted
# Destroy the original volume (-r also removes its snapshot), then rename the new one
zfs destroy -r pool1/backup
zfs rename pool1/backup-encrypted pool1/backup
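One follow-up worth knowing about: an encrypted dataset’s key has to be loaded before it can be mounted, for example after a reboot. A minimal sketch of the verification and key-loading steps, assuming the same keyfile as above (your distro’s zfs services may handle this automatically):
# Confirm the new volume is encrypted and compressed as expected
zfs get encryption,keystatus,compression,compressratio pool1/backup
# After a reboot, load the key from the keyfile and mount the dataset
zfs load-key pool1/backup
zfs mount pool1/backup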
Performance
It took a while.
The backup volume is 2.8TB, and I’m running only two mirrored disks, which are doing both the reads and the writes, so it took over 12 hours. The data rate was around 50 MB/sec, which isn’t too bad considering all the reads and writes are happening on the same mirror, and we’re doing compression and encryption at the same time, but it is very slow when dealing with terabytes of data. With more disks you should see better performance.
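If you want to keep an eye on the transfer rate while a job like that runs, zpool iostat gives a rough per-vdev picture – shown here purely as an illustration, not necessarily what I had running:
# Rough per-vdev read/write throughput, refreshed every 5 seconds
zpool iostat -v pool1 5
# (putting pv between zfs send and zfs receive is another common way to watch the stream rate)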
This server might be old now, but it’s an i7-4790T with 4 cores / 8 threads and 8 GB of RAM, so quite powerful in the realm of NAS hardware. I think any desktop-class machine you build these days would be fine with zfs, but once you start using features such as encryption and compression it really starts to load up, and you’re going to want at least 4 cores.
While doing the above migration, the load sat around 12 and there would regularly be 6 threads using 40-80% CPU each. I’m pretty sure the bottleneck was the disks in my case, but if I add a couple more mirrored vdevs to the pool that might start to change. 3×2 should be fine for normal use, but the CPU could start to become a limitation when copying large amounts of data around. I could certainly help that by using faster compression than zstd-7 though, so I’ll see how the performance is during backups and possibly lower it for future writes.
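If I do dial it back, it’s a one-line property change that only affects newly written blocks – existing data stays compressed as it was. The level here is just an example:
# Use a lighter zstd level (or lz4) for future writes; existing blocks are untouched
zfs set compression=zstd-3 pool1/backup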
Been following for years, and always find your experiences fascinating. Thanks for continuing to share with the community!
Thank you Michael, I often wonder if anyone reads these things!