ZFS, Btrfs and mdadm: Battle tested in a RAID-5 setup
Introduction
Interested in Btrfs and ZFS and wondering how they deal with various problems? This article shares the results of a home-lab experiment in which I threw a series of failure scenarios at ZFS, Btrfs, and mdadm+dm-integrity in a RAID-5 setup. Based on these experiments, I also share some simple advice about dealing with problems in your storage array.
Myths and misunderstandings
There are a lot of myths and misunderstandings about ZFS and Btrfs. These are some of the myths and misunderstandings that I'll address in this article:

- Myth: ZFS requires tons of memory!
- Myth: Red Hat has removed Btrfs because they consider it useless!
- Myth: ZFS and Btrfs require ECC memory!
- Myth: Restoring a RAID-5 puts more stress on the drives!
- Myth: Using USB disk devices with ZFS or Btrfs is okay!
- Myth: Btrfs still has the write hole issue and is completely useless!
- Myth: Btrfs is abandoned!
- Myth: mdadm+XYZ can replace ZFS or Btrfs!
Some advice
Most data loss reported on the ZFS, Btrfs, and mdadm mailing lists comes down to user error while attempting to recover a failed array. Never use a trial-and-error approach when something goes wrong with your filesystem or storage array!
Before I begin, here's some advice:
If you value your data, always backup your important data. No RAID setup is a replacement for proper backup.
ZFS RAID-Z
Let's begin the testing with ZFS.
In this setup I have three disks and will therefore use RAID-Z (ZFS's equivalent of RAID-5), which can withstand the loss of one disk. The pool will still function in a degraded state, but I need to "resilver" it onto a replacement drive as soon as possible.
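RAID-Z, like classic RAID-5, stores one parity block per stripe, and any single missing block can be rebuilt by XOR-ing the surviving blocks with the parity. A minimal sketch of that arithmetic in plain shell (toy byte values of my own choosing, not the real ZFS on-disk format):

```shell
# Toy RAID-5 parity demo: three "data disks" each hold one byte,
# and the parity disk stores the XOR of all data bytes.
d1=173 d2=42 d3=201            # hypothetical data bytes (0-255)
parity=$(( d1 ^ d2 ^ d3 ))     # what the parity disk would store

# Simulate losing disk 2: rebuild its byte from parity + survivors.
rebuilt=$(( parity ^ d1 ^ d3 ))

echo "original d2: $d2, rebuilt d2: $rebuilt"
[ "$rebuilt" -eq "$d2" ] && echo "reconstruction OK"
```

This is also why two simultaneous failures are fatal in RAID-5: with two unknowns in the XOR equation there is no longer enough information to solve for either block.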
I'm going to create a RAID-Z pool with the devices /dev/disk/by-id/ata-ST31000340NS_9QJ089LF, /dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V, and /dev/disk/by-id/ata-ST31000340NS_9QJ0F2YQ:
sudo zpool create tank raidz /dev/disk/by-id/ata-ST31000340NS_9QJ089LF /dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V /dev/disk/by-id/ata-ST31000340NS_9QJ0F2YQ
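If you want to repeat these experiments without three spare drives, zpool can also build a pool on top of plain files (absolute paths are required for file vdevs). A sketch of a throwaway test setup — the pool name testtank and the /tmp paths are my own choices, and the zpool step is guarded because it needs root and ZFS installed:

```shell
# Create three sparse 1 GiB backing files to act as stand-in "disks".
mkdir -p /tmp/zfs-lab
for i in 1 2 3; do
    truncate -s 1G "/tmp/zfs-lab/disk$i.img"
done
ls -ls /tmp/zfs-lab   # sparse files: allocated size stays near zero

# Only attempt the pool if ZFS is actually installed.
if command -v zpool >/dev/null 2>&1; then
    sudo zpool create testtank raidz \
        /tmp/zfs-lab/disk1.img /tmp/zfs-lab/disk2.img /tmp/zfs-lab/disk3.img
    sudo zpool status testtank
else
    echo "zpool not installed; skipping pool creation"
fi
```

Destroy it with sudo zpool destroy testtank when done. File-backed vdevs are for experimentation only, never for real data.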
ZFS - Power outage
A RAID-5 can withstand the loss of one drive. Let's simulate a power outage during write operations and see how ZFS deals with it.
dd if=/dev/urandom of=/tank/dd.out.tar bs=1M &
echo $!
sudo poweroff -f

After the machine boots back up, check the state of the pool:

sudo zpool status tank
ZFS - Drive failure
Let's simulate a drive failure. I'll take one of the disks offline and check the state of the pool.

sudo zpool offline tank ata-ST31000340NS_9QJ0EQ1V
sudo zpool status tank
ZFS - Drive failure during file transfer
Let's simulate a drive failure while transferring a large file. I'll start a large write and take one of the disks offline while it is running.

time dd if=/dev/urandom of=/tank/dd.out.tar bs=1M &
sudo zpool offline tank ata-ST31000340NS_9QJ0EQ1V
sudo zpool status tank
ZFS - Data corruption during file transfer
Let's simulate data corruption during a file transfer. While dd is writing to the pool, I'll write random garbage directly onto one of the member disks, behind ZFS's back, and then scrub the pool.

time dd if=/dev/urandom of=/tank/dd.out.tar bs=1M &
sudo dd if=/dev/urandom of=/dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V bs=1M count=100 seek=1000
sudo zpool scrub tank
sudo zpool status tank
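Conceptually, a scrub compares every block against a stored checksum. The detection half of that can be reproduced with nothing but coreutils (the file name and offsets here are arbitrary):

```shell
# Write a test file and record its checksum, the way ZFS
# records a checksum for every block it writes.
dd if=/dev/urandom of=/tmp/blob.bin bs=1M count=4 2>/dev/null
before=$(sha256sum /tmp/blob.bin | awk '{print $1}')

# Silently change 16 bytes in the middle of the file, in place
# (conv=notrunc keeps the file size unchanged, like bit rot would).
dd if=/dev/urandom of=/tmp/blob.bin bs=1 count=16 seek=1048576 \
    conv=notrunc 2>/dev/null

# Re-checksumming exposes the corruption that a plain read would miss.
after=$(sha256sum /tmp/blob.bin | awk '{print $1}')
[ "$before" != "$after" ] && echo "corruption detected"
```

The difference is that ZFS does not stop at detection: because the checksum identifies which copy is bad, it can rebuild the block from parity and report the repair in zpool status.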
ZFS - The dd mistake
Time for a little fun. Let's simulate a case where a user accidentally issues dd if=/dev/zero against one of the pool's member disks instead of the intended target.

sudo dd if=/dev/zero of=/dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V bs=1G count=1
sudo zpool scrub tank
sudo zpool status tank
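What makes this class of mistake so destructive is that dd writes straight over its target, in place and without any confirmation. A small, safe demonstration on a throwaway file (path of my own choosing):

```shell
# Create a victim file with recognizable content.
printf 'important data that we definitely want to keep\n' > /tmp/victim.txt

# The fat-fingered dd: 16 bytes of zeros clobber the file in place.
dd if=/dev/zero of=/tmp/victim.txt bs=16 count=1 conv=notrunc 2>/dev/null

# The overwritten bytes are gone; there is no way to ask dd to undo this.
od -An -tx1 -N16 /tmp/victim.txt
grep -q 'important' /tmp/victim.txt || echo "original data destroyed"
```

On a plain filesystem that data is simply lost; the interesting question for ZFS is whether redundancy and checksums can save you when the target is a pool member rather than a file.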
ZFS - A second drive failure during a replacement
Let's simulate a second drive failure while replacing a failed drive. I'll start replacing the failed disk with a new one (/dev/sdd) and pull a second disk while the pool is resilvering.

sudo zpool replace tank ata-ST31000340NS_9QJ0EQ1V /dev/sdd
sudo zpool status tank
Btrfs RAID-5
Now let's switch over to Btrfs. I'll create a Btrfs RAID-5 filesystem on the same three disks and mount it at /mnt/btrfs_raid5:

sudo mkfs.btrfs -f -d raid5 -m raid5 /dev/disk/by-id/ata-ST31000340NS_9QJ089LF /dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V /dev/disk/by-id/ata-ST31000340NS_9QJ0F2YQ
sudo mount /dev/disk/by-id/ata-ST31000340NS_9QJ089LF /mnt/btrfs_raid5

After each test I can check the filesystem with a scrub, which prints per-device statistics with -d:

sudo btrfs scrub start -d /mnt/btrfs_raid5
Btrfs - Power outage
A RAID-5 can withstand the loss of one drive. Let's simulate a power outage during write operations and