ZFS, Btrfs and mdadm: Battle tested in a RAID-5 setup

Introduction

Interested in Btrfs and ZFS and wondering how they deal with various problems? This article shares the results of a home-lab experiment in which I threw some different problems at ZFS, Btrfs and mdadm+dm-integrity in a RAID-5 setup. I also share some simple advice about dealing with problems in your storage array based on my experiments.

Myths and misunderstandings

There are a lot of myths and misunderstandings about ZFS and Btrfs. These are some of the ones I'll address in this article:

Myth: ZFS requires tons of memory!
Myth: Red Hat has removed Btrfs because they consider it useless!
Myth: ZFS and Btrfs require ECC memory!
Myth: Restoring a RAID-5 puts more stress on the drives!
Myth: Using USB disk devices with ZFS or Btrfs is okay!
Myth: Btrfs still has the write hole issue and is completely useless!
Myth: Btrfs is abandoned!
Myth: mdadm+XYZ can replace ZFS or Btrfs!

Some advice

Most data loss reported on the ZFS, Btrfs, and mdadm mailing lists results from user error during attempts to recover a failed array. Never use a trial-and-error approach when something goes wrong with your filesystem or backup solution!

Before I begin, here's some advice:

If you value your data, always back up your important data. No RAID setup is a replacement for a proper backup.

ZFS RAID-Z

Let's begin the testing with ZFS.

In this setup, I have three disks and will therefore use RAID-Z (ZFS's RAID-5 equivalent), which can withstand the loss of one disk. If a disk fails, the pool will keep functioning in a degraded state, but I need to "resilver" it onto a replacement drive as soon as possible.

I'm going to create a RAID-Z pool with the devices /dev/disk/by-id/ata-ST31000340NS_9QJ089LF, /dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V, and /dev/disk/by-id/ata-ST31000340NS_9QJ0F2YQ:

sudo zpool create tank raidz \
  /dev/disk/by-id/ata-ST31000340NS_9QJ089LF \
  /dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V \
  /dev/disk/by-id/ata-ST31000340NS_9QJ0F2YQ
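With the pool created, a quick status check should show a single raidz1 vdev with all three disks online:

sudo zpool status tank
zfs list tank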

ZFS - Power outage

A RAID-5 can withstand the loss of one drive. Let's simulate a power outage during write operations and see how ZFS deals with it.

dd if=/dev/urandom of=/tank/dd.out.tar bs=1M &
echo $!
sleep 10   # let dd write some data first
sudo poweroff -f
# after the machine comes back up:
sudo zpool status tank
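If the status output reports errors after the reboot, a scrub re-reads every block, verifies it against its checksum, and repairs anything it can from parity (standard commands, shown as a sketch):

sudo zpool scrub tank
sudo zpool status -v tank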

ZFS - Drive failure

Let's simulate a drive failure. ZFS has no "stop" command the way mdadm does; the closest equivalent is exporting the pool. I'll export it, disconnect one of the disks, and then import the pool again in its degraded state.

sudo zpool export tank
echo "removing disk from array"
# physically disconnect ata-ST31000340NS_9QJ0EQ1V here
sudo zpool import tank
sudo zpool status tank
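If the "failed" disk was merely disconnected rather than actually dead, reattaching it and bringing it back online is enough; ZFS then resilvers only the writes the disk missed. A sketch:

sudo zpool online tank ata-ST31000340NS_9QJ0EQ1V
sudo zpool status tank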

ZFS - Drive failure during file transfer

Let's simulate a drive failure while transferring a large file.

time dd if=/dev/urandom of=/tank/dd.out.tar bs=1M &
echo $!
# fault one of the disks while the transfer is running
sudo zpool offline -f tank ata-ST31000340NS_9QJ0EQ1V
sudo zpool status tank
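The transfer should simply continue against the degraded pool, with RAID-Z reconstructing reads and absorbing writes from the remaining parity. A quick way to confirm nothing worse happened (zpool status -x prints only unhealthy pools):

sudo zpool status -x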

ZFS - Data corruption during file transfer

Let's simulate data corruption during a file transfer. While the transfer is running, I'll overwrite a chunk of one member disk directly, seeking past the start of the device so the vdev labels survive, and then scrub the pool.

time dd if=/dev/urandom of=/tank/dd.out.tar bs=1M &
# deliberately corrupt part of one member disk, 2 GiB in
sudo dd if=/dev/urandom of=/dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V bs=1M count=512 seek=2048
sudo zpool scrub tank
sudo zpool status -v tank
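After the scrub, zpool status -v lists any files with unrecoverable errors; with only one corrupted disk in a RAID-Z1 pool there should be none, since every damaged block can be rebuilt from parity. Once repaired, the error counters can be reset with a standard command:

sudo zpool clear tank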

ZFS - The dd mistake

Time for a little fun. Let's simulate the case where a user accidentally issues a dd if=/dev/zero of=/tank/dd.out.tar bs=1G on the same pool, followed by an abrupt power cut.

dd if=/dev/zero of=/tank/dd.out.tar bs=1G &
echo $!
sleep 10
sudo poweroff -f
# after the machine comes back up:
sudo zpool status tank

ZFS - A second drive failure during a replacement

Let's simulate a second drive failure while replacing a failed drive. I'll start replacing the disk that failed earlier and then fault a second disk mid-resilver (the replacement's device id is a placeholder here):

sudo zpool replace tank ata-ST31000340NS_9QJ0EQ1V /dev/disk/by-id/<new-disk>
# while the resilver is running, fault a second disk
sudo zpool offline -f tank ata-ST31000340NS_9QJ0F2YQ
sudo zpool status tank
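This is the one scenario RAID-Z1 cannot survive: with two disks gone there is no longer enough parity to reconstruct the data, and only a backup will save you. With four or more disks, raidz2 keeps two disks' worth of parity and withstands exactly this case. A sketch with hypothetical device names:

sudo zpool create tank raidz2 /dev/disk/by-id/disk1 /dev/disk/by-id/disk2 /dev/disk/by-id/disk3 /dev/disk/by-id/disk4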

Btrfs RAID-5

Now let's switch over to Btrfs, using the same three disks.
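First, the filesystem has to be created and mounted. A minimal sketch of the commands involved; pairing raid5 data with a raid1 metadata profile is a common recommendation, since it keeps the write hole away from metadata (the mount point matches the scrub command below):

sudo mkfs.btrfs -d raid5 -m raid1 \
  /dev/disk/by-id/ata-ST31000340NS_9QJ089LF \
  /dev/disk/by-id/ata-ST31000340NS_9QJ0EQ1V \
  /dev/disk/by-id/ata-ST31000340NS_9QJ0F2YQ
sudo mkdir -p /mnt/btrfs_raid5
sudo mount /dev/disk/by-id/ata-ST31000340NS_9QJ089LF /mnt/btrfs_raid5

An initial scrub confirms the new filesystem is healthy: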

sudo btrfs scrub start -d /mnt/btrfs_raid5

Btrfs - Power outage

A RAID-5 can withstand the loss of one drive. Let's simulate a power outage during write operations and see how Btrfs deals with it.
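A sketch of the same test as in the ZFS section, adapted to the Btrfs mount point; after the reboot, the per-device error counters and a scrub reveal any damage:

dd if=/dev/urandom of=/mnt/btrfs_raid5/dd.out.tar bs=1M &
echo $!
sleep 10   # let dd write some data first
sudo poweroff -f
# after the machine comes back up:
sudo btrfs device stats /mnt/btrfs_raid5
sudo btrfs scrub start -Bd /mnt/btrfs_raid5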
