ZFS On Linux for Compressed & Encrypted Off-site Backups on External HDDs

Posted on 01/08/2016 by Brian Carey

In this post we will cover a recent implementation for a client with a somewhat unique backup requirement.  For years, this client has been running on a custom solution we put in place leveraging Bacula, standard database dump utilities, and rsync as needed to back up various devices on their network to a dedicated backup server.  The backup server stores all backups locally for a period of time based on type, but also rolls the data to a set of LUKS-encrypted off-site HDDs that are attached via a 4-bay, USB 3 enclosure.  These drives are rotated off-site weekly in a 4-week cycle.  There is nothing fancy about it: Bacula does most of the heavy lifting, with some custom scripts mixed in for databases or devices that are more easily handled outside of Bacula.  However, due to recent changes, they now need to store multiple large batches of data (~20 TB each) per week, indefinitely, both on site and off site.  It became clear very early that our existing toolset wasn't going to be an efficient option.

The Requirements

  • Be able to store large batches of data both on site and off site indefinitely using our existing dedicated backup server running CentOS 6 Linux.
  • At least initially, all of the data will need to be backed up from a Linux server in a remote office over a pre-existing VPN tunnel.
  • The data will not remain on the creating system indefinitely but will not be deleted for at least several weeks after creation.
  • While some of the data is already compressed using gzip, the rest is not, and a lot of that data is raw text.  Ideally we'd compress as much as possible to make the most of our storage purchases.
  • The compression cannot happen on the creating system, only on the backup system.
  • The drives going off site must be encrypted to meet legal requirements.
  • Data storage and transfer speed, while not critical, needs to be reasonably good given the exceptionally large data sizes.

The Plan

As with most cases where I need to back up a Linux system over the network, rsync immediately comes to the top of the list.  It's tried and true and makes life easy.  We decide we will purchase 5 x 4 TB HDDs for the existing external enclosure.  One will remain in the office for the on-site storage.  The remaining four will be rotated off-site using our normal schedule.  Once the drives are full, we'll repeat the process.  A quick script is created to rsync the needed locations from the remote system to the backup system, and we start doing some testing.  Using rsync's built-in transfer compression we shave some time off the transfer from remote to local, and things work just as expected.
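For illustration, a minimal sketch of that kind of sync script is shown below; the remote host name and paths are hypothetical stand-ins for the real locations.

#!/bin/bash
# Pull the needed locations from the remote office over the existing VPN tunnel.
# -a preserves ownership, permissions, and timestamps; -z compresses data in transit.
# "remote-server" and both paths are placeholders, not the real values.
rsync -az remote-server:/data/batch-exports/ /backup/path/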

Next, we decide to compress all of the uncompressed data on the backup server using gzip.  For example:

$ find /backup/path -type f -not -name "*.gz" -exec gzip {} \;

After a little while that completes, and we've cut our space utilization for the batch roughly in half.  Looking good!
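A quick before-and-after comparison of du output on the backup path is all the measurement needed here (path hypothetical):

$ du -sh /backup/path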

The Fail

Of course, the whole point of using rsync is that we can simply re-run the same remote site sync as needed, copying only the updates to the backup server.  Then we can also run rsync to sync those changes to the encrypted off-site HDDs.  But in our haste to compress and save space, we've changed our data on the backup server!  The next run of the remote-to-backup-server sync results in...a re-copy of all of the uncompressed files.  We've failed!
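A dry run makes the nature of the failure obvious: because each uncompressed file on the backup server was replaced by a differently named .gz copy, rsync sees the originals as missing and queues them all for transfer again.  Something along these lines (same hypothetical host and paths) lists everything that would be re-copied without actually copying it:

# rsync -azn --stats remote-server:/data/batch-exports/ /backup/path/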

The New Plan

Various ideas start being thrown around.  None of them sound very good, and all of them involve writing more custom scripts to try to compress only when copying the data to the off-site drives.  I keep poking holes in them, thinking of ways data could change on the source server but not get updated on the off-site drives.  Then I happen to think about ZFS and its built-in compression.  This would allow us to use rsync without manually compressing individual files, since the file system will take care of that.  I use it daily on FreeBSD, but I honestly haven't kept up on its status on Linux.  I do some research, and it seems that it's generally considered stable.  We have a new plan!  I do some initial testing with a simple test pool on a spare LVM volume just to make sure the compression is in line with what we saw before, and it is, so let's go with it.
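That initial test was nothing elaborate; roughly, it amounted to building a throwaway pool on the spare LVM volume, copying a sample of the data in, and checking the resulting compression ratio.  The device, pool, and sample path names below are hypothetical:

# zpool create -O compression=gzip testpool /dev/vg0/ztest
# rsync -a /backup/path/sample-batch/ /testpool/
# zfs get compressratio testpool
# zpool destroy testpool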

The Implementation

After a little more research, it's decided to use "ZFS on Linux" rather than the FUSE-based "zfs-fuse" implementation, due to it being more current.  Let's install it as follows:

# yum localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el6.noarch.rpm
# yum install kernel-devel zfs

Since I also updated the kernel as part of this, I rebooted and then verified that my normal ZFS commands functioned.
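The verification itself is just a quick sanity check; with the module loaded and no pools created yet, something like the following should simply report that there is nothing to show:

# modprobe zfs
# zpool status
# zfs list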

Now I'm ready to configure my HDDs.  For the purposes of this post, I'm only going to cover how to configure the encrypted drives going off-site; the drive staying local is configured in a similar fashion, just without the encryption.  In this example, my HDD device is /dev/sdx, and it's only a 2 TB drive for demonstration purposes.
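Before partitioning anything, it's worth double-checking that /dev/sdx really is the external drive you intend to wipe.  For example, the udev by-id symlinks include each drive's model and serial number:

# ls -l /dev/disk/by-id/ | grep sdx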

First, we need to partition the drive using parted.  In our case, we actually need two partitions: a tiny one that remains unencrypted, which lets our custom auto-mount script (not covered here) handle the LUKS partitions, and our main data partition.

# parted -a optimal /dev/sdx
(parted) mklabel gpt
Warning: The existing disk label on /dev/sdx will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) unit MB
(parted) mkpart primary 0% 1GB
(parted) mkpart primary 1GB 100%
(parted) p
Model: ST2000DM 001-9YN164 (scsi)
Disk /dev/sdx: 2000399MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End        Size       File system  Name     Flags
 1      1.05MB  1000MB     999MB                   primary
 2      1000MB  2000399MB  1999399MB               primary

Next, we set up /dev/sdx2 as our LUKS-encrypted partition.  I enter our standard passphrase when prompted.

# cryptsetup --cipher aes-cbc-essiv:sha256 --verbose --verify-passphrase luksFormat /dev/sdx2

Next, we add our standard encryption key.

# cryptsetup luksAddKey /dev/sdx2 /path/to/our/key

Next, we open the LUKS partition we just formatted, which creates a mapping for it under device mapper.

# cryptsetup luksOpen --key-file /path/to/our/key /dev/sdx2 example_zpool
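
Though optional, a quick status check confirms the mapping is active before we build on top of it:

# cryptsetup status example_zpool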

Now it's time to create our zpool.  We set the default mount point, turn on compression using gzip, and also use the ashift=12 option for advanced-format drives (if needed).  Note that there are other compression algorithms available; gzip is known to be slower than some of the others but compresses better.  In our case we want to save as much space as possible and aren't worried about it being slightly slower.

# zpool create -O mountpoint=/offsite/mountpoint -O compression=gzip -o ashift=12 example_zpool /dev/mapper/example_zpool
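
Not strictly necessary, but a quick look at the new pool confirms the settings took effect.  The compressratio property starts out at 1.00x and only becomes meaningful once data lands on the pool:

# zpool status example_zpool
# zfs get compression,compressratio example_zpool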

At this point, one could create one or more ZFS file systems on top of the pool.  In our case, we just want one large chunk of storage, so we'll use the pool as is.
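If separate file systems were wanted, say one per batch, creating them is a one-liner apiece; the dataset name here is just an example:

# zfs create example_zpool/batch01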

When it's time to unmount the drive and rotate it off-site, the process will be:

# zpool export example_zpool
# cryptsetup luksClose /dev/mapper/example_zpool 

And then, when it's time to re-mount an incoming drive, the process will be:

# cryptsetup luksOpen --key-file /path/to/our/key /dev/sdx2 example_zpool
# zpool import example_zpool

That pretty much covers it.  We can now rsync from the remote location to the backup server, where ZFS compresses the data to the extent possible.  Then we can rsync from the backup server to the off-site HDDs, where the data is both compressed and encrypted.  Obviously, the type of data being stored in this setup will make or break the effectiveness of the compression.
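The backup-server-to-off-site-drive step is again just rsync against the pool's mount point, and the pool's compressratio property gives a quick read on how well a given batch is compressing.  Roughly (backup path hypothetical):

# rsync -a /backup/path/ /offsite/mountpoint/
# zfs get compressratio example_zpool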

Please don't hesitate to contact us with any questions.