Migrating an existing CentOS7 installation from ext4 to ZFS [ROOT on ZFS]

As a long-term ZFS user I have been running it on most of my production and home servers. But what about client machines? Bringing all of its nice features (snapshots, clones, checksums, etc.) to a laptop or desktop sounds like a great idea. Since ZFS is not included in the mainline Linux kernel, some Linux distributions have decided to integrate ZFS (ZoL) into their repositories; examples are Ubuntu, Arch and Gentoo Linux. CentOS can also run ZFS via the ZFS on Linux repository. This guide covers the last one (CentOS), since that’s what I mainly use at work. I decided to share my experience by writing this guide, hoping it will be interesting for you as well.

What you will need:

– Ubuntu 16.04 LTS Desktop Live CD.
– An existing CentOS7 installation (ext4, XFS, etc.).
– A spare hard drive to be used for the ZFS installation.

Initial preparation:

  1. Connect the spare hard drive (where ZFS is to be configured) to the machine. This can be done either internally using a SATA cable or externally via a USB enclosure, for example. The spare hard drive’s capacity must be equal to or larger than the source drive.
  2. Use Ubuntu LiveCD to boot the machine and select “Try Ubuntu” option.
  3. Once in the Ubuntu live environment, open the Terminal.
  4. First thing we need to do is to download ZFS packages for Ubuntu:

    root@ubuntu:/# apt-add-repository universe
    ‘universe’ distribution component enabled for all sources.

    root@ubuntu:/# apt update && apt -y install zfs-initramfs

    Ign:1 cdrom://Ubuntu 16.04.3 LTS _Xenial Xerus_ - Release amd64 (20170801) xenial InRelease
    Hit:2 cdrom://Ubuntu 16.04.3 LTS _Xenial Xerus_ - Release amd64 (20170801) xenial Release
    Get:3 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
    Hit:5 http://archive.ubuntu.com/ubuntu xenial InRelease
    Get:6 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [396 kB]
    ...
    The following additional packages will be installed:
      libnvpair1linux libuutil1linux libzfs2linux libzpool2linux zfs-doc zfs-zed zfsutils-linux
    Suggested packages:
      default-mta | mail-transport-agent nfs-kernel-server
    The following NEW packages will be installed:
      libnvpair1linux libuutil1linux libzfs2linux libzpool2linux zfs-doc zfs-initramfs zfs-zed zfsutils-linux
    0 upgraded, 8 newly installed, 0 to remove and 284 not upgraded.
    Need to get 901 kB of archives.
    ...
    Setting up zfs-initramfs (0.6.5.6-0ubuntu18) ...
    Processing triggers for libc-bin (2.23-0ubuntu9) ...
    Processing triggers for systemd (229-4ubuntu19) ...
    Processing triggers for ureadahead (0.100.0-19) ...
    Processing triggers for initramfs-tools (0.122ubuntu8.8) ...
    update-initramfs is disabled since running on read-only media

  5. Now you should have everything you need to create a ZFS pool, so let’s proceed with disk partitioning. Normally ZFS uses whole drives to store data, but in this case we need to reserve a small partition [~1MB] for GRUB to install the boot loader. So we’ll create 2 partitions: one for GRUB and a second for ZFS. Assuming that the destination hard disk is sdb, we use the following commands to create the partitions on it.

    5a. First clear any previous partitions on the destination disk.

    root@ubuntu:/#sgdisk --zap-all /dev/sdb

    5b. Then create a boot partition to be used by GRUB. Use this for legacy (BIOS) booting.

    root@ubuntu:/#sgdisk -a1 -n2:34:2047 -t2:EF02 /dev/sdb

    5c. Now create the 2nd partition to be used for ZFS data.

    root@ubuntu:/#sgdisk -n1:0:0 -t1:BF01 /dev/sdb

    5d. List partitions.

    root@ubuntu:/#gdisk -l /dev/sdb

    Number  Start (sector)  End (sector)  Size        Code  Name
    1       2048            488397134     232.9 GiB   BF01  --> ZFS Data
    2       34              2047          1007.0 KiB  EF02  --> GRUB boot

  6. Now that we have our hard drive partitioned, we need to create a ZFS pool on it. One very important thing to note here is that ZFS does not play well with the “/dev/sdb, /dev/sdc, etc.” naming for hard disks. What you should use instead is the “/dev/disk/by-id/xxxxxx” naming scheme. Be very careful at this point to select the correct hard drive and partition (you can use “gdisk -l /dev/disk/by-id/<drive_name>” to verify the partitions you created previously). Since I’m doing these tests in a virtual machine, you will notice that the hard disk shows up as “ata-VBOX…”. Make sure you replace that with your own hard drive.

    ** Create ZFS pool on the disk. Be sure to add “-part1” at the end, otherwise the zpool command will overwrite the boot partition. **

    root@ubuntu:/#zpool create -O atime=off -O canmount=off -O compression=lz4 -O mountpoint=/ -R /mnt rpool /dev/disk/by-id/ata-VBOX_HARDDISK_VB7d5f9023-2bfafc91-part1 -f

    ** List pool **
    root@ubuntu:/#zpool list

    NAME    SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
    rpool   79.5G  274K   79.5G  -         0%    0%   1.00x  ONLINE  /mnt

    Some important things to note about the above command and its results are the following:

    – The pool will be created with access time property disabled.
    – It should not mount itself.
    – The compression algorithm will be LZ4.
    – Default mountpoint will be (/). This is going to be used to properly mount CentOS ROOT fs later.
    – The alternate mount point will be (/mnt). This is the temporary mount point to be used during the live session (Ubuntu LiveCD) to mount the pool. It’s perfectly fine to select something else there, for example (/rpool). If you decide to do this, remember to replace (/mnt) with (/rpool) on the commands that will follow later in this guide.
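
    If you want to double-check that the pool actually picked up these properties, a quick look at the relevant ones is enough (an optional check; it assumes the pool name “rpool” used above):

    root@ubuntu:/#zfs get atime,canmount,compression,mountpoint rpool
    root@ubuntu:/#zpool get altroot rpool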

  7. [Optional] Create the rest of the datasets as needed. This step is mainly used to separate the ROOT (/) filesystem from other system directories, like /var/log and /home for example. This ensures that the contents of those directories are preserved during rollbacks of the ROOT filesystem (e.g. after a failed system upgrade).

    ** Create ROOT datasets **

    root@ubuntu:/#zfs create -o canmount=off -o mountpoint=none rpool/ROOT
    root@ubuntu:/#zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/centos

    ** Mount ROOT filesystem **

    root@ubuntu:/#zfs mount rpool/ROOT/centos
    root@ubuntu:/#df -h /mnt

    Filesystem         Size  Used  Avail  Use%  Mounted on
    rpool/ROOT/centos  223G  9.2G  213G   5%    /mnt
  8. ** Create the rest of the datasets **

    It’s perfectly fine not to create some (or all) of these datasets, but they will make your life much easier when, in the future, you have to roll back the ROOT filesystem while preserving the contents of the (/home) or (/var/log) directories, for example. Another thing to note is that most of them are created with the “legacy” mountpoint property. This means that these datasets will not mount themselves automatically during boot; instead they rely on the (/etc/fstab) file to mount them. Once everything is created, you can verify the layout with the quick check shown after the swap ZVOL below.

    root@ubuntu:/#zfs create -o mountpoint=legacy -o setuid=off rpool/home
    root@ubuntu:/#zfs create -o mountpoint=legacy -o setuid=off rpool/centos-test2
    root@ubuntu:/#zfs create -o mountpoint=legacy rpool/home/root
    root@ubuntu:/#zfs create -o canmount=off -o setuid=off -o exec=off rpool/var
    root@ubuntu:/#zfs create -o mountpoint=legacy -o com.sun:auto-snapshot=false rpool/var/cache
    root@ubuntu:/#zfs create -o mountpoint=legacy rpool/var/log
    root@ubuntu:/#zfs create -o mountpoint=legacy rpool/var/spool
    root@ubuntu:/#zfs create -o mountpoint=legacy -o com.sun:auto-snapshot=false -o exec=on rpool/var/tmp

    ** Create a ZVOL to be used for SWAP **

    This ZFS volume is going to be used as a SWAP partition. Make sure you adjust its size to your system’s needs.

    root@ubuntu:/#zfs create -V 4G -o compression=zle -o logbias=throughput -o sync=always -o primarycache=metadata -o secondarycache=none -o com.sun:auto-snapshot=false rpool/swap
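
    [Optional] Before moving on, you can sanity-check the dataset layout and give the swap ZVOL a swap signature. The mkswap step is not shown in the original procedure, but the swap entry added to /etc/fstab later expects one; treat this as a hedged extra step, assuming the pool and ZVOL names used above.

    root@ubuntu:/#zfs list -o name,canmount,mountpoint -r rpool
    root@ubuntu:/#mkswap /dev/zvol/rpool/swap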

  9. Copy the content of the original CentOS7 installation [source drive] to the ZFS pool. In the example below, my CentOS7 installation is located in an LV (root) in a VG named “centos_test”. “/sda” is a temporary mount point I created to mount the LV.

    ** Create a mountpoint for mounting the original CentOS installation disk, in this example “/sda” **

    root@ubuntu:/#mkdir /sda
    root@ubuntu:/#mount /dev/centos_test/root /sda

    ** Mount the boot partition (sda1). This is where the CentOS7 kernel, initramfs and GRUB files are located. It is a separate ~500MB ext4 partition. **

    root@ubuntu:/#mount /dev/sda1 /sda/boot

    ** rsync all content from the source disk to the destination pool. In this step we basically copy the whole CentOS7 installation from drive1 [LVM/ext4] to drive2 [ZFS]. **

    root@ubuntu:/#rsync -avPX /sda/ /mnt/
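
    As a rough sanity check that the copy landed on the pool, you can compare the used space on the pool against the source before unmounting it (optional; just a sketch):

    root@ubuntu:/#zfs list rpool
    root@ubuntu:/#df -h /mnt /sda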

    ** Unmount original (source) disk. From this point we don’t need our CentOS7 installation drive anymore, so it’s safe to unmount it **

    root@ubuntu:/#umount -R /sda

  10. Now we need to mount all previously created datasets under (/mnt) and prepare the chroot environment (CentOS). From this point on we leave the Ubuntu live environment and chroot into the CentOS7 ZFS environment, since there are still things to do, for example modifying the contents of the (/etc/fstab) file and configuring GRUB.

    root@ubuntu:/# mount -t zfs rpool/var/log /mnt/var/log
    root@ubuntu:/# mount -t zfs rpool/var/tmp /mnt/var/tmp
    root@ubuntu:/# mount -t zfs rpool/var/cache /mnt/var/cache
    root@ubuntu:/# mount -t zfs rpool/var/spool /mnt/var/spool
    root@ubuntu:/# mount -t zfs rpool/home /mnt/home
    root@ubuntu:/# mount -t zfs rpool/home/root /mnt/root

    root@ubuntu:/#mount -o bind /dev /mnt/dev
    root@ubuntu:/#mount -o bind /proc /mnt/proc
    root@ubuntu:/#mount -o bind /sys /mnt/sys
    root@ubuntu:/#chroot /mnt /bin/bash --login
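
    If yum later fails to resolve mirror hostnames from inside the chroot, a common fix is to copy the live environment’s resolver configuration into the target (run this from the live environment, i.e. before chrooting or from a second terminal). This is a hedged extra step, not part of the original procedure, and it assumes the standard /etc/resolv.conf location:

    root@ubuntu:/#cp /etc/resolv.conf /mnt/etc/resolv.conf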

  11. The first thing to do once in CentOS7 (chrooted) is to check its version.

    root@centos-test:/# lsb_release -a

    Distributor ID: CentOS
    Description:    CentOS Linux release 7.3.1611 (Core)
    Release:        7.3.1611
    Codename:       Core
  12. Install the ZFS packages for CentOS as described here: https://github.com/zfsonlinux/zfs/wiki/RHEL-and-CentOS

    ** Take note of the CentOS7 version; you will need it to download the
    ** proper ZFS package, as described in the URL above. In this case the installed CentOS version is 7.3, so I’m downloading the ZFS package for that version. **

    root@centos-test:/# lsb_release -a
    root@centos-test:/# yum install http://download.zfsonlinux.org/epel/zfs-release.el7_3.noarch.rpm
    root@centos-test:/# gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux

    ** Modify /etc/yum.repos.d/zfs.repo if needed. By default, dkms-type packages will be used. Please modify the repo file if you prefer kABI-type packages. **

    ** Install CentOS ZFS packages **

    root@centos-test:/# yum install zfs

    ** Install the ZFS dracut package, to create a ZFS-aware initramfs later **

    root@centos-test:/# yum install zfs-dracut
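
    Before going further it can be worth confirming that DKMS actually built the ZFS kernel modules against the CentOS kernel (an optional check, assuming the default dkms-style packages were installed):

    root@centos-test:/# dkms status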

    ** Modify the (/etc/fstab) file as follows (remove any existing entries, since they no longer apply to the ZFS-based CentOS7 setup). **

    rpool/var/cache /var/cache zfs defaults 0 0
    rpool/var/log /var/log zfs defaults 0 0
    rpool/var/spool /var/spool zfs defaults 0 0
    rpool/var/tmp /var/tmp zfs defaults 0 0
    rpool/home /home zfs defaults 0 0
    rpool/home/root /root zfs defaults 0 0
    /dev/zvol/rpool/swap none swap defaults 0 0

  13. Configure GRUB and the initramfs (dracut).

    ** Modify the dracut configuration as follows **

    root@centos-test:/# vi /etc/dracut.conf

    ** Uncomment and modify the add_dracutmodules line as follows: add_dracutmodules+="zfs" **

    ** Finally generate a new ZFS aware initramfs. Make sure you use your own kernel version in dracut parameters! **

    root@centos-test:/# dracut -f -M --kver 3.10.0-514.26.2.el7.x86_64

    Executing: /sbin/dracut -f -M --kver /boot/initramfs-3.10.0-514.26.2.el7.x86_64.img 3.10.0-514.26.2.el7.x86_64
    dracut module ‘busybox’ will not be installed, because command ‘busybox’ could not be found!
    dracut module ‘nbd’ will not be installed, because command ‘nbd-client’ could not be found!
    dracut module ‘biosdevname’ will not be installed, because command ‘biosdevname’ could not be found!
    *** Including module: bash ***
    *** Including module: fips ***
    *** Including module: modsign ***
    ….
    *** Hardlinking files ***
    *** Hardlinking files done ***
    *** Generating early-microcode cpio image contents ***
    *** Constructing AuthenticAMD.bin ****
    *** Constructing GenuineIntel.bin ****
    *** Store current command line parameters ***
    *** Creating image file ***
    *** Creating microcode section ***
    *** Created microcode section ***
    *** Creating image file done ***
    *** Creating initramfs image file ‘/boot/initramfs-3.10.0-514.26.2.el7.x86_64.img’
    done ***
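
    A quick way to confirm that the freshly generated initramfs is really ZFS-aware is to look for the zfs module and tools inside it (an optional check; adjust the image path to your own kernel version):

    root@centos-test:/# lsinitrd /boot/initramfs-3.10.0-514.26.2.el7.x86_64.img | grep -i zfs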

    *** Important! If you get a message saying “unknown filesystem” when running the grub2-install command, you will have to compile the latest version of GRUB from GitHub and run that version of grub-install to install the boot loader. ***

    root@centos-test:/# vi /etc/default/grub
    ** Modify GRUB_CMDLINE_LINUX as follows (leave only the defaults): **

    GRUB_CMDLINE_LINUX=”rhgb quiet”

    ** The following line is required to avoid an error when running grub2-mkconfig command later below **

    root@centos-test:/# export ZPOOL_VDEV_NAME_PATH=YES
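
    Before generating the config, you can also check whether the installed GRUB recognises the pool at all; it should print “zfs”. This is an optional, hedged check: if it prints anything else (or errors out), you will most likely hit the “unknown filesystem” problem mentioned above.

    root@centos-test:/# grub2-probe --target=fs /boot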

    ** Generate a new GRUB config **

    root@centos-test:/# grub2-mkconfig -o /boot/grub2/grub.cfg

    Generating grub configuration file …
    Found linux image: /boot/vmlinuz-3.10.0-514.26.2.el7.x86_64
    Found initrd image: /boot/initramfs-3.10.0-514.26.2.el7.x86_64.img
    Found linux image: /boot/vmlinuz-0-rescue-71ce3f23d5324e69aba211b4405fbf4c
    Found initrd image: /boot/initramfs-0-rescue-71ce3f23d5324e69aba211b4405fbf4c.img
    Found linux image: /boot/vmlinuz-0-rescue-27a3968f98aa4670a8ce5e4c952d8f77
    Found initrd image: /boot/initramfs-0-rescue-27a3968f98aa4670a8ce5e4c952d8f77.img
    done

    ** Install bootloader on ZFS drive **

    root@centos-test:/# grub2-install /dev/sdb
    Installing for i386-pc platform.
    Installation finished. No error reported.


  14. That’s all. If everything went well, you can now exit the chroot environment, unmount the ZFS datasets, export the pool, and finally shut down the machine and use the ZFS disk as the boot disk (the old hard drive can be unplugged).

    root@centos-test:/# exit
    root@ubuntu:/# umount -R /mnt
    root@ubuntu:/# zpool export rpool
    root@ubuntu:/# poweroff

  15. Boot from the ZFS disk and see if it works. Good luck!

Reinstalling GRUB on a non bootable UEFI Ubuntu 16.04 ZFS installation

You can use the steps below to reinstall GRUB on an Ubuntu 16.04 ROOT on ZFS installation.

Step 1: Prepare The Install Environment

1.1 Boot the Ubuntu Live CD, select Try Ubuntu Without Installing, and open a terminal (press Ctrl-Alt-T).

1.2 Optional: Install the OpenSSH server in the Live CD environment: If you have a second system, using SSH to access the target system can be convenient.

$ sudo apt-get --yes install openssh-server

Set a password on the “ubuntu” (Live CD user) account:

$ passwd

Hint: You can find your IP address with ip addr show scope global. Then, from your main machine, connect with ssh ubuntu@IP.

1.3 Become root:

# sudo -i

1.4 Install ZFS in the Live CD environment:

# apt-add-repository universe
# apt update

(ignore errors about moving an old database out of the way)

# apt install --yes debootstrap gdisk zfs-initramfs grub-efi-amd64

Step 2: Discover available ZFS pools

2.1 check if ZFS pools are already imported

# zpool list
# zfs list 

2.2 If so, we need to export the ZFS pool so that we can re-import it under a different directory and chroot into it

# zpool export rpool

Step 3: Chroot into ZFS pool

3.1 Import the pool to a non-default location. The -N flag (don’t automatically mount) is necessary because otherwise the rpool root dataset and rpool/ROOT/ubuntu would both try to mount on /mnt

# zpool import -a -N -R /mnt
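
(Optional) You can quickly confirm that the pool imported and that nothing is mounted yet:

# zpool status
# zfs list -o name,mountpoint,mounted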

3.2 mount the root system

# zfs mount rpool/ROOT/ubuntu

3.3 mount the remaining file systems

# zfs mount -a

3.4 Bind the virtual filesystems from the LiveCD environment to the new system and chroot into it:

# mount --rbind /dev  /mnt/dev
# mount --rbind /proc /mnt/proc
# mount --rbind /sys  /mnt/sys
# chroot /mnt /bin/bash --login

Note: This is using --rbind, not --bind.

Step 4: Re-initialise EFI partitions on all root pool components

4.1 Check the wildcard gets the correct root pool partitions:

# for i in /dev/disk/by-id/*ata*part3; do echo $i; done

4.2 Re-create the EFI filesystem on each root pool disk and add an entry for /boot/efi for each one to /etc/fstab, for failover purposes in the future:

# for i in /dev/disk/by-id/*ata*part3; \
      do mkdosfs -F 32 -n EFI ${i}; \
      echo PARTUUID=$(blkid -s PARTUUID -o value \
      ${i}) /boot/efi vfat defaults 0 1 >> /etc/fstab; done
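
To confirm the loop did what was expected, you can check the appended entries (optional):

# grep /boot/efi /etc/fstab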

4.3 mount the first disk

# mount /dev/disk/by-id/scsi-SATA_disk1-part3 /boot/efi

4.4 install grub

# grub-install --target=x86_64-efi --efi-directory=/boot/efi \
      --bootloader-id=ubuntu --recheck --no-floppy

4.5 unmount the first partition

# umount /boot/efi

4.6 mount the second disk

# mount /dev/disk/by-id/scsi-SATA_disk2-part3 /boot/efi

4.7 install grub

# grub-install --target=x86_64-efi --efi-directory=/boot/efi \
      --bootloader-id=ubuntu --recheck --no-floppy

4.8 repeat steps 4.5 to 4.7 for each additional disk

4.9 For added insurance, do an MBR installation to each disk too

# grub-install /dev/disk/by-id/scsi-SATA_disk1
# grub-install /dev/disk/by-id/scsi-SATA_disk2

Step 5: Reboot

5.1 Quit from the chroot

# exit

5.2 Reboot

# reboot

Using DRBD block level replication on Windows

WDRBD or Windows DRBD

DRBD is a well known distributed replicated storage system for Linux. Recently a company ported the DRBD kernel driver and userspace utilities to Windows, so it is now possible to set up DRBD resources on a Windows machine. DRBD is a block-level storage replication system (similar to RAID-1) used in highly available storage setups. You can use either a Desktop or a Server Windows OS, but a Server edition is recommended if this is intended for production use.

What you will need:
– 2 x Windows Server machines (Win2012 R2 in my case)
– DRBD binaries from here
– A dedicated volume (disk) to be replicated by DRBD. You can also use an NTFS volume with existing data; for example, you can use this method to replicate an existing Windows file server to a second Windows server. However, in that case you will need to resize (shrink) the server’s partition in order to create the second, small partition needed for DRBD meta-data.
– Optionally a dedicated network for DRBD replication.

Configuration:

You must follow these steps on both nodes.

– Setup both Windows machines with static IP addresses. In my case I will use 10.10.10.1 for node1 and 10.10.10.2 for node2. Also provide a meaningful hostname on each server since you will need this during DRBD configuration. In my case node1: wdrbd1 and node2: wdrbd2 .
– Install DRBD binaries by double clicking on setup file and following the wizard. Finally reboot both servers.
– Navigate to the “Program Files\drbd\etc” and “Program Files\drbd\etc\drbd.d” folders and rename (or create a copy of) the following files:

drbd.conf.sample –> drbd.conf
   global_common.conf.sample –> global_common.conf

(Note: For this test we do not need to modify the content of the above files. However, you may need to do so in other scenarios.)

– Create a resource config file in “Program Files\drbd\etc\drbd.d”

r0.res (you can copy the contents from the existing sample config file)

A simple resource config file should look like this:

resource r0 {
      on wdrbd1 {
            device          e   minor 2;
            disk            e;
            meta-disk       f;
            address      10.10.10.1:7789;
      }

      on wdrbd2 {
              device        e   minor 2;
              disk          e;
              meta-disk     f;
              address       10.10.10.2:7789;
    }
}

“minor 2” means the volume index number (the C: volume is minor 0, the D: volume is minor 1, and E: is minor 2).

– Partition the hard drive for DRBD use. In my case I have a dedicated 40GB disk to be used for DRBD replication. I will use Disk Management to partition/format the hard drive.
I will need 2 partitions: the 1st partition will be the data partition (device e above) and the 2nd will be the meta-data partition (device f above). So let’s create partition 1 and format it as NTFS. The size of this partition (E:) in my case will be 39.68GB. The rest of the free space, 200MB, will be dedicated to the meta-data partition (F:). To calculate the size of the meta-data properly, please refer to the LINBIT DRBD documentation.
The resulting disk layout should have the data partition (E:) formatted as NTFS, while the meta-data partition (F:) has no filesystem at all; it must remain a RAW partition.
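
As a rough rule of thumb from the DRBD documentation (treat the exact figure as an estimate), externally stored meta-data needs about 32 KiB per 1 GiB of replicated data, plus a small fixed overhead. For the ~40GB data partition used here that works out to approximately:

   40 GiB x 32 KiB/GiB ≈ 1.25 MiB of meta-data

so the 200MB meta-data partition leaves plenty of headroom.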

– Once finished with the above on both nodes, open a command prompt (as an Administrator)  and use the following commands to prepare DRBD:

  • drbdadm create-md r0    (on each node)

Initial sync:

  • drbdadm up r0      (on node1)
  • drbdadm up r0      (on node2)
  • drbdadm status r0  (on node1)

You should see something like the following:

C:\Users\wdrbd>drbdadm status r0
  r0 role:Secondary
    disk:Inconsistent
    wdrbd2 role:Secondary
        peer-disk:Inconsistent

Initiate a full sync on node1:

  • drbdadm primary --force r0

After the sync is completed you should get the following:

C:\Users\wdrbd>drbdadm status r0
  r0 role:Primary
    disk:UpToDate
    wdrbd2 role:Secondary
          peer-disk:UpToDate

The disk state on both nodes should be UpToDate. As you can see, the Primary node in this case is node1. This means that node1 is the only node which can access the E: drive to read/write data. Remember that NTFS is not a clustered file system, meaning that it cannot be opened for read/write access concurrently on both nodes. The DRBD configuration in this scenario therefore does not allow dual-Primary mode, in order to avoid corruption of the file system.

Switching the roles:

If you want to make node2 the Primary and node1 the Secondary, you can do so as follows (make sure there are no active read/write sessions on node1, since DRBD will have to force-close them):

  • On node1: drbdadm secondary r0
  • On node2: drbdadm primary r0

After issuing the above commands, you should be able to access the E: drive on node2 this time. Feel free to experiment and don’t forget to report any bugs to the project’s github web page!