Replacing a disk in a BTRFS RAID1 configuration

When your hard drive starts making the clicking sound of death, it’s time to act.

  1. Get a new hard drive.
  2. Run a BTRFS scrub, then use btrfs scrub status -d /home to see which drive had the errors (and to confirm that any errors on the good disk were corrected).
  3. Work out which hard drive is at fault. I had the following information:
    • /dev/mapper/home2 from BTRFS - this one had errors during a scrub, too.
    • ata2 from the kernel message log - the ATA interface kept getting reset.
    • /dev/sdb from GSmartControl - this was the drive with reallocated sectors.
    • To tie these together, I verified with blkid that the UUID for /dev/sdb was the same as the UUID for home2 in /etc/crypttab.
  4. Locate the physical drive. I was able to do this by following SATA cables to the various drives in the system and figuring out the ATA IDs for each SATA port.
  5. Put the new drive into the DVD drive position. (This is probably easier than replacing a drive that has already been physically removed.)
  6. Partition the new drive. Ensure you use the same tools as you did for the old drive, or at least end up with exactly the same partition table. That bit me.
  7. Set up crypto on the new drive.
  8. Open the encrypted partition with a different name (e.g. home3). You can end up using the original name, but not while the original drive's mapping is still open.
  9. Add the new drive to the BTRFS RAID array by replacing the old one.
    • Run btrfs filesystem show /home and find the devid of the failing drive.
    • Run btrfs replace start <devid> /dev/mapper/home3 /home to start the replacement.
    • Run btrfs replace status /home and watch it progress.
  10. While that’s going, write a blog entry and also update /etc/crypttab and /etc/fstab to use the new drive. It's probably worth adding the nofail mount option, so that if you make a mistake the system still boots into a nice environment.
  11. When it’s all done, remove the old drive.
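
The BTRFS side of the steps above can be sketched as a small shell helper. This is a rough sketch, not exactly what I typed at the time: the run wrapper, the DRY_RUN variable and the function names are made up for illustration, and open_new_drive assumes the encryption is LUKS managed with cryptsetup. By default it only prints the commands, so you can check them before running anything as root.

```shell
#!/bin/sh
# Sketch of the BTRFS RAID1 disk replacement steps.
# DRY_RUN, run and the function names are illustrative, not part of btrfs.
# With DRY_RUN=1 (the default) commands are only printed, never executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

check_health() {
    # $1 = mount point.
    # -B keeps the scrub in the foreground until it finishes;
    # -d prints separate statistics for each device.
    run btrfs scrub start -B "$1"
    run btrfs scrub status -d "$1"
}

open_new_drive() {
    # $1 = encrypted partition on the new drive, $2 = temporary mapper name.
    # Assumption: the partition has already been set up as a LUKS volume.
    run cryptsetup open "$1" "$2"
}

replace_disk() {
    # $1 = devid of the failing drive, $2 = new device, $3 = mount point.
    run btrfs filesystem show "$3"
    run btrfs replace start "$1" "$2" "$3"
    run btrfs replace status "$3"
}
```

For example, if btrfs filesystem show reports the failing drive as devid 2 (a made-up value here), then replace_disk 2 /dev/mapper/home3 /home prints the three commands from step 9. Set DRY_RUN=0 once the output looks right.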

Posted Friday, June 18, 2021
