Server Recovery

There are rare occasions when I am glad to be both smarter than the average computer user, and a touch paranoid. Today has proven to be one of those times.

Earlier today, my file server’s RAID enclosure managed to take a ThinkPad to the face, and this led to a great circle of profanity upon the discovery that said server was no longer seeing a disk label. Turns out the impact managed to nudge the mode switch (which of course some arsehole put on the front) and depress the power button long enough to flip the enclosure into combined-disk mode. Of course, switching it back wiped out the RAID metadata and so on.

But because I’m a right pain in the ass myself, and my first reaction to going from a Master / Backup drive pair to RAID 1 redundancy was roughly, “Ahh shit, now I need a third drive for the backups,” I only lost data written since yesterday at about 0102 UTC, when I last backed up the array to my NVMe drive. That largely amounts to re-uploading some recent files to the server’s Music share, rather than losing 100 GB of family photos that don’t go offsite nearly as often.

Being the anal-retentive pain in the ass that I am, I’ve also kept the restore process for the file shares relatively simple: reformat the drive, run the setup script for each share, copy each share’s files back, then verify permissions / access control lists / ownership / contexts / yada, yada. I’m too paranoid not to already know that the procedure works, because how the fuck would I have migrated the data onto the array the first time? 😁
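
For the curious, the per-share loop is roughly this shape. It’s a Python sketch rather than my actual script, and the share names, mount points, and backup path are all made up:

```python
#!/usr/bin/env python3
"""Rough sketch of the per-share restore. Share names, the array mount
point, and the backup location are placeholders, not the real layout."""

import subprocess
from pathlib import Path

SHARES = ["Music", "Photos", "Documents"]   # hypothetical share names
BACKUP_ROOT = Path("/mnt/nvme-backup")      # hypothetical NVMe backup mount
ARRAY_ROOT = Path("/srv/shares")            # hypothetical array mount

def restore_share(name: str) -> None:
    """Copy one share back from the backup, preserving metadata."""
    src = BACKUP_ROOT / name
    dst = ARRAY_ROOT / name
    dst.mkdir(parents=True, exist_ok=True)
    # -aAX keeps permissions, ACLs, and extended attributes intact
    # (security contexts, where used, ride along as xattrs).
    subprocess.run(
        ["rsync", "-aAX", "--numeric-ids", f"{src}/", f"{dst}/"],
        check=True,
    )

def spot_check(name: str) -> None:
    """Print ownership and mode for a handful of restored files."""
    for path in list((ARRAY_ROOT / name).rglob("*"))[:20]:
        st = path.stat()
        print(f"{path}: uid={st.st_uid} gid={st.st_gid} mode={oct(st.st_mode)}")

if __name__ == "__main__":
    for share in SHARES:
        restore_share(share)
        spot_check(share)
```

The real verification pass is obviously more thorough than a twenty-file spot check, but that’s the general shape of it.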

The catch? Well, the virtual machines weren’t backed up, but they were being stored on the array. It’s been on my todo list to work out the best way to back them up automatically. Only one virtual machine actually had any local data of consequence, and that was the authoritative name server for my LAN’s domain. Except I kind of don’t need to worry about that, for three reasons:

  • Name servers two and three are configured so that either can be converted to take over the job with minimal fuss.
  • Their topology was chosen so that if name server one stays down longer than the pair serving my LAN hold their cached records, only resolution of the local domains fails.
  • Name servers one, two, and three are each automatically backed up every night to, you guessed it, the file server! (There’s a rough sketch of that nightly job just after this list.)
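
That nightly job is little more than shipping the zone data over to the file server each night. Something in this spirit, anyway; the zone directory, destination path, and SSH target are placeholders rather than my real setup:

```python
#!/usr/bin/env python3
"""Sketch of a nightly name-server backup: push the zone data to a dated
directory on the file server. Paths and the SSH target are placeholders."""

import datetime
import subprocess
from pathlib import Path

ZONE_DIR = Path("/var/named")                  # hypothetical zone data directory
DEST = "backup@zeta:/srv/shares/Backups/ns1"   # hypothetical file-server target

def backup_zones() -> None:
    """Rsync the zone directory into a per-day snapshot on the file server."""
    stamp = datetime.date.today().isoformat()
    subprocess.run(
        ["rsync", "-a", f"{ZONE_DIR}/", f"{DEST}/{stamp}/"],
        check=True,
    )

if __name__ == "__main__":
    backup_zones()
```

Hung off cron or a systemd timer on each name server, something like that is enough for the file server to hold a fresh copy every night.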

Which means name server one’s sudden demise fits into the “important but not urgent” quadrant of my Eisenhower matrix, and affords cause to revisit how the VMs on Zeta should be managed.

Also, while I’m at it, I’ve repositioned Zeta’s RAID enclosure to make it much harder for anything to hit that fucking power button and mode switch. I might build a proper safety cover just to be extra paranoid, lol.