Scaling Done Right

Before I undocked Shion, I had a desktop full of stuff, roughly in the form of three windows across the top of the display, two across the bottom, with a mini-player of music in the corner.

While I was undocked, I was mostly doing other things.

Coming upstairs and re-docking, I’m pleasantly happy that everything is effectively where I left it before transitioning from a 32″/2160p screen to my laptop’s 13.6″/1664p screen.

If anything, this is one of the reasons I’ve come to prefer one big-ass monitor over two proper sized monitors, and have come to appreciate macOS’s scaling methods being sane. That is to say, it’s not like going a dinner platter to a postage stamp, so much as roomy to cozy and back again. A far cry, from for example, eons ago having the problems of shifting a PC between 600p, 768p, and 1200p screens causing tons of ruckus and disorder.

Laptops > Desktops

Working on a screen full of files, listening to music, yada, yada when I remember I should’ve started cooking half an hour previously, I’m reminded of one of the reasons I prefer laptops for things that aren’t rack mount friendly.

The big juicy monitor ™ provides a nice 32″ workspace, the thunderbolt docking station nets me my keyboard/mouse/etc and a gigabit link straight to my file server within two hops. Heck, I even like the speakers 😀.

Yet just the same, it’s rather convienant to be able to undock, grab my laptop and bring it downstairs. Since the speaker’s and battery life aren’t shit, it’s an easy matter to have my music continue in the background while I’m cooking, and then pick up where I left off for a bit while I’m waiting for the oven to finish it’s share of cooking duty.

Increasingly, I’m inclined to believe that owning a desktop will fade by the wade side. The big honking GPU is the key reason that I still own one, since the need for expansion cards and reconfigurable internal drives have become less necessary as more compact form factors have become more capable and external connectivity have become faster. Plus, despite my early interest in diskless virtualized workstations and remote desktops, nothing really beats a good client machine for doing client machine tasks just like nothing really beats a terminal for doing terminal oriented tasks.

Maybe I’m just getting old 😛

Apparently, one of the reasons Steam Deck’s underlaying technology owes to Nier Automata if the itnerviewlets at Proton and Tier: Automata – the unique story behind what makes Steam Deck tick, are to be believed. Which really doesn’t surprise me.

Steam Deck’s graphics and battery life in my opinion aren’t as impressive as achieving them in such a small, portable package. You get roughly Xbox One grade graphics from roughly Xbox One grade hardware, and x86 will never offer great battery life under heavy load. But it’s got one thing I love most of all.

Video games work on it. There’s a fair bit of video games on Steam that actually have a native Linux version, and unlike the support for macOS, it’s not quite a joke. But the vast majority of games are Direct3D based games for Windows that require DirectX. That’s how video games are written in this world.

Yet, Steam Deck runs them well as the hardware is capable. In ways that I was never able to achieve back in the day, now more than a decade in the past, using purely Wine and derivative solutions. So I find myself very glad now that folks made a video game with 2B and 9S 🙂

Actually, that reminds me: I’ve been debating picking up a copy of the game on Steam one of these sales. Haven’t played it since I was active on console, and I haven’t even bothered to hook up Deathstar One since moving thanks to getting Rimuru operational and Steam Deck largely taking over for both the ol’ Steam Link and Deathstar One.

TLS all the things

Passing thought, if I’m willing to go through the bollocks of setting up a bunch of name servers and probably rolling a DHCP host or two, I should investigate how possible it would be to run an ACME based setup on a private network; Ala auto renew your own self signed certificates.

Yes, yes, I know I’m a pain in the ass 😝

Home network version 7

Recently, I decommissioned my ASUS RT-AC68, A.K.A. the single best piece of networking equipment that I have ever owned (and at this point, the chief contender is a null modem cable). <S> for your decade of service little guy!

Based on the ASUS’s iPhone 4 grade horsepower and almost fast enough speed of Wi-Fi 5 having lasted so long, the key choice in replacing things had two major criteria:

  1. Wi-Fi 7 (802.11be) support.
  2. Mesh network.

The first criteria is driven by the fact that if I’m doing it now as Wi-Fi 7 goes gold or within a few years as more equipment becomes available, it makes no sense to select outgoing Wi-Fi 6[E] equipment — when I plan on using this gear until the tires fall off. Literally, my only complaint about the old guy was the Wi-Fi reception between hosts situated on the exact opposite end of the house from where the modem is located. Yup. Shion, Rimuru, Zeta, Deck, etc are in the same room and all have wireless connectivity but since the access point is at the furtherest distance possible that means local communication suxor. I.e., downloading from Steam == awesome because ASUS AC68, mother fucker; but I/O between Zeta and client machines != so great.

Which generated the second criteria. My segregated IPv6 network on the desk works pretty well, not that Zeta was intended to be functioning as a router among its myriad of other tasks. This makes for some things that are just inconvenient like running service VMs attached to Ethernet, and then wanting to access them in other rooms over wireless. Not to mention dealing with the split domain problem.

Enter the successor: Eero Max 7! In addition to having the chutzpah to compete with the decade old ASUS (the Ferrari of wireless networking back then), each node comes with a pair of 2.5G and 10G Ethernet ports (and a credit card bill😂), which future proofs it in a world where gigabit is becoming too slow. In theory, the system should last until Wi-Fi 7 is the new 802.11g (Wi-Fi 3), or Eero stops working. ASUS was still delivering firmware updates a decade later, which was crazy but appreciated.

Using one mesh node to function as a gateway and pump out signal at that end of the house, and another node situated in my study: this effectively solves the division of networks. My desk’s separate IPv6 network is now demissioned on the software side (Zeta doesn’t mind, lol) and its physical is now a gigabit switch to the local Eeero node. Zeta has a direct connection. But with most of my devices being Wi-Fi 5 on the 5 capable, that should be less of a concern.

So now wireless is pumping out my modem speeds instead of up to 1/3″ when sitting at this edge of the solar system. The old guy could bring the signal like a champion, but Wi-Fi 5 is only when wireless started to deliver “Fast enough” to compete with wired and we’re literally hitting the edge cases 😛.

There is only one real problem with how Eero is a “Basics only” approach to network configuration. See, I’m a DNS kind of guy. I’m not typing 128-bit fucking addresses, and you can take your 32-bit IPv4 addresses and shove them up a post note. Typing IP addresses does not scale when you have more devices than room on a post it note.

Aside from its awesomeness as a wireless router, there is one superb thing that the ASUS AC68 did that made life great. Like many a router, it let you specify a domain for the gateway and defaulted to its own caching DNS resolver. That’s common enough. But it went a step further. The DHCP and DNS was auto stitched together so that modern DHCP clients led to

That is to say, “zeta.home” would just work by setting my server’s hostname to “zeta” and connecting it to DHCP. No need to give a flying fuck about manually configuring a DHCP reservation or even what the IP assignment was, although I used to do reservations for infrastructure as an ‘insurance’ policy.

Then enter Eero where the dealio is: “We don’t care if you give us money, that’s not our problem!”

Which means in solving my routing troubles, I went from the annoyance of wishing I could maintain separate A / AAAA records to “What the fuck, is this the darkages?” which is not a problem that I appreciate, but it is a problem that I can solve easier than running an Ethernet drop across the attic space.

Phase one of this solution was to create a new virtual machine on Zeta, taking advantage of the fact that part of replacing Cream was wanting a system that could function as a VM or container farm. Easy peasy, lemon squeezy it’s an authoritive DNS server with control over “home.arpa.” and functioning reverse DNS because I’m a pain in the ass who doesnt like to do things by halves. That wasn’t so bad, give or take having to remember the fun that is editing zone files. I’ve used .home for a long time now, but figured that I may as well migrate to the .home.arpa convention that replaced it if I’m doing all this crud.

Phase two of this solution was to create a second instance. See, most clients will just send their queries to the first DNS and use the second as a failback. People often think, “Hey, I’ll just put my zones on this and let the other DNS do the other stuff,” and then wonder why nothing works the moment a client starts sending queries to another resolver. There’s also reasons why DNS always comes in at least pairs! Hell, the Internet was made to take a nukin’ and keep on truckin’ so survivability is a thing.

But obviously it’s a bad idea to configure another VM on the same server, or Zeta itself as the second DNS. Plus from a paranoid perspective, it would be kind of nice to put the “Ahh, I’m working on Zeta” safe guard across the building where the gateway is. Thus phase two takes a Raspberry Pi Zero W that’s been waiting on me to solder an RaSCSI for, and turns the machine into a secondary DNS server for home.arpa. For extra abuse the primary and secondary nameservers run on different operating systems with one being run natively on the Raspberry Pi’s OS and the other being a virtualized Red Hat instance on the central server.

Then enter phase 3! Being able to resolve external DNS (e.g., Google), zone transfers, reconfiguring the Eero for custom DNS, being happy not to have misstyped the IPv6 addresses, security wrangling, and testing fail over scenarios. Not to mention documenting the key details in my notes system being I’m that kind of pain in the ass.

The next sticking point however is where the magic happens. See, it’s not rocket science to have a DHCP and DNS server cooperating for the hostname -> hostname.domainname magic. If you’re using dhcpd and bind, the harder trick is knowing that you can actually do that.

But as far as I can tell, Eero’s software doesn’t support a separate DHCP server without running in bridge mode, and that would bring my AC68 out of retirement, so for right now I have zone files configured for key systems only. Phase 4 will likely be to address that after the system gets more stress testing.

It seems that Eeero uses DHCP for IPv4 and IPv6 clients are left to SLAAC, which is great IMHO. I’m all for that because between Stateless Address Auto Configuration and Neighbor Discovery features, you can pretty much just say fuck it and IPv6 hosts will do the right things unless your network is stupid(tm). Unlike an IT department, most of us don’t need to log every single precipice of our network’s activity and aren’t paranoid enough to want to do that at home.

Possible solutions may be to configurize DHCPv6 and ignore IPv4, or see if the good old respect meh authority trick would get the Eero to delegating DHCP to a dedicated server under my control without having to wrestle with the Eero trying to run its own dhcpd, or getting creative with firewalling.

Other than an as an alternative to Names I don’t really care about IPv4 locally anymore, since the things I have that require IPv4 are in the same club of things that know what Apple Talk was–that is to say, equipment so old that it passed old enough to buy beer and entered the old enough to have kids in school vintage of computer hardware.

But in any case, the DHCP portion of things shall be a battle for another time.

CMOS Reminder

Best way to remember that Stark’s CMOS battery needs replacing: plugin, let’em charge, boot up, oh hey BitLocker.

On the upside being anal retentive about such things, it was more of a pain in the arse to input my recovery key and decrypt the system drive than to actually find where I had encrypted that 😁

Normalization ftw

There’s several upsides on standardizing on cables and devices when possible. In my case, that’s been braided (i.e., tangle free) USB-C cables rated for 100W charging when the cables are long and comparable 10 Gbit/s or faster rated cables when they’re short.

One of these upsides is “Ahh, it’ll charge a laptop!” when paring a suitable charger with any of my longer cables. These cables are usually poor on data speed but superb at power delivery, which is often what I want when the desired cable is measured in meters, which is also when I really want tangle free….lol.

Another is knowing that when I grab a smaller cable, it’s going to be good enough to feed I/O devices like a NVMe based SSD or any SATA thing I’ve still got handy. Aptly, most of these short cables either came with NVMe enclosures rated for 10 Gbit/s USB connectivity or are in fact Thunderbolt 3/4 cables rated for both 40 Gbit/s connectivity and 100W charging.

Increasingly, when the cables are short I’m aiming for 40 Gbit/s + 100W unless they’re packaged with something. The downside is that Thunderbolt cables are costly and have limited cable lengths, but generally are sufficient for ‘all the USB things’ once you’ve groaned at the bill. If I find myself buying a short cable these days, I’ll save up for a Thunderbolt for future proofing because more and more of my devices support either Thunderbolt or USB at 40 Gbit/s.

For devices in general, I’ve been swinging for USB-C 10 Gbit/s for a while now. Things like motherboards, drive enclosures and external drives, USB hubs and PCI-E expansion cards are chosen based on this. This choice was made based on the rise of the NVMe external drive, and the fact that such a cable will be no problemo when pared with my older gear that maxes out at USB 3.0 or SATA speeds.

Similarly for chargers, the rare time that I buy a charger, I’ve generally aimed for the 90~100W scenario. In the sense that most of my devices will happily charge from a 45W or 65W charger, and the hungriest ship with a 90W charger.

Is this excessive? Not really. Why? Well, let’s see… my primary machine has 40G ports, my gaming machine has 10G ports and a card with 40G ports. SteamDeck has a 10G port and my file server has an expansion card with 10G ports.

Much like USB-A and MicroUSB-B has become relegated to specialized and rare things around here over the past decade, so has 5 Gbit/s connectivity begun to age out of the herd ;).

Network device names are meaningless

Annoying factoid, the modern naming of Linux network interface cards provides a consistent way of naming the devices. But not a permanent one.

Taking advantage of Rimuru’s old dual port 10G USB-C card that I replaced with a Thunderbolt 4 card, happily works. But as a side effect this means that enp3s0 and wlp4s0 are now enp4s0 and wlp5s0, which as you might expect breaks the networking configuration for both interfaces.

Because why the fuck would you expect devices to retain their topology just because they happened to be soldered to the board? On the upside at least it became obvious what was going on when I inspected the files in /etc/NetworkManager/system-connections and noticed that the digits had changed.

I’m guessing that since Zeta’s lone PCI-E slot is an x16, it ends up numero uno. It just so happens that I have an PCI-E x4 card plugged in with a USB controller instead of a GPU, because the machine’s jobs are all server related. Although, I bet she would make a dandy little gaming box in as much as a 2-slot wide GPU and a SFX PSU can actually handle anything modern and juicy. Especially if Valve was to you know, release StreamOS 3.x for PC instead of Steam Deck only 🙂

lspci -t -v
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
           +-00.2  Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
           +-01.0  Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
           +-01.1-[01]----00.0  ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller
           +-02.0  Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
           +-02.1-[02-05]--+-00.0  Advanced Micro Devices, Inc. [AMD] 500 Series Chipset USB 3.1 XHCI Controller
           |               +-00.1  Advanced Micro Devices, Inc. [AMD] 500 Series Chipset SATA Controller
           |               \-00.2-[03-05]--+-00.0-[04]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           |                               \-01.0-[05]----00.0  Intel Corporation Dual Band Wireless-AC 3168NGW [Stone Peak]
           +-08.0  Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
           +-08.1-[06]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series]
           |            +-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
           |            +-00.2  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
           |            +-00.3  Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
           |            +-00.4  Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
           |            \-00.6  Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
           +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
           +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
           +-18.0  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 0
           +-18.1  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 1
           +-18.2  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 2
           +-18.3  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 3
           +-18.4  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 4
           +-18.5  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 5
           +-18.6  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 6
           \-18.7  Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 7

Backups, backups, backups

Now that Zeta is effectively operational, I’ve turned my master plan to its next stage: not losing data.

Cream’s drives had a very simple arrangement. One drive, designated “Master”, or M:, was the base of all the file shares kept there. A second drive, “Backups”, or Z:, was kept next to it, and a scheduled task would run a robocopy in mirror mode three times a week to sync the drives up. Nice and simple and cheesy, and for bonus points the mirrored drive racked up about half the power on hours over the past few years. For monitoring purposes, the log was saved to one of Cream’s internal drives which was periodically imaged to the host specific backups area on the master drive.

Zeta on the other hand doesn’t have the curse of NT, but I did kind of like this simplicity. It fits my recovery model where having an easily recovered copy of data is desirable, but changes are infrequent enough that rolling up the backups every 2 or 3 days is probably okay. At first, I decided to create one disk as ext4 to function as the backups, because dependable and trusted, while making the other xfs to function as the master, because that’s the default in AlmaLinux 9.

This created one small problem however, in that getting rsync to play nice with the SELinux, POSIX ACLs, and a few extended attributes proved to be a pain in my ass! For SELinux, you can just relabel the drive after. Not something I want to scale up to 8 TB but not too bad for the actual storage use (2 TB) today. But then we’ve got the issue of the POSIX ACLs and extended attributes used on my file share infrastructure.

Turns out that rsync’s --archive flag effectively breaks the flags you would want for synchronizing these, and then leaves you to go fiddle around with permission masks. So, I said fuck that. I was rather disappointed in rsync at that, but let’s face it, acls and xattrs aren’t that popular when 1970s unix permissions are an 80% solution.

After taking suitable backups (one local, one remote) of the critical files, I set about turning to tools that I know how to fuck with. The backup drive was sacrificed to create one disk of a RAID1 Mirror, and since madam allows specifying the drives like missing /dev/sdwhatever or vice versa, it was easy to spawn the array in degraded form. Then sync the data to the array from the master drive, before wiping and adding it to the mirror’s missing slot. About 10 or 11 hours later syncing at max speed, everyone’s all riled up and gone through the reboot test.

How did I migrate the data if rsync was being a bugger, you ask? Well, it’s slow as hell, but cp --archive and tar --acls --selinux --xattrs really does do what you want when you’re Rooty Tooty and want a lossless copy :P.

In the past, I would typically have used LVM2 pools to manage this sort of operation. It’s overly complicated command line administrata, but hey, it works well and it has features I like, such as snapshots and storage pools. The advantage for me of mdadm is that it is very simple to manage thanks to fewer moving parts.

Having been “That guy” at some point in my career who ended up writing the management software my old job used for mdadm software raid in their audio IRDs, and then later extended to custom hardware built ontop of firmware raid, I know how to use mdadm and more importantly, how reliable it is — and how easy it is to recover a mirror without fucking up. Which, you know, is like the number one way your data goes bye, bye when recovering, right next to oh shit the drive died before it was synced. As much as I appreciate LVM2, it’s got enough moving parts that I’m more leery about the failure scenarios. More importantly, I have more experience with mdadm failure and recovery than I do with LVM.

Of course, this does create a new problem and its own solution. Since my backup drive is now in hot sync with the master drive, it is no longer uber idle enough to be considered a ‘backup’. No, it’s redundancy to buy time to replace drives before the entire array goes to the scrap yard.

This doesn’t really change my original recovery scenario: which is “Go buy two drives if one fails,” it just means that there is a higher probability that both drives will actually fail closer together when that happens. What’s the solution to this? Why, my favorite rule of data storage: ALWAYS HAVE A BACKUP! Thus, a third drive will be entering the picture upon which to do periodic backups of the array, and be kept separate and offline when not being refreshed.

In practice though, this will be more like a fourth drive; in the sense of ‘smaller disk, most important data’ and ‘big ass disk, all the data’. My spare archive drives are large enough to easily do the former, and one can basically contain the entire ‘in use’ storage or close to it, but none of my spares for sporadic backup has the capacity to handle the entire array.