Homelab Chronicles 06 – “Hey Google…” “I’m Sorry, Something Went Wrong”

I woke up early today, on a Saturday, to my alarm clock(s) going off. I was planning to go to a St. Patrick’s Day Parade and post-parade party with a friend. After turning off my phone alarm(s), I told my Google Nest Mini to stop the alarm that was blaring.

Unfortunately, it informed me that something went wrong, though the alarm did turn off. Usually when my Google Nest Mini has issues, it’s because the WiFi messed up. So I stumbled out of bed, still half-asleep, to the guest bedroom, where the network “rack” (a small metal bookshelf) and the Unifi AP were. My main 24-port switch had lights blinking. I looked up at the AP high up on the wall and saw the steady ring of blue light, indicating everything was working. OK, so not a WiFi problem, nor a network problem. Probably.

In the hallway, I passed by my Ecobee thermostat to turn the heat up a little and then noticed a “?” mark on the button for local weather. Ah, so I didn’t have Internet. Back in my room, I picked up my phone: 5G, instead of WiFi. On my computer, the Formula 1 livestream of the Bahrain track test, which I fell asleep to, had stopped. And reloading the page simply displayed a “No connection” error. I opened a command prompt and ran ipconfig /all and ping 8.8.8.8. The ping didn’t go anywhere, but I still had a proper internal IP in the subnet. Interesting. Guess the DHCP lease was still good.
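Those checks narrow the fault one layer at a time: is there a local IP, can a public IP be reached, and do names resolve. Here is a minimal Python sketch of that triage logic, assuming the three probe results have already been gathered by hand. The function name and messages are made up for illustration and don't come from any tool.

```python
def diagnose(has_local_ip: bool, can_reach_public_ip: bool, can_resolve_names: bool) -> str:
    """Map three probe results to the most likely fault domain."""
    if not has_local_ip:
        return "local network: no usable IP (check DHCP, switch, or WiFi)"
    if not can_reach_public_ip:
        return "upstream: LAN looks fine but traffic is not leaving (check router or ISP)"
    if not can_resolve_names:
        return "DNS: Internet is reachable by IP but names do not resolve"
    return "ok"


# The situation in this post: valid internal IP, but ping 8.8.8.8 failed.
print(diagnose(has_local_ip=True, can_reach_public_ip=False, can_resolve_names=False))
```

With a valid internal IP and a dead ping to 8.8.8.8, the sketch points upstream, which matches where the problem turned out to be.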

Only one last place to check: the living room where the Google Fiber Jack and my Unifi Security Gateway router were. Maybe there was a Fiber outage. Or maybe my cat had accidentally knocked the AC adapter off messing around in places he shouldn’t. Sunlight was streaming in from the balcony sliding door, making it hard to see the LED on the Jack. I covered the LED on the Fiber Jack with my hands as best as I could: it was blue. Which meant this wasn’t an outage. Uh oh. Only one other thing it could be.

Next to the Fiber Jack, surrounding my TV, I have some shelving with knickknacks and little bits of artwork. Hidden behind one art piece is my USG and an 8-port switch. I removed the art to see the devices. The switch was blinking normally. But on the USG, the console light was blinking at regular intervals, while the WAN and LAN lights were out. Oh no, please don’t tell me the “magic smoke” escaped from the USG.

On closer inspection, it looked like the USG was trying to boot up repeatedly. It was even making a weird sound like a little yelp in time with the console LED going on and off. So I traced the power cable to the power strip and unplugged it, waited 15 seconds, and plugged it in again. Same thing happened. I really didn’t want to have to buy a new USG; they’re not terribly expensive, but they’re not inexpensive, either.

I tried plugging it into a different outlet on the power strip, but it kept quickly boot-looping. I then brought it to a different room and plugged it into a power outlet; no change. Great.

But then I noticed that there was a little green LED on the power brick. And it was flashing at the same frequency as the USG’s console light when plugged in. Hmm, maybe the power adapter went bad. I could deal with that, provided I had a spare lying around.

The Unifi power brick said “12V, 1 amp” for the output. So I started looking around. On my rack, I had an external HDD that was cold. I looked at its AC adapter and saw “12V, 2 amps.” That was promising, but could I use a 2 amp power supply on a device that only wants 1 amp? I looked online, via my phone, and the Internet said, “Yes.” Perfect.
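The rule of thumb behind that “Yes” is that the voltage has to match, while the adapter’s current rating only needs to be at least what the device draws. A small Python sketch of the check follows, using the two ratings from above. The class and function names are hypothetical, and the sketch deliberately ignores connector size and polarity, which also have to match in real life.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DcRating:
    volts: float
    amps: float


def adapter_can_power(adapter: DcRating, device: DcRating) -> bool:
    """Voltage must match; the adapter must be rated for at least the device's current."""
    return adapter.volts == device.volts and adapter.amps >= device.amps


usg = DcRating(volts=12.0, amps=1.0)          # label on the Unifi power brick
hdd_adapter = DcRating(volts=12.0, amps=2.0)  # label on the external HDD adapter

print(adapter_can_power(hdd_adapter, usg))  # True
```

Going the other way (a 1 amp brick on a device that wants 2 amps) fails the check, since the adapter would be asked for more current than it is rated to supply.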

I swapped the AC adapter on the USG. The little barrel connector that goes into the USG seemed to fit, if just a smidge loose. Then I plugged it back into the wall.

It turned on and stayed on! Ha!

I brought it back to the shelf and reconnected everything. It took about 5 minutes for it to fully boot up. Afterwards, I went back to my computer and waited for an Internet connection to come back, and it did.

All in all, it was a 15-20 minute troubleshooting adventure. Not what I preferred to do straight out of bed on a Saturday morning, but it got fixed. I already ordered a new AC adapter from Amazon that should arrive in a few days.

Afterwards, I got ready and went to the parade. A bit nippy at about 25°F (about -4°C), but at least it was bright and sunny with barely any wind. I went to the party and had a couple beers. It definitely made up for the morning IT sesh.

Homelab Chronicles 05: Dead Disk Drive

Sometime around the new year, one of the drives in my server appeared to have died. I had some issues with it in the past, but usually unseating and reseating it seemed to fix whatever problems it was presenting. But not this time.

My server has seven 500GB HDDs, set up in a RAID 5 configuration. It gives me about 2.7TB of storage space. These are just consumer-level WD Blue 7200rpm drives. It’s a Homelab that’s mainly experimental; I’m not into spending big money on it. Not yet, anyway.

While I’ve since heard that RAID 5 isn’t great, I’m OK with this since this is just a Homelab. Anyway, in RAID5, one drive can die and the array will still function. Which is exactly what happened here.
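For the curious, the capacity math works out like this: RAID 5 spends one drive’s worth of space on parity, so usable space is (number of drives - 1) × drive size. A quick Python sketch, assuming the “about 2.7TB” figure is the controller or OS reporting in binary units (that part is a guess):

```python
GB = 10**9    # decimal gigabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, as many tools report "TB"


def raid5_usable_bytes(drive_count: int, drive_bytes: int) -> int:
    """Usable capacity of a RAID 5 array: one drive's worth of space goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_bytes


usable = raid5_usable_bytes(7, 500 * GB)
print(f"{usable / GB:.0f} GB = {usable / TIB:.2f} TiB")
```

Seven 500GB drives give 3000 GB of usable space, which is roughly 2.73 when expressed in binary units.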

However, I began tempting fate by not immediately swapping the failed drive. I didn’t have any spares at home, but more importantly, I was being cheap. So I let it run in a degraded state for a month or two. This was very dangerous as I don’t back up the VMs or ESXi. I only back up my main Windows Server instance via Windows Server Backup to an external HDD. Even then, I’ve committed the common cardinal sin of backups: I’ve yet to test a single WS backup. So using something like Veeam is probably worth looking into for backing up full VMs. And of course testing my Windows Server full bare-metal backups.

Luckily, the fates were on my side and no other drive failures were reported. I finally got around to replacing the drive about a month ago. Got my hands on a similar WD Blue 500GB drive; a used one at that. It was pretty straightforward. I swapped the drives, went into the RAID configuration in the system BIOS, designated the drive as part of the array, and then had it rebuild. I think it took at least 10hrs.

While it was rebuilding, everything else was down. All VMs were down, because ESXi was down. Thought it best to rebuild while nothing else was happening. Who knows how long it would’ve taken otherwise and if I’d run into other issues. I wonder how this is done on real-life production servers.

Afterwards, the RAID controller reported that everything was in tip-top shape.

But of course, I wanted more. More storage, that is. I ended up getting two of the 500GB WD Blue HDDs: one for the replacement and the other as an additional disk to the array.

Unfortunately, Dell does not make it easy to add additional drives to an existing array. I couldn’t do it directly on the RAID controller (pressing Ctrl+R during boot), nor in Dell’s GUI-based BIOS or Lifecycle Controller. iDRAC didn’t allow it either.

Looking around online, it seemed that the only way to do it would be via something called OpenManage, some kind of remote system controller from Dell. But I couldn’t get it to work no matter what I did. The instructions were poor at every step: what I needed to install, how to install it once I figured out what to install, and how to actually use it once it was installed. Thanks, Dell.

In the end, after spending at least a few hours researching and experimenting, it didn’t seem worth it for 500GB more of storage space. I did add the 8th drive in, but as a hot spare. I may even take it out and use it as a cold spare.

But yeah, I can now say that I’ve dealt with a failed drive in a RAID configuration. Hopefully it never goes further than that.

Homelab Chronicles 04 – Power Outages and Conditional Forwarding on USG

I’m typing this from my new digs. “New” is relative; I’ve been here three months already, yet still living out of boxes to an extent. Though all the important stuff is up and running like my computers, the network, the TV, and my bed.

My network diagram needs to be re-done, as I’ve had to move switches and routers around to make the physical infrastructure work for this apartment. The Google Fiber jack is in the living room, but some computers and network equipment are in bedrooms. Logically speaking, however, the network is still the same.

Main difference is that I have more lengths of cable running along the carpet than I did before, so I’ve had to secure the Ethernet cables to the baseboards so that my cat and I don’t trip over them. It actually looks pretty good!

Cat 5e cable cleanly secured to baseboard

As part of getting a new place, I did some additional home automation upgrades. My electric company was offering free Smart Thermostats, so I took advantage. I also replaced and added additional TP-Link Kasa Smart Plugs to control lamps around my apartment.

However, a peculiar situation arose when the power went out briefly a couple times from a bad storm. After everything came back on and online, the smart plugs stopped working properly. Only a hard power cycle—literally unplugging and re-plugging in the smart plug—seemed to fix it.

I won’t go into the whole ordeal, but after asking around on reddit, someone suggested the solution possibly lay with DNS. Because of course it’s always DNS.

DC01, a VM on my server, is the primary DNS on the network. When the power goes out, DNS becomes unavailable. Everything loses power, of course. However, everything else comes back online faster, including my router, the AP, switches, computers, and the smart plugs. The server, on the other hand, takes several minutes to boot RAID, boot ESXi, and finally boot Windows Server and make the DNS available.

I’m assuming that when the network goes down, the computers maintain their DHCP lease information, including DNS settings. However, that didn’t seem to be the case with the smart plugs. They may keep their dynamic IPs, but DNS settings do not appear to stay. Not entirely sure what goes on.

So this was a perfect opportunity to attempt Conditional Forwarding on my Unifi Security Gateway. Conditional forwarding, as the name suggests, allows for DNS requests to go to specific DNS servers depending on the request itself.

Why will this fix my problem? Because I have an AD domain on the network, which requires DNS. Some computers are on the domain, while other computers aren’t, along with all the IoT devices. But all use the same internal DNS servers, with the DNS settings being handed out via DHCP from the router.

I found some resources on how to do this and it’s relatively easy. I won’t go into the how-to, but I’ll share the guides:

The short of it is that I had to create a new JSON file called config.gateway.json with the following settings:

{
    "service": {
        "dns": {
            "forwarding": {
                "options": [
                    "server=/home.jcphoenix.com/192.168.32.252",
                    "server=/home.jcphoenix.com/192.168.32.242",
                    "server=8.8.8.8",
                    "server=9.9.9.9"
                ]
            }
        }
    }
}

The first two option lines send any request for the AD domain (home.jcphoenix.com) to the internal DNS servers. The last two denote that any other request should go to Google Public DNS or Quad9, another public DNS.
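To make the matching rule concrete, here is a rough Python sketch of how dnsmasq-style server= options pick an upstream: a query for the listed domain (or anything under it) goes to the servers tied to that domain, and everything else falls through to the plain entries. This is a simplified model for illustration, not the USG’s actual code, and the function name is made up.

```python
def pick_upstreams(query_name: str, options: list[str]) -> list[str]:
    """Return the upstream servers a dnsmasq-style option list would use for query_name."""
    name = query_name.rstrip(".").lower()
    best_len = -1
    matched: list[str] = []
    default: list[str] = []
    for opt in options:
        value = opt.removeprefix("server=")
        if value.startswith("/"):
            # Domain-specific form: /domain/ip
            _, domain, ip = value.split("/", 2)
            domain = domain.lower()
            if name == domain or name.endswith("." + domain):
                if len(domain) > best_len:
                    # A more specific domain wins over a less specific one.
                    best_len, matched = len(domain), [ip]
                elif len(domain) == best_len:
                    matched.append(ip)
        else:
            # Plain form: a general-purpose upstream.
            default.append(value)
    return matched or default


options = [
    "server=/home.jcphoenix.com/192.168.32.252",
    "server=/home.jcphoenix.com/192.168.32.242",
    "server=8.8.8.8",
    "server=9.9.9.9",
]

print(pick_upstreams("dc01.home.jcphoenix.com", options))  # internal DNS servers
print(pick_upstreams("jcphoenix.com", options))            # public DNS servers
```

Note that jcphoenix.com on its own does not match home.jcphoenix.com, so it goes to the public resolvers.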

After placing it on the Controller (“UniFi – Where is <unifi_base>? – Ubiquiti Support and Help Center”), and setting the “DHCP Name Server” in the Controller to “Auto,” I restarted the USG and tested it out.

Powershell Results of NSLookup

As you can see, when host name GRSRVDC01 was queried with nslookup, the result came back from the internal DNS server. Same with the FQDN of the AD domain. But when JCPhoenix.com—this website—was queried, it went outbound.

So mission success!

There were a couple other ways I could have fixed this. Buying a UPS for the server was probably the easiest. Which I still need to do. I also could have manually set DNS on domain computers, while letting the USG give out public DNS settings to the rest of the devices. But neither would have been as fun and also free.

I also don’t like using static network settings, aside from a device IP. Since this is an experimental homelab, some computers that are on the domain today might not be tomorrow and vice versa. I want systems to automatically receive necessary settings based on new or changing conditions or attributes.

The last thing I’ll mention is regarding uploading the config.gateway.json file. I host my Unifi Controller on an Ubuntu VM. So instead of using SSH to get in and upload the file, I simply dragged and dropped the file in to the correct folder. Unfortunately, finding the folder proved tougher than expected. Because the folder didn’t exist.

The trick to get the folder created was to go into the Controller UI, and upload an image of a floorplan. In the old UI, the path is:

Map > Floorplan (Topology dropdown top left) > Add New Floorplan > Choose Floorplan Image.

Any image will work, since the goal is simply to get the folder created. After that, the floorplan can be deleted, if desired.
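Since config.gateway.json is edited by hand, a quick parse check before copying it into that folder is cheap insurance against a stray comma. Below is a minimal Python sketch. The helper name is hypothetical, and the shape check only covers the DNS forwarding section used in this post, not everything the file can hold.

```python
import json


def check_gateway_json(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the text looks OK."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    try:
        options = data["service"]["dns"]["forwarding"]["options"]
    except (KeyError, TypeError):
        return ["missing service.dns.forwarding.options"]
    if not isinstance(options, list):
        return ["options is not a list"]
    # Every entry in this post's file is a dnsmasq-style "server=" line.
    return [
        f"unexpected option: {opt!r}"
        for opt in options
        if not (isinstance(opt, str) and opt.startswith("server="))
    ]


sample = '{"service": {"dns": {"forwarding": {"options": ["server=8.8.8.8"]}}}}'
print(check_gateway_json(sample))  # []
```

Reading the real file with open("config.gateway.json").read() and passing the text in works the same way.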

That’s it for this round. I’m thinking that my next project will be to set up a VPN server to allow me to remote-in to the network when I’m away. Though we’ll see if I have the motivation anytime soon!

Homelab Chronicles 03 – I Need a UPS ASAP

The power went out recently in my neighborhood. Neighboring buildings were completely dark, as was mine. I was cooking dinner at the time, so not only was I hungry, but I was also in the dark.

And so was the server. Now I don’t host any crucial services on there. It’s a Homelab; it’s just for funsies. But I still need to get an uninterruptible power supply (UPS), at least to allow for graceful shutdown when these rare outages happen. Twice the power tried to come on minutes after the outage. That means power went out three times; two of those times, the server got power for just a moment before turning off again, since I have the machine set to automatically start after power failure. I don’t know what that does to a machine, but it can’t be good. Especially an old boy like mine.

That said, I don’t expect I’ll get a long-lasting UPS. The outage was long: 45 minutes. There’s no way I could keep a server going for that long on a UPS. At least one that I could afford. Plus, it’d be worthless to do so since everything else was unpowered: my computers, the router and switches, the fiber jack, etc. So I only need something that can last 10-15min. It’d also be nice if the UPS had some way to trigger a shutdown of ESXi, but that might be asking too much.
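For a ballpark on sizing, the battery energy needed is roughly load × time, padded for inverter losses. A back-of-the-envelope Python sketch follows. The 200 W load and 85% efficiency are placeholder assumptions, not measurements from this server, and real UPS runtime is not linear with load, so the manufacturer’s runtime chart is the better source.

```python
def required_battery_wh(load_watts: float, minutes: float, inverter_efficiency: float = 0.85) -> float:
    """Rough battery energy (watt-hours) needed to carry a load for a given time."""
    if not 0 < inverter_efficiency <= 1:
        raise ValueError("inverter_efficiency must be in (0, 1]")
    return load_watts * (minutes / 60) / inverter_efficiency


# Hypothetical numbers: a 200 W server load held up for 15 minutes.
print(f"{required_battery_wh(200, 15):.0f} Wh")
```

Even with generous padding, a 10-15 minute target is a far smaller battery than one meant to ride out the full 45 minutes.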

I’ve researched this before, but I think I’ll get back on it. Maybe even a refurbished one is good enough.

On a side note, this will lead to my next task: setting up those Conditional DNS Forwarders I mentioned in my previous post. When the power did come back on, the router and Internet fiber jack came on quickly. But since DNS is on the server, and the server takes like 10 minutes total for the hardware to boot, then ESXi, then Windows Server, I didn’t have Internet during that time. First World Problem at home, sure, but in a business environment, that could be pretty annoying, especially if the issue is a server being down, while everything else is up.


Yes, that was my view above during the outage. Yes, those buildings had power, while I had none. I guess I live on the edge of a neighborhood grid. The buildings to the side and “behind” me had no power, while those in “front” of me did.

Honestly, it was kind of nice to sit in the darkness for 45min. I had my phone, so it wasn’t terrible. But I was still hungry.

Homelab Chronicles 02 – Admin Giveth and Taketh Away…the Domain Controller

One of my plans at work is to properly remove an older physical server from the network. This server once functioned as the primary – and only – domain controller, DNS, fileserver, print server, VPN server, Exchange server, etc. It was replaced in 2018, but was never really offlined. It existed in limbo; sometimes on, sometimes off. During the pandemic, my “successor/predecessor” turned it back on so staff could VPN in to the office from home.

Long story short, it’s time to take it down. To start, I want to remove its DC role. But I’ve never done that before. I’ve added DCs, but never taken one out of the network. So that’s why I did this.

I started by creating a new Win2016 VM in ESXi. This would be my third Windows Server instance, and I named it appropriately: DC03.

I set a static IP and added the domain controller role to it via Server Manager. The installation went off without a hitch, so I completed the post-installation wizard and added it as a third domain controller. Again, no issues. In a command prompt, I used the command repadmin /replsummary to verify that links to the other two DCs were up and that replication was occurring. After that, I checked that DNS settings had replicated. All DNS entries were present, including the DNS Forwarders.

Wait, what?


In a moment of serendipity, a couple weeks prior I had created an impromptu experiment. I added DNS forwarders to DC01 after DC02 was added as a DC. I had seen guides and best practices saying that DNS settings, whether coming from a router via DHCP or statically put on a workstation, shouldn’t mix internal and external servers. So DNS1 shouldn’t be an internal DNS server, while DNS2 points to a public DNS like Google’s 8.8.8.8. So that’s how I found out about DNS forwarders in Windows DNS Manager.

I expected the DNS forwarders to eventually replicate from DC01 to DC02, but they never did, even after multiple forced replications. At the time, I didn’t understand why that was the case. In the end, I manually added the forwarders to DC02.

And then a few days after that, I added another forwarder on DC01, but not to DC02. And of course, that last entry didn’t replicate, leaving a discrepancy.

Apparently, DNS forwarders are local only and they don’t replicate. Conditional forwarders will, but not full-on external forwarders. This has something to do with the fact that DCs in the real world may be in different geographical locations, with different ISPs, that require the use of separate external DNS forwarders at each location.
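Because forwarders have to be kept in sync by hand, a tiny drift check can flag when one DC is missing an entry that another has. A minimal Python sketch, assuming the forwarder lists have already been collected from each server (for example with the DnsServer PowerShell module’s Get-DnsServerForwarder). The addresses below are placeholders, not the actual forwarders from this lab.

```python
def forwarder_drift(forwarders_by_dc: dict[str, list[str]]) -> dict[str, set[str]]:
    """For each DC, list forwarders that exist on some other DC but are missing here."""
    everything = set().union(*forwarders_by_dc.values())
    drift = {}
    for dc, forwarders in forwarders_by_dc.items():
        missing = everything - set(forwarders)
        if missing:
            drift[dc] = missing
    return drift


# Placeholder data: DC01 got a third forwarder that was never added to DC02.
print(forwarder_drift({
    "DC01": ["8.8.8.8", "9.9.9.9", "1.1.1.1"],
    "DC02": ["8.8.8.8", "9.9.9.9"],
}))  # {'DC02': {'1.1.1.1'}}
```

An empty result means every DC carries the same set of forwarders.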

So imagine my surprise when DC03 automatically had the DNS forwarders that I had placed on DC01. But I quickly stumbled upon the answer:

By [adding DNS roles], the server automatically pulled the forwarders’ list from the original DNS servers, and it placed these settings in the new DNS server role. This behavior is by default and cannot be changed.

Self-Replicating DNS Forwarders Problems in Windows Server 2008/2012 | Petri IT Knowledgebase

That’s why DC03 had the DNS forwarders. When a new DC is added that has a DNS role, it will do a one-time pull from the other DNS server; in this case, my “main” DC. But after that, DC03’s forwarders will forever be local.

Case closed!


With the new DC03 in place, with its proper roles, I left it for 24hrs. Just to see if anything weird would happen.

And wouldn’t you know it, nothing weird happened. Sweet!

I ran nslookup on a few different computers on my network, including domain- and non-domain joined ones.

It looked like that on all the computers. All three DCs/DNSs were present.

After confirming that everything was OK, I started removing the newest DC from the environment. I attempted to remove the role via Server Manager, but was prompted to run dcpromo.exe first. Since it wasn’t the last DC, I made sure not to check the box asking if it was the last DC in the domain. Once again, everything went smoothly.

To confirm that DC03 was no longer an actual DC, I did another nslookup on various computers. The IP address of DC03 was no longer showing. In addition, I checked DNS Manager on DC01 (and DC02) and saw that DC03 was no longer a nameserver. Though a static host (A) record was still present, as was a PTR in the reverse lookup zone; both expected results. I left the AD role on the server, but I could completely remove it if I wanted.

Pretty simple and straightforward.

This gave me the confidence to do this at work. Consequently, I removed the DC role from the old server last week with no issues whatsoever. No one even knows it happened. Which is all a sysadmin can ask for!

Homelab Chronicles 01 – The Beginning, Sorta

So this is a new thing I want to try. It’s been over a year since I’ve posted, so why not?

Over the last 12-18mo, I’ve had the opportunity to set up a Homelab. I worked at an MSP for almost a year and a half and got a bunch of old client equipment, including a couple Dell servers.

My lab isn’t really segregated from the main network, but that’s because of what I’m trying to do; I’ll explain soon. But before I get to that, here’s the main gear I’ve been playing with:

  • Ubiquiti Unifi Security Gateway (USG)
  • Cisco SG200-26 Managed Switch (24 port)
  • Ubiquiti U6-Lite AP
  • TP-Link TL-SG108E Managed Switch (8 port)
  • Dell PowerEdge T620

I also have a bunch of other gear, like a Dell PowerEdge R610 and another 16- or 24-port switch that are sitting around collecting dust. At one point, however, I was playing with Unraid on the R610. Also had a desktop PC that had pfSense or OPNsense functioning as my router/firewall, before getting the USG. I don’t know enough about firewalls to really use those though.

Anyway, here’s a crappy diagram of the network.

Things in red are the main devices. Not all devices are shown; I think I have like 10 physical computers, though not all are used regularly. And there are a bunch of other WiFi and IoT devices. I included some of the extra devices like the PS4 and iPhone so it doesn’t look like I just have these extra network switches for no reason. I live in a 2-Bdr, 900 sq. ft. apartment, but the extra switches are so I don’t have 3+ cables running to a room that I’m tripping over (thank god for gaffing tape).

Initially, I was going to have a separate lab subnet and VLAN. And I started it that way. But I’m one of those that if I don’t have a real “goal,” it’s hard for me to just play around with things. I need an actual project to work on. It wasn’t enough to have a separate, clean sandbox. I wanted the sandbox that already had all the toys in it! So I’ve already redone the network environment once.

In the end, I decided that I’d create a Windows Active Directory Domain environment for home. I want to have a domain account that I use across my computers. Ideally, I’d have folder redirection, offline folders, and maybe even roaming profiles, so that any computer I use will have my files. The server(s) will also function as a fileserver, with network shares shared out to accounts via Group Policy.

On the network side, some of my goals are:

  • Stand up a VPN service, probably using WireGuard
  • Create a management VLAN and another for everything else
  • Set up conditional DNS forwarding
  • Replace the switches with Ubiquiti gear to really take advantage of the Unifi software

I could go on, but what I’m trying to emulate at home is a small business environment, from the bottom to the top, from the router all the way to the workstation. I work for a small biz, so this is the perfect place for me to mess around with and screw things up before I try on my employer’s live environment.

All in all, this is a great learning experience and I’m excited to share what I’m doing. Maybe this will help others who are trying to build their own Homelabs.

I know I’ll be screwing things up along the way – and I can’t wait to do so!