Ansible-Defined Homelab (0xc45.com)
145 points by 0xC45 on July 25, 2020 | 65 comments



Very cool read. Not least because that setup is very close to what I am doing right now, and the infrastructure I have (DS218+ and NUC represent).

Please help me understand your choices. You went for reproducible VMs on your fourth, custom machine. The three NUCs are compute-only (stateless?) nodes for a k8s cluster.

What kept you from making your custom machine a node as well? That gets you Infrastructure-as-Code too, since your Nextcloud (what I am setting up right now, too) and whatnot are defined in code, using containers.

By the way, since my set-up is a single NUC, I ended up going for docker-compose. That has just plainly worked. Single-node k8s with outside access (I only need HTTP/HTTPS for Nextcloud and the like, with reverse-proxying through subdomains for more apps, like CodiMD) didn't prove to be friendly to me (not my field of expertise). I had tried multiple CNIs, Traefik and Contour as ingress controllers, and MetalLB as a load balancer, but could not get it to work.
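For reference, that kind of single-NUC compose setup boils down to a short file; here's a minimal sketch (image tags, credentials, and the published port are placeholders, and a subdomain-routing reverse proxy would sit in front of it):

    version: "3.8"
    services:
      db:
        image: mariadb:10.5
        restart: unless-stopped
        environment:
          MYSQL_ROOT_PASSWORD: change-me   # placeholder credentials
          MYSQL_DATABASE: nextcloud
          MYSQL_USER: nextcloud
          MYSQL_PASSWORD: change-me
        volumes:
          - db-data:/var/lib/mysql
      nextcloud:
        image: nextcloud:apache
        restart: unless-stopped
        depends_on:
          - db
        environment:
          MYSQL_HOST: db                   # points at the db service above
          MYSQL_DATABASE: nextcloud
          MYSQL_USER: nextcloud
          MYSQL_PASSWORD: change-me
        ports:
          - "8080:80"                      # the reverse proxy targets this port
        volumes:
          - nextcloud-data:/var/www/html
    volumes:
      db-data:
      nextcloud-data: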

And even if it ended up working, it would have to be a stateful k8s cluster, and I imagine that adds a whole layer of problems, too. Do you then go for NFS to the Synology NAS? Seems like the best idea, but that will be much slower than the nodes' local SSDs. Do the SSDs then sit idly, without work to do? Seems like a waste. If the deployments/PVCs/PVs are tainted such that they are bound to specific nodes, you don't have much of a cluster, more like a spread-out docker-compose (which still beats docker-compose, I guess, if it gets running).

These are the questions I couldn't answer, or whose answer drove me to docker-compose.

Since k8s has so much more steam than docker-compose, I imagine in 1-2 years, k8s will be the go-to for single-node homelabs as well. What do you think?


Hey, thanks for all the good ideas.

About the custom "VM-only" machine: currently I'm a little lacking in my "k8s ops" skills. So, until I am more comfortable managing the k8s environment, I have decided to continue using VMs for my "production" workloads. Until then, the K8s cluster will be treated more like an experimental zone, where I can worry less about breaking things.

As for the K8s storage options -- I don't really know! Haven't quite figured that one out. Like I mentioned in the blog post, I haven't really placed any serious workload on the K8s cluster yet. However, I have heard good things about OpenEBS (https://openebs.io/) and am considering that as a potential option in order to make use of the local SSDs on each NUC.

For single-node K8s, there are lots of good options. I haven't used it personally, but I know that MicroK8s (https://microk8s.io/) works well even in a single-node environment. It might be worth checking out.


I'm running around 90% of the things in my homelab in a kubernetes cluster. I love it, but storage has been the biggest challenge. From my experience, OpenEBS and Rook both seem to be good options when you have multiple physical machines with disks on each one. Portworx is also very good, but the licensing costs are way too high.

I'm running my kubernetes nodes as VMs and I didn't want to have to back up the virtual machine disks. I just wanted to back up my FreeNAS server, so I went with NFS.

I wanted to use the NFS client provisioner, as I thought I'd want to be able to add and remove volumes automatically. However, that made redeploying onto a new cluster from scratch hard. Eventually I tore it out and just mounted each NFS volume individually, because it turns out I actually only need a handful of volumes.
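Mounting each volume individually can be as simple as a statically provisioned PV plus a claim that binds to it by name; a sketch, with the server address, export path, and names as placeholders:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: freenas-music            # placeholder volume name
    spec:
      capacity:
        storage: 50Gi
      accessModes: ["ReadWriteMany"]
      persistentVolumeReclaimPolicy: Retain
      storageClassName: ""           # keep it out of dynamic provisioning
      nfs:
        server: 192.168.1.20         # placeholder NAS address
        path: /mnt/tank/music        # placeholder export
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: music
    spec:
      accessModes: ["ReadWriteMany"]
      storageClassName: ""
      volumeName: freenas-music      # bind explicitly to the PV above
      resources:
        requests:
          storage: 50Gi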


Maybe useful information, I'm not sure: but I was at DellEMC world in 2017 and they showed off this really cool k8s storage tech. It was called RexRay and was obviously being demo'd with ScaleIO in mind.

However, I see that it has multiple backends (including Ceph); it might be worth looking into, as it seemed very fault tolerant and fast. :)

https://github.com/rexray/rexray


For simple storage, Rancher Longhorn is also an option.


+1 for Longhorn. I started using it in my homelab as an easy replicated/distributed storage.

It was a lot easier to set up and had more of an "it just works" out-of-the-box experience compared to OpenEBS/Rook/etc.


You can use an NFS provisioner so volumes are set up automatically over NFS, no need to set up taints.

https://github.com/kubernetes-incubator/external-storage/tre...
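With the provisioner deployed, creating a volume is just a claim against its StorageClass; a sketch, where the class name is whatever you configured when deploying the provisioner (hypothetical here):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: nextcloud-data                     # placeholder claim name
    spec:
      accessModes: ["ReadWriteMany"]
      storageClassName: managed-nfs-storage    # hypothetical class backed by the NFS provisioner
      resources:
        requests:
          storage: 20Gi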

MetalLB is dead simple to set up, and you can hook it up with a DNS server with TSIG to get automatic name entries for your ingress hosts.
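For reference, the MetalLB setup of that era really is just one ConfigMap for a layer 2 pool (the address range below is a placeholder on the home LAN; newer MetalLB releases have since moved this configuration to CRDs):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      namespace: metallb-system
      name: config
    data:
      config: |
        address-pools:
          - name: default
            protocol: layer2
            addresses:
              - 192.168.1.240-192.168.1.250   # placeholder range for LoadBalancer IPs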


> k8s will be the go-to for single-node homelabs as well. What do you think?

Not gonna happen because it takes 3 master nodes to have a quorum and 1 separate worker node. The requirements are also pretty high with 2GB of memory per virtual machine. Don't think people have 4 extra machines or 10 GB of free memory to run VMs.


> Not gonna happen because it takes 3 master nodes to have a quorum and 1 separate worker node.

You don't, strictly speaking, need separate worker nodes. You just need to untaint the masters.
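Since the thread is about Ansible anyway, that untainting step can be a one-task sketch like the following; it assumes kubectl is configured on the host running the play and that the cluster still uses the old node-role.kubernetes.io/master taint key:

    - name: Allow regular workloads to schedule on the master(s)
      ansible.builtin.command: kubectl taint nodes --all node-role.kubernetes.io/master-
      register: untaint
      changed_when: "'untainted' in untaint.stdout"
      failed_when: untaint.rc != 0 and 'not found' not in untaint.stderr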

Or you can just run the masters on raspberry pis - under $200 for a properly redundant cluster. (Masters and workers do not have to share a CPU architecture)


Ok but I'm still hung up on the 2GB.

Management processes shouldn't take 2GB. What are they doing in there?


Kubernetes is actually using the memory. Just checked one master node, fresh cluster running nothing, it is sitting at 950 MB used with another 800 MB in buffers. Usage will jump quite a bit higher when running anything.

Very important thing to know about memory: Kubernetes nodes have no swap. Kubernetes will refuse to install if the system has swap enabled, so you have to remove it.

This means nodes better have a safe margin of memory, because there is no swap to use when under memory pressure (things will crash). Hence the minimum requirement of 2 GB.
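(Strictly speaking, it's the kubelet that refuses to start when swap is on. There is an override, shown in the sketch below, but running without the safety margin described above is exactly why it's not recommended.)

    # Kubelet configuration sketch: lets the kubelet start on a host with swap enabled
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    failSwapOn: false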

I tried to run complete clusters with 5+ machines in VMware, with as little memory as possible because I don't have that much RAM on my desktop, and all I got was virtual machines crashing under memory pressure.


You could get older server hardware pretty cheap, and just run everything on a single physical machine, no?

For example, I picked up about 96 GB of DDR3 ECC ram for around 75 euros. A quick check on ebay, and the same amount of DDR4 is selling for at least _twice_ as much. I imagine it's pretty economical to buy this older hardware and just assemble a single beefy server, instead of buying multiple physical machines.

The added benefit is that this older hardware doesn't end up in a landfill, and even though older CPUs generally consume more power than their current gen equivalent, I reckon a single machine would consume about the same, or less, than multiple NUCs (I have no source to back that up though, it's just my assumption).


Server hardware is designed to run in server environments, not home environments. Maybe if you had a basement or something, but the noise on those fans is going to drive you nuts. Off the shelf home desktop hardware is better, but amazingly inconsistent. The NUC platform isn’t a bad way to go if you have the cash.


You're absolutely correct about the noise. Those 1U servers sound like a jet taking off pretty much the entire time they run. However, there's plenty of motherboards that are ATX (or E-ATX), so they fit in regular cases with little to no modifications. (Though, I just keep mine in my attic)

I agree the NUC is a great platform, but if you could spend less cash and get more bang for your buck, and perhaps have the added benefit of having a platform with ECC memory (not sure if the NUC supports ECC, I'm assuming it doesn't), then I think the latter is what most people would go for (or well, at least what I would go for :p).

We're also talking about home _servers_, so it doesn't seem that odd to me to use actual server hardware. The homelab[0] subreddit has a bunch of folks running actual server hardware for example.

[0]: https://old.reddit.com/r/homelab/


Or drive your family nuts, who will take you with them.

I had plans to build a noise isolated data closet in the basement (tied into the furnace air return) but I never ended up with the right sort of basement.


A closet or the garage also works. I ran a couple of Dell 1U's in a closet for years; you could barely hear them through the (full-size, non-sliding) door.


Sure, but desktop hardware is also more likely to fail quicker as they're not designed to be used like servers. So there's a trade off.


I use Dell/HP/Lenovo workstations that are off lease for this very purpose. Dual CPU, large RAM capacity, and quiet.


He wants the fun homelab, not savings. His particular NUCs are ~$600 apiece, and the NAS was a few $k as well. You can definitely get more compute on a single node, but I think you'd be missing the point.


I finished setting up my homelab Kubernetes setup this week. It's running two K3s[0] (a lightweight Kubernetes) "clusters": one single node on an HP MicroServer for media, storage, etc., and one multi-node on Raspberry Pis for IoT. For my homelab I don't need pretty autoscaling, so just a single node is enough. For the Pis I'll probably try a K3s HA cluster, as they can be unreliable when the SD card eventually fails.

Previously I used hand-crafted, Ansible, Salt, Puppet, and Docker setups (that last one died two weeks ago, prompting me to start the K3s setup). But they all ended up becoming snowflakes, making them too much hassle to maintain. What I like best about the K3s setup is that I can just flash K3OS[1] to an SD card with near-zero configuration and just apply the Kubernetes resources (through Terraform in this case, but Helm is also a great timesaver).

I still have to figure out a nice way to do persistent storage. But for now, since it's one node (the IoT cluster has no state), the local-path Persistent Volumes work well enough. Might have a look into Rook.
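A claim against K3s's bundled local-path provisioner looks like the sketch below (name and size are placeholders); the data lands on that node's own disk, which is why it only really suits the single-node case:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: media-config             # placeholder claim name
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: local-path   # K3s's default local-path StorageClass
      resources:
        requests:
          storage: 5Gi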

I will admit it's not trivial to get started with Kubernetes, but since I already needed to study it for work, this provides me with a nice training ground. Alternatively, Hashicorp offers nice solutions in this space as well with Consul and Nomad, but they need a little more assembly, whereas K3s comes with batteries included (built-in Traefik reverse proxy, load balancer, and DNS).

[0] https://github.com/rancher/k3s#what-is-this

[1] https://github.com/rancher/k3os


> Not gonna happen because it takes 3 master nodes to have a quorum and 1 separate worker node.

Can you explain that more? I had a single-node cluster running: the master needs to be untainted so work gets scheduled on it, and then it runs. There is no need for a quorum because there is nothing to decide on.


Assuming one is doing a kubernetes homelab to train for work or future job interviews, they should probably get a full cluster going.

Kubernetes depends on etcd, which requires an odd number of instances. There might be ways to run a one- or two-node cluster for testing (see minikube), but that's not how it's going to run in a company.


You can run a one-node etcd cluster for testing, but it's not recommended for any real usage.

Most companies won't be running their own etcd clusters anyway, except for on-prem clusters. Cloud users will generally use a managed Kubernetes cluster. If you do need to learn how to run etcd, you can learn the basics in about a day. In our org, the etcd portion of the training is less than half an hour long thanks to good documentation.

For a homelab, consider running something like k3s which can use a SQL database instead of etcd.
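For example, K3s can point its server at an external SQL datastore through its config file; the connection string below is a placeholder following K3s's documented MySQL format:

    # /etc/rancher/k3s/config.yaml on the server node (placeholder credentials/host)
    datastore-endpoint: "mysql://k3s:change-me@tcp(192.168.1.10:3306)/k3s"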


I've seen plenty of single-node K8s setups that are actively used and maintained. If you're more after the IaC and K8s API and less concerned about raw HA, then it's a workable solution if you have enough RAM and CPU.

Some Kubernetes-based distributions, like OpenShift, require 3 masters at minimum, but that's not the norm across K8s installations.


Nice read.

About building your own NAS to replace the Synology: take it from someone who has done this, don't bother.

While rather easy to do, Synology provides so much more “out of the box”. My own NAS ran well on Debian 10 for a few years, until I started getting random disconnects on drives and various other “stability” issues. Synology costs more up front, but after initial configuration it’s more or less just a box in the corner you forget about. Mine just sits there, automatically installing software updates whenever they’re available.

As for my own current setup, I run my hypervisor (Proxmox) on a Dell PowerEdge T30. It runs an internal docker host (Debian 10) and an external FreeBSD host on a DMZ vlan for anything that is accessible from the internet. Everything on that box runs in its own jail.

All storage except OS storage is mounted from the Synology via kerberized NFSv4.

Firewall is handled by a Netgate SG-3100 running pfSense, but I'm in the process of migrating to a UniFi Dream Machine instead. pfSense has been good to me, but the SG-3100 is expensive (for what it delivers), and it's beginning to show its age. I have a 300/300 Mbit connection, and with Suricata enabled I get frequent reboots because it uses too much CPU, causing the watchdog to think it's stuck. The UDM is half the price and twice the hardware, and for my usage (router/firewall/VPN/DNS blocking) the UDM does it all (Pi-hole through a 3rd party solution).


> While rather easy to do, Synology provides so much more “out of the box”. My own NAS ran well on Debian 10 for a few years, until I started getting random disconnects on drives and various other “stability” issues. Synology costs more up front, but after initial configuration it’s more or less just a box in the corner you forget about. Mine just sits there, automatically installing software updates whenever they’re available.

Does your Synology use Btrfs? If so, what is your experience with it? I'm currently using a FreeNAS box running ZFS that has been very reliable for me over the years. I can't really see giving up the power/reliability of ZFS to move to a Synology setup, but I like to keep tabs on Btrfs progress.


Not op but I have several Synology NASen using Btrfs volumes with no issues that I can recall in the past six years. I’ve replaced all of the drives twice to increase available storage and that’s it. The two I have at home are both set to autoupdate and do monthly disk scrubbing—they automatically email after scrubbing with any issues. They’re really pretty hands off devices overall.


Thanks for the advice about the NAS. Perhaps it's best to leave critical data and backups to a professional product.

As for the firewall, I had a Ubiquiti Security Gateway for a while and it worked great. Not sure how it compares to the UniFi Dream Machine, however. The only reason I made a custom firewall is because I wanted to try and learn a bit about firewalls and routing in general. And, since my own device has worked well for me so far, I haven't bothered to replace it.


Be careful, I made the switch from OPNsense to the UDM recently and it’s pretty half-baked. Some menus crash 100% of the time, some features are missing randomly (you can’t use the UDM as a DNS server). Just weird little things. They are in the middle of a big transition, so I’m guessing it has to do with that.


Yeah, I'm right here as well. I built my own NAS first with FreeNAS and had enough issues with it that I eventually caved; I'm now three Synologys deep into my homelab environment and couldn't be happier.


Running FreeNAS gives you a lot more flexibility and can be just as easy to maintain.


Why Proxmox over ESXi?


Not the GP, but as an ESXi user considering converting to Proxmox, there are a number of limitations on the free ESXi that can be annoying, largely around the non-standard and generally poorly supported tooling required to manage the platform unless you have a paid vCenter instance.

I can't easily create new VMs, create snapshotted/linked clones, and I can't easily pull out CPU/RAM/disk utilisation information from ESXi for the hypervisor and all VMs to store in a time series database for monitoring/alerting purposes.

There's also a tonne of additional features in Proxmox that require additional licensing on ESXi (and yes, I am aware of VMUG EE licensing making a lot of that more affordable).

Proxmox ultimately being Debian based gives a lot more flexibility in that regard. It's also a potential weakness as a result, depending on perspective.


Thanks, appreciate the response. Wasn’t aware of Proxmox; gonna do some more research!


I'm not sure if it's just because of a change in my own mental filter or due to an actual increase in this type of content, but it feels like I've started noticing more posts and general information about homelabs showing up in my various feeds recently. If that's the case (for either an actual increase or a change in my mental filter), I wonder if it's related to the ongoing pandemic.

Maybe since people are suddenly spending more time at home, there are certain resources that they'd normally have access to at work that they don't have access to anymore and need/want to create themselves, or they are suddenly more reliant on their own infrastructure, or maybe just spending more time at home leads to interacting with their home set-up more often, which leads to more tinkering with it. I know I've put much more effort into working on my home dev set-up this year than I did last year.


Personally, at least, being at home more has motivated me to improve the setup. Also, a growing distrust of US-based services is inspiring me to host more myself.


Is there really such a need to use VMs? Unless you plan to keep Windows or some exotic isolated desktop stuff, I think there is little value in using VMs. LXC/Docker/jails should satisfy most self-hosting homelab needs.

I got a 4-bay Synology NAS with an Intel CPU & 8 GB RAM which nicely runs Docker; some Reddit users even boost it up to 16 or 32 GB RAM. Plus an OpenWRT router. The entire "rig" eats like 40 watts.


+1 for LXC/Docker. Migrated off of Proxmox which, although wonderful, was more admin overhead than IaC scripting. Want to check out K3s though.


Out of curiosity, why are you running 2 k8s nodes on a single physical machine? Why not just run k8s on the physical machine? K8s should be able to manage all the processes correctly for you, and in addition you won't have the resource overheads of 2 VMs.

The only reason I can see having a VM as an abstraction layer is if your hypervisor runs in a cluster and it's able to move volumes across nodes, but even then, it feels like having a single VM running on the nodes that take up the full physical resources makes more sense.


Haha I asked the exact same question at the same time. My own speculation is that it might be convenient if you need to reboot the VM for some reasons?


Or maybe half of the nodes are master-only, which avoids running workers on the same node as the master; probably a good idea.


Hey, I like this idea! Actually, I only have a single node running the control plane. And, to answer the original question, it felt like overkill to dedicate an entire NUC to running just the control plane. So, I decided to run two k8s nodes on each NUC. I guess there's potentially a bit of a performance hit due to the extra overhead of running multiple kubelets, but it hopefully won't matter too much in practice.


How about expanding with 3 SBCs for your master nodes? The Odroid C4 is a good candidate for this at $50 each. Or a Rock Pi 4 or PC Engines APU if you want to spend extra on NVMe or SATA storage instead of eMMC. (Raspberry Pi omitted because, at least from my findings, there’s no attractive USB root storage in SFF, so if you want performance and reliability you eventually end up with USB-to-SATA and 2.5” drives, or USB NVMe enclosures.)


I really like projects like this since it feels true to the vision of the Internet as a distributed network of peer nodes, instead of the modern incarnation of centralized servers vs. consumer clients. Email was notably absent which makes sense given the challenges of self-hosting a mail server [1] but I hope someday that too will become easier.

[1]: https://news.ycombinator.com/item?id=22789401


"Ansible-Defined Homelab" describes exactly what HomelabOS does. https://homelabos.com/


I love stuff like this. I’ve learned so much over the years doing things like this outside of the work environment where everything needs a reason, e.g. https://github.com/CraigJPerry/home-network - junk hacking for fun.

Prioritising time for this kind of thing seems to go in cycles for me. E.g. I remember spending a few weeks back around 2005 setting up a home lab “just so”. I still have the notebook I wrote up at the time. Loads of knowledge in there I’ve since forgotten, about setting up LVM on AIX hosts with smitty and pinouts for converting Cat 5 & DB9s into null modem cables for managing Cisco IOS on old switches. I can barely remember config t and enable mode these days...

There’s a real satisfaction to be had from this kind of thing. Serial consoles on everything, jumpstart configs dialled in just right to be able to rebuild any node on the home network hands off.

Then around 2013 was the next time I got an urge (the Ansible repo above), and most recently it’s been k8s experiments. Interesting that it’s about 7-year cycles.


I like this and, eventually, want to get my homelab/homeprod stuff to that level. The only thing that seems to be missing, which is an unfortunate limitation of the free ESXi, is the ability to create the VMs.

My ideal solution would be to create configuration for a VM and its applications and run a command to have the VM created, OS installed, applications installed, etc. Taking it a step further, being able to regenerate my reverse-proxy configuration, certificates, and DNS configuration would make lighting up new services incredibly easy.

Right now I'm running my VMs on FreeBSD and managing them with vm-bhyve. There are possibilities for automation here, and I've done a little bit of it (at work) using Kickstart to install CentOS VMs and shell scripts to do the rest. Unfortunately, this is all very purpose-built - if I were to do it again, I'd probably scrap it in favour of Ansible or something similar.

Obviously I've got a long way to go to get to my ideal.


You could definitely achieve this with a whole plethora of tools like Terraform, Packer, and some kind of CI. But at some point you have to draw a line and ask what is ‘good enough’ for a homelab.

For me, that line is drawn at hand installation of a particular OS and then conversion into a template. From then on, I can reference that template in Terraform or Ansible automation and clone it before customising it to the requirements I have that day.
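As a sketch of that clone-from-template step with the community.vmware collection (every hostname, credential, and VM/template name below is a placeholder):

    - name: Clone a VM from the hand-built template
      community.vmware.vmware_guest:
        hostname: vcenter.lab.local             # placeholder vCenter address
        username: administrator@vsphere.local   # placeholder credentials
        password: "{{ vcenter_password }}"
        validate_certs: false
        datacenter: homelab                     # placeholder datacenter name
        name: nextcloud-01                      # placeholder VM name
        template: debian10-template             # the hand-installed template
        state: poweredon
      delegate_to: localhost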

The only requirement for using these tools against VMware is a vSphere instance. I’m sure if you look hard enough you can find that quite cost-effectively on eBay.

And just remember that your infra is never finished. It’s a journey! And the journey is often more fun than the destination! Happy labbing friend.


For me, a lot of this is driven by wanting to know how things work.

For example, I can go to Digital Ocean or Oracle Cloud or Azure or AWS and get a VM provisioned in a minute or so. I know some of the tools involved, like cloud-init, but I'd really like to know how the rest of it works and to see it in action.

Honestly though, the number of times I need to build a new VM is very small. The time savings that come with each individual VM build will never recoup the time spent building out the pieces required to make it happen. But knowing how that all works will be worth it.


Yeah, unfortunately the free ESXi makes it difficult to "programmatically" create VMs. However, I decided that it was an acceptable trade-off for now. It would be super cool to run a script with "nothing" running on your network and have it set up everything from the ground up. Perhaps I'll manage to get there someday as well.


I usually use "lab" to refer to the opposite of production. My lab is a space where I can change anything at any time and no users are going to be affected. But it seems like this article uses "lab" to mean "my critical services."


Yeah, I guess it's somewhat "half-and-half" currently. The k8s cluster is definitely an experimental zone. The spare cycles on the main ESXi host are also available for testing out ideas, etc. However, I guess a benefit of having regular backups and an easy way to re-provision the services with ansible means that the risk (and cost) of breaking something is fairly low.


@0xC45: as a fellow OpenBSD user you might find my Ansible roles useful.

I also run OpenBSD HA firewall clusters with trunking (aggr), VLANs at home and work:

- https://github.com/liv-io/ansible-roles-bsd/tree/master/host... # interface configuration

- https://github.com/liv-io/ansible-roles-bsd/tree/master/open... # firewall configuration



Genuine question: what's the benefit of having 2 nodes per physical machine, vs having a bigger node on each machine?


You can only run a certain number of pods per node before the kernel craps out. Depending on your K8s version somewhere in the 250-500 range.

Admittedly, this is unlikely to be an issue at home lab levels. Therefore one must suspect the “because I can” methodology is at play here.


Isn’t the official limit 150 for vanilla K8s? Google defaults to a 30 pod limit iirc.


It felt like overkill to dedicate an entire NUC to running the K8s control plane. So, I split each NUC in half. Only one node is running the control plane (and I have prevented any other workloads from running on the control node as a best practice). The other 5 nodes are available for running whatever workloads. There's probably a bit of a performance hit from running two separate OSes and kubelets on each NUC, but I'm hoping it won't matter too much in practice.
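Keeping workloads off the control node usually comes down to the standard NoSchedule taint that kubeadm already places on masters; a sketch of what that looks like on the Node object (node name is a placeholder):

    apiVersion: v1
    kind: Node
    metadata:
      name: nuc1-control                        # placeholder node name
    spec:
      taints:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule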


Very cool. The only thing is that Kubernetes at home sounds very depressing to me.


I thought this was going to be like https://github.com/rycee/home-manager but it is more system-wide services.


k8s at home. Some people just love pain.


I really have to disagree. If you're going the route of hosting stuff in your home, then by all means take an easy k8s distro like Rancher. I have stuff running from openHAB and Nextcloud to even my UniFi controller. It runs flawlessly, and together with Rook, I don't have to be afraid of node and disk failures. I haven't learned so much in 2 years since I started with Linux 20 years ago.

Personally I think the amount of pain wasn't enough, so I'm using Canonical MAAS, and I can just spin up extra servers, though going beyond 4 nodes seems kind of wasteful in a home setting. My rig draws about 100 watts, which for hobby purposes is okay-ish imo. I've been running this since Ubuntu 16.04, and upgrading to 18.04 has been a relative breeze compared to a situation where I would've built this using ad-hoc scripts (i.e. a non-k8s setup).

I did make a WireGuard tunnel to a colocation datacenter for offsite backups, using async mirroring of Rook/Ceph. Because with so many moving parts and automation, wiping a cluster accidentally is quite possible.


Haha. I'm a little bit lacking in my "k8s ops" skills and hope to learn by practicing! Currently there isn't any serious "production" workload running on the k8s cluster. Perhaps if I become more comfortable with managing the k8s cluster, I will migrate my other services onto k8s. We'll see!


I had the same thought. I myself used LXD for my personal server and I can't regret it enough.

Not that I dislike LXD, but a well-configured server won't require many interventions. In my case, I was about to upgrade some guest OS, and I had to relearn lots of things and remember what I did, and even why, after about two years without touching it.


After a couple of times in my life with similar pains, I’m now very disciplined with having everything sans data and secrets in git repos with Ansible playbooks and Terraform templates.

Even if it’s just for me I still try to write reasonable commit messages and all that.



