Sorry to be the "grumpy old man", but I don't get it.
In my experience, self hosting usually falls into two categories: people who "just want to host a simple static website", and people who want to "host all the things". Both are usually better served by the cloud.
Most are in the category "I want to learn about self hosting", meaning they have (close to) zero experience. Self hosting by itself isn't hard, but maintaining a secure environment is, and that's where many people fail.
For the "simple static website", you can host it for free pretty much anywhere you like. GitHub Pages, Azure Static Web Apps, and countless others all offer stable, professional cloud services for free, without the risk of exposing your network to the internet.
For the "host all the things" crowd, you see people attempting to mimic an entire data center at home, complete with monitoring, CI/CD and everything. While I appreciate the learning experience, most people are blissfully unaware of the chore it is to maintain such a thing. Most of these services are better off being in the cloud.
- Email is a likely thing people will want to self host, which is also perhaps the most stupid thing to do. First of all, it is a chore to keep your server off the various blocklists, and you gain nothing but pain by self hosting it. Email is insecure by design. Every email has at least 2 parties, the sender and the receiver, and with >50% of the world's recipients running on Google/Microsoft/Yahoo/whatever, your email will get indexed. A much better alternative is letting someone who knows what they're doing host your email, and using a personal domain. That way you can move your MX records if need be, keep your email address when changing provider, and let someone else deal with the problems of running the service. If it's privacy you want, use something else, or use encryption. In both cases, self hosting gives you nothing additional.
- Cloud storage is another contender, and the most common "excuse" is that cloud hosting is too expensive. Yes, if you plan to store 200TB in the cloud then it is, but maybe you should instead look at which files you need when away from home, use the cloud for those, and leave the rest at home, accessible by VPN if need be. If you need privacy with cloud file hosting, something like Cryptomator (https://cryptomator.org/) is much easier/better than maintaining your own server. (As a side note, you can get around 20TB of cloud storage for €20/month, which is roughly the price of the electricity required to run a 4-drive NAS for the same time, not including the cost of hardware.)
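To put the electricity side note into something you can check against your own bill, here is a back-of-the-envelope sketch. The 60 W average draw and the €0.40/kWh price are assumptions picked for illustration, not measurements, so plug in your own numbers.

```python
def monthly_electricity_cost(avg_watts: float, eur_per_kwh: float,
                             hours: float = 24 * 30) -> float:
    """Cost in EUR of running a device that draws avg_watts for `hours` hours."""
    kwh = avg_watts / 1000 * hours
    return kwh * eur_per_kwh


# Assumed figures (not from any spec sheet): a 4-drive NAS averaging 60 W,
# with power priced at EUR 0.40 per kWh.
nas_cost = monthly_electricity_cost(avg_watts=60, eur_per_kwh=0.40)
print(f"NAS electricity: about EUR {nas_cost:.2f} per month")
```

The result swings a lot with local power prices and how often the drives spin down, so treat it as a rough comparison only.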
No matter your setup at home, you will never create something as resilient as the major cloud datacenters. For example, OneDrive (paid version) stores your files across multiple geographically separated data centers, using erasure coding, so if one data center dies, your files are still available in another center, and are hastily being replicated to a third center. It uses atomic writes (like CoW filesystems, ZFS, Btrfs, APFS, etc) to ensure data written is correct, and has checksumming (inherent in the erasure coding), as well as versioning of files (OneDrive has unlimited file versioning for 30 days rolling), meaning you get at least some ransomware protection.
So in the end, most people are way better served by putting their stuff in the cloud and encrypting it than by exposing insecure services from home.
By all means, build the cluster at home as a learning experience, but save yourself some trouble and keep it within your LAN. If you need to access it externally, use a VPN instead. With modern VPNs like WireGuard, there is very little overhead, and your data will thank you for it (as will your family, as you suddenly have a lot more time to spend with them!).
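For anyone wondering what erasure coding buys you, the core idea fits in a few lines. The sketch below is single-parity XOR, the simplest possible case: it survives the loss of exactly one shard. It is not what OneDrive or any other provider actually runs (production systems use stronger codes that tolerate more than one loss), and the function names are made up for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(shards: list[bytes]) -> bytes:
    """Parity shard: the XOR of all (equally sized) data shards."""
    return reduce(xor_bytes, shards)


def recover(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing shard from the survivors plus the parity."""
    return reduce(xor_bytes, surviving, parity)


# Three data shards, imagine each stored in a different location.
shards = [b"home", b"labs", b"rock"]
parity = make_parity(shards)

# Pretend the location holding shards[1] died; rebuild it from the rest.
rebuilt = recover([shards[0], shards[2]], parity)
assert rebuilt == shards[1]
```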
Sending emails is hard because of the blocklists, but receiving (and storing) is easy. You can self-host the latter and delegate the former to a company specialized in it.
> No matter your setup at home, you will never create something as resilient as the major cloud datacenters
That's true for hardware failures, but there are weekly stories on HN about major cloud providers randomly deleting people's accounts, so you need to set up off-site backups yourself either way.
> You can self-host the latter and delegate the former to a company specialized in it.
But what would you gain from it? You gain no privacy, no improvement of service, no added resilience (probably less). All you gain is additional work maintaining your services.
> so you need to set up off-site backups yourself either way
You will always need backups, regardless of whether you self host or use the cloud, but by not opening up your firewall you will have a much more secure home network, especially if you're not a security expert, which many people are not.
My personal mail backup consists of running a local Dovecot instance that is synchronized every x hours using imapsync, and the Dovecot mailstore is then backed up by my normal backup jobs every x hours. My provider does support an API for downloading a backup, but it's slower and more error-prone than simply synchronizing the email locally, and if need be, I can access my mailbox locally.
Restoring it in case of data loss or changing provider is also "easy", as I can simply reverse the imapsync.
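A minimal sketch of what "reverse the imapsync" means in practice: the same command with source and target swapped. The hostnames, usernames and passfile paths below are placeholders, and the `Mailbox` helper is invented for this example. Only imapsync's basic `--host1/--user1/--passfile1` and `--host2/--user2/--passfile2` options are used; your provider may need additional TLS or folder options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mailbox:
    host: str
    user: str
    passfile: str  # path to a file containing the password


def imapsync_command(source: Mailbox, target: Mailbox) -> list[str]:
    """Build an imapsync argument list copying mail from source to target."""
    return [
        "imapsync",
        "--host1", source.host, "--user1", source.user,
        "--passfile1", source.passfile,
        "--host2", target.host, "--user2", target.user,
        "--passfile2", target.passfile,
    ]


# Placeholder values, not real accounts.
provider = Mailbox("imap.example.com", "me@example.com", "/etc/imapsync/provider.pass")
local = Mailbox("localhost", "me", "/etc/imapsync/local.pass")

backup_cmd = imapsync_command(provider, local)   # provider -> local Dovecot
restore_cmd = imapsync_command(local, provider)  # local Dovecot -> provider
print(" ".join(backup_cmd))
```

Either list can be handed to `subprocess.run(cmd, check=True)` from a cron job or systemd timer to get the "every x hours" behaviour.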
1. You don't reply to many incoming emails, so they would never be seen by the MTA
2. Even when replying, you don't necessarily include the whole original email in your response (though that's only a very minor improvement)
3. MTAs normally don't store emails, and it would be expensive for them to do so as you aren't paying them for storage. This protects your past and present emails from the MTA turning evil (or being hacked) in the future.
> by not opening up your firewall, you will have a much more secure home network, especially if you're not a security expert, which many people are not.
Even consumer-grade routers support DMZs. With the right instructions, it's possible to only open the firewall to the server and keep it out of the home network.
> You don't reply to many incoming emails, so they would never be seen by the MTA
Every email has a sender and a receiver, and if either of those parties is on cloud hosted email, your email will be seen by an MTA, and you can bet your life that Google/Microsoft/whatever will snatch your recipient's email and catalog it.
As I said, if you want privacy use something else, like Signal for instance, or use encryption, in which case it doesn't matter where you store your emails, as any information you expose is already exposed by the protocol itself (sender/recipient/topic/date/source IP/destination IP/etc).
> MTAs normally don't store emails, and it would be expensive for them to do so as you aren't paying them for storage. This protects your past and present emails from the MTA turning evil (or being hacked) in the future.
So do backups, but with much lower maintenance, and more security/resilience, simply by being "offline".
> Even consumer-grade routers support DMZs. With the right instructions, it's possible to only open the firewall to the server and keep it out of the home network.
Ask your friends what a DMZ is. I'm certain that most people on HN will know what it is, and a large share will probably also know how to set it up, but then the issues with hairpin NAT, name resolution, and other stuff start cropping up, which is where some people just give up and instead expose it from the LAN so that they can access it from home as well.
Expand the scope to also include VLANs and you have an even smaller group.
Next up is that many people will happily use the same server for exposing services to the internet as well as for internal stuff, so now you're just one CVE away from having everything on your server encrypted/deleted/leaked. You can partially mitigate that by using jails/containers, but that's another layer you need to familiarize yourself with, with the risk of once again getting it wrong.
Most people would be much better off just setting up a VPN and using that to access their home network, and letting professionals worry about securing services.
Edit: I should add that I'm not against people setting up servers at home for experimenting/learning; it's only when they expose those services to the internet that it bothers me.
There is certainly value in experimenting with stuff in a homelab, but when there are so many free services available that do stuff better than almost any reasonable homelab can hope to, there is very little point in accepting the additional risk.
In the case of a "static webpage" use case, you can publish that for free with GitHub, or you can choose to expose it from home, opening up your firewall as well as ports to your server. Congratulations, you're now a network and system administrator, as well as responsible for maintaining SSL/TLS certificates and ensuring uptime (assuming the webpage has value; otherwise why bother publishing it in the first place?).
In the "free" package you get resilient infrastructure with redundancy on every level (power/internet, hardware, software, services), you get a professional staff that babysits services, and you don't have to worry about anything except creating the content you want to publish.