
Namespaces handle most of these issues. A NetworkPolicy can prevent pods in a namespace from initiating connections to, or accepting connections from, other namespaces, forcing all traffic through an egress gateway (which can have a well-known IP address, though you probably want mTLS that the ingress gateway on the other side can validate; Istio automates this, and I believe it comes set up for free in GKE). Namespaces also help isolate pods from the control plane: just run the pod with a service account that lacks the permissions that worry you, or block its communication with the API server entirely.
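
A rough sketch of the NetworkPolicy half of that, for anyone who hasn't used them (namespace and label names here are made up, a real policy would also need an egress rule for DNS, and kubectl accepts JSON, so the output can be piped straight into kubectl apply -f -):

    import json

    # Sketch: lock pods in a (hypothetical) "team-a" namespace down to
    # same-namespace traffic, plus egress to an assumed "egress-gateway"
    # namespace. A real policy also needs an egress rule for kube-dns.
    policy = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": "team-a"},
        "spec": {
            "podSelector": {},                    # every pod in team-a
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [
                {"from": [{"podSelector": {}}]}   # only peers in this namespace
            ],
            "egress": [
                {"to": [{"podSelector": {}}]},    # same-namespace traffic
                {"to": [{"namespaceSelector": {   # the egress gateway's namespace
                    "matchLabels": {"kubernetes.io/metadata.name": "egress-gateway"}
                }}]},
            ],
        },
    }

    print(json.dumps(policy, indent=2))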

GKE can also run pods under gVisor, which keeps the pod from talking directly to the host kernel, even maliciously. (The feature is called GKE Sandbox.)
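
Opting a pod into the sandbox is just a runtimeClassName on the pod spec. A minimal sketch, assuming a node pool with GKE Sandbox enabled (the image and names below are placeholders):

    import json

    # Sketch: a pod that runs under gVisor on a GKE Sandbox-enabled node
    # pool. GKE creates the "gvisor" RuntimeClass when the sandbox is
    # enabled; the image and names below are placeholders.
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "sandboxed-app", "namespace": "team-a"},
        "spec": {
            "runtimeClassName": "gvisor",  # run under the gVisor sandbox
            "containers": [{
                "name": "app",
                "image": "gcr.io/example/app:latest",  # placeholder image
                "resources": {"requests": {"cpu": "250m", "memory": "256Mi"}},
            }],
        },
    }

    print(json.dumps(pod, indent=2))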

The only reason to use multiple clusters is if you want CPU isolation without the drawbacks of cgroups limits (i.e., awful 99%-ile latency when an app is being throttled), or you suspect bugs in the Linux kernel, gVisor, or the CNI. (Remember that you're in the cloud, and someone can easily have a hypervisor 0-day, and then you have no isolation from untrusted workloads.)
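
For what it's worth, the throttling half of that is easy to observe from inside a container. A minimal sketch, assuming the usual /sys/fs/cgroup mount (handles both the cgroup v1 and v2 file layouts):

    from pathlib import Path

    # Sketch: report how often this container's cgroup has been CFS-throttled.
    # cgroup v2 exposes /sys/fs/cgroup/cpu.stat (throttled_usec);
    # cgroup v1 exposes /sys/fs/cgroup/cpu/cpu.stat (throttled_time, in ns).
    def throttle_stats() -> dict:
        for path in ("/sys/fs/cgroup/cpu.stat", "/sys/fs/cgroup/cpu/cpu.stat"):
            p = Path(path)
            if p.exists():
                pairs = (line.split() for line in p.read_text().splitlines())
                return {k: int(v) for k, v in pairs}
        return {}

    if __name__ == "__main__":
        stats = throttle_stats()
        periods = stats.get("nr_periods", 0)
        throttled = stats.get("nr_throttled", 0)
        if periods:
            print(f"throttled in {throttled}/{periods} CFS periods "
                  f"({100.0 * throttled / periods:.1f}%)")
        else:
            print("no CFS throttling stats (no CPU limit, or unexpected cgroup layout)")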

Cluster-scoped (non-namespaced) resources are also a problem, though not too prevalent.

Overall, the biggest problem I see with using multiple clusters is that you end up wasting a lot of resources because you can't pack pods as efficiently.




Aware of all of this, but we need to run things more or less identically across GKE/EKS/AKS, and gVisor can't be run in EKS, for example.

We're okay with the waste as long as our software & deployment practices can treat any hosted Kubernetes service as essentially the same.



For those who didn't click through: I believe the parent is demonstrating that it is considered a best practice to have many clusters for a variety of reasons, such as "Create one cluster per project to reduce the risk of project-level configurations".


For robust configuration, yes. However, one can certainly collapse/shrink if having multiple clusters is going to be a burden cost-wise and operations-wise. This best practice was modeled on the most robust architecture.


This is it exactly.

Thank you.


Namespaces are not always well suited to hermetically isolating workloads.


It's probably not worth $75/month to prevent developer A's pod from interfering with developer B's pod due to an exploit in gVisor, the Linux kernel, the hypervisor, or the CPU microcode. Those exploits do exist (remember Spectre and Meltdown), but they probably aren't relevant to 99% of workloads.

Ultimately, all isolation has its limits. Traditional VMs suffer from hypervisor exploits. Dedicated machines suffer from network-level exploits (network card firmware bugs, ARP floods, malicious BGP "misconfigurations"), etc. You can spend an infinite amount of money while still not bringing the risk to zero, so you have to deploy your resources wisely.

Engineering is about balancing cost and benefit. It's not worth paying a team of CPU engineers to develop a new CPU for you because you're worried about Apache interfering with MySQL; the benefit is near-zero and the cost is astronomical. Similarly, it doesn't make sense to run the two applications in two separate Kubernetes clusters. It's going to cost you thousands of dollars a month in wasted CPUs sitting idle, control plane costs, and management overhead, while only protecting you against the very rare case of someone compromising Apache because they found a bug in MySQL that let them escape its sandbox.

Meanwhile, people are sitting around writing IP whitelists for separate virtual machines because they haven't bothered to read the documentation for Istio or Linkerd, which they get essentially for free and which actually add security, observability, and protection against misconfiguration.
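
To make the "actually add security" part concrete: turning on mutual TLS for the whole mesh in Istio is a single resource. A sketch, assuming a stock Istio install with istio-system as the mesh root namespace:

    import json

    # Sketch: a mesh-wide Istio PeerAuthentication that rejects plaintext
    # (non-mTLS) traffic between sidecar-injected workloads. Placing it in
    # the mesh root namespace (istio-system by default) with no selector
    # applies it to every namespace in the mesh.
    peer_auth = {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": "istio-system"},
        "spec": {"mtls": {"mode": "STRICT"}},
    }

    print(json.dumps(peer_auth, indent=2))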

Everyone on Hacker News is that 1% with an uncommon workload and an unlimited budget, but 99% of people are going to have a more enjoyable experience by just sharing a pool of machines and enforcing policy at the Kubernetes level.


It doesn't have to be malicious. File descriptors aren't part of the isolation offered by cgroups: a misconfigured pod can exhaust FDs on the entire underlying node and severely impact every other pod running on that node. Network isn't isolated either; you can saturate a node's network by downloading a large amount of data from, say, GCS or S3 and impact all pods on the node.

I agree with most of what you've said about gVisor providing sufficient security, but it's not just about security; noisy neighbors are a big issue in large clusters.
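
The FD point is easy to demonstrate: the kernel's file table is one node-wide pool, while RLIMIT_NOFILE is only per-process, and nothing in cgroups caps the sum across pods. A small sketch that reads the standard /proc files (run it on the node, or in a pod with the host's /proc mounted):

    import resource
    from pathlib import Path

    # Sketch: compare this process's FD ceiling (per-process RLIMIT_NOFILE)
    # with the node-wide file table (/proc/sys/fs/file-nr and its max).
    # Many processes each staying under their own rlimit can still
    # exhaust the node-wide table and starve every other pod.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    allocated, _unused, node_max = (
        int(x) for x in Path("/proc/sys/fs/file-nr").read_text().split()
    )

    print(f"this process may open up to {soft} FDs (hard limit {hard})")
    print(f"node-wide: {allocated} file handles in use out of {node_max}")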


IOPS and disk bandwidth aren't currently well protected either.


RLIMIT_NOFILE seems to limit FDs, or am I missing something?


CRDs can't be safely namespaced at the moment, as I understand it.
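
Right: even when a CRD's instances are namespaced, the CustomResourceDefinition object itself is cluster-scoped, so every tenant shares one definition (and one schema). A sketch with a made-up Widget type to illustrate:

    import json

    # Sketch: spec.scope makes *instances* of this (made-up) Widget type
    # namespaced, but the CustomResourceDefinition object itself is
    # cluster-scoped, so all tenants in the cluster share this one
    # definition and its schema.
    crd = {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": "widgets.example.com"},  # must be <plural>.<group>
        "spec": {
            "group": "example.com",
            "scope": "Namespaced",                    # instances live in namespaces
            "names": {"plural": "widgets", "singular": "widget", "kind": "Widget"},
            "versions": [{
                "name": "v1",
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "x-kubernetes-preserve-unknown-fields": True,
                }},
            }],
        },
    }

    print(json.dumps(crd, indent=2))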



