Launch HN: Infra (YC W21) – Open-source access management for Kubernetes (github.com/infrahq)
159 points by jmorgan on May 17, 2022 | 58 comments
Hey HN! We’re Jeff and Michael, and we’re building Infra (https://github.com/infrahq/infra). Infra is a tool for managing access to cloud infrastructure. We’re starting with Kubernetes and have a roadmap to support Postgres, SSH and much more.

Michael and I were the co-founders of Kitematic, an easy way to run Docker on the desktop. We sold the company to Docker and built Docker Desktop while there. After that, we worked on Infra App, a Kubernetes client for Mac, Windows and Linux. Between what users told us and our time at Docker, it became obvious that managing infrastructure access was becoming increasingly painful.

Many larger teams don’t give access to developers, while smaller teams often just grant admin access to everyone. Teams in between either build extensive tooling in-house (e.g. Segment’s Access Service), or they end up spending a lot of time manually onboarding and offboarding team members with the right permissions. We wanted to help teams securely distribute access using the principles of least privilege to their infrastructure systems without managing certificates, keys or integrations with identity providers.

With Infra, access is granted or revoked via an API or CLI, and in the background, Infra takes care of provisioning users & groups with the right permissions no matter where the cluster is hosted (EKS, GKE, AKS, or other managed/self-hosted Kubernetes clusters). When users need access, Infra distributes short-lived credentials that expire after a short period. For larger teams, Infra integrates with identity providers like Okta to automatically give access via existing accounts.

Credentials are signed and verified by a central root of trust with a short time to live, so they are easily revoked or rotated when necessary. Infra doesn’t rely on a single point of failure. Other tools in this space use a centralized proxy to verify credentials, whereas Infra instead verifies them at the destination infrastructure. Access continues to work should Infra’s API or the configured identity provider go down temporarily. For clusters hosted in different regions, this means users won’t suffer from slow connections from being proxied.
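To make the verify-at-destination idea concrete, here is a minimal sketch of a short-lived signed credential that can be checked entirely at the destination, with no call back to a central API. This is illustrative only and not Infra's actual implementation: it uses a shared HMAC key for brevity, whereas a real deployment would use asymmetric signatures so destinations only need the root's public key.

```python
import base64
import hashlib
import hmac
import json
import time

# Root-of-trust key distributed to each destination ahead of time.
# (Illustrative: a real system would use a public/private key pair.)
ROOT_KEY = b"example-root-of-trust-key"

def issue(user: str, ttl_seconds: int = 300) -> str:
    """Issue a short-lived credential signed by the root of trust."""
    claims = {"user": user, "exp": time.time() + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(ROOT_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify(token: str):
    """Verify locally at the destination -- no central server involved."""
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(ROOT_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered, or signed by an untrusted authority
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < time.time():
        return None  # expired: "revocation" is simply not re-issuing
    return claims

token = issue("jeff")
assert verify(token)["user"] == "jeff"
assert verify(token + "x") is None  # a bad signature is rejected locally
```

Because verification needs only the key and a clock, access keeps working even if the issuer is temporarily unreachable, which is the availability property described above.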

Infra is a lightweight service written in Go, uses <100MB memory at rest, and is deployed by default with SQLite.

There are a few existing tools that solve infrastructure access management, but they are sold directly to directors of engineering or security teams and can’t be easily deployed by smaller teams without an expensive sales contract. We set out to build a product that teams of any size can pick up, self-host, deploy and build custom tooling on top of without fretting about a sales conversation.

Our GitHub repo is at https://github.com/infrahq/infra which contains the full product that can be self-hosted via Docker or Kubernetes. We plan to make money by running a managed service version of Infra so teams don’t need to host and upgrade Infra manually. We don’t have pricing for this yet but will charge in a way that scales with usage, so even smaller teams (<10) can use it.

Our team includes an early VMware employee whose work is used in all VMware ESXi installs, an engineer from Hashicorp who was a large contributor to Consul, the original developer evangelist from Datadog, and the engineer who built 1Password’s cloud service, 1Password for teams.

We started building Infra a year ago and have been quietly iterating on it with a few teams of various sizes, ranging from 5 developers to public companies. We’re so happy to be able to share it with you and can’t wait to hear your feedback and thoughts!




I work at a company that provides a SaaS platform for running Kubernetes clusters across edge, private and public clouds, and IAM for cloud infrastructure is a massive issue. My team and I have spent significant amounts of time digging into this issue and connected with Infra over 12 months ago. This is a two-layer problem: providing consistency across clouds for DevOps and platform teams, and ease of use for end users.

Infra is solving this very complex issue by addressing both and I like the approach the team has taken.

Simplifying IAM for platform teams while maintaining native controls such as RBAC across multiple clusters using a single distribution of Kubernetes is crucial. Any layer on top of Kubernetes RBAC makes it impossible to remain open and portable. Solving this across different clouds, be it the hyperscale providers or any on-prem DIY setup, is even more complex; OIDC support is one example.

Further, issues arise when you want to provide developers self-service access to clusters. Current in-market options are limited or require separate tools for separate clouds, AWS IAM for example, or result in further in-house/DIY development.

Can't wait to see where this goes next.


Why should someone use this instead of Vault, Teleport, or other PAM tools? Does it work with managed Kubernetes offerings like AKS/EKS/GKE?


Thanks for the questions! Vault doesn't have a deep integration to generate credentials for Kubernetes, and Infra plugs in to users' tooling (e.g. kubectl and Kubeconfig) to keep credentials up to date automatically.

Infra differs from Teleport in a few ways. Teleport's open-source project doesn't provide identity provider integrations beyond GitHub (e.g. Okta). Teleport also has a different architecture that involves deploying a centralized proxy service, whereas Infra verifies credentials at the destination infrastructure rather than at a central proxy. Further, we've designed Infra around an extensible REST API from the start, whereas Teleport uses gRPC.

Infra does work with managed Kubernetes services like AKS/EKS/GKE! It will also work with self-hosted Kubernetes clusters.


Sasha, CTO@Teleport here. Congrats on the launch!

RE: Teleport design

Teleport does not require a centralized proxy, because it is based on certificate authorities. You can issue a certificate with or without Teleport proxy and access any cluster that trusts that certificate directly.

Because of this design you can have a completely decentralized system, with cold storage for your CA, HSM or any parallel system issuing certificates. There is also no need to revoke your credentials, because your certs are short-lived and bound to the device and cluster, so there is less opportunity for pivot attacks.
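As a concrete illustration of that trust model (illustrative openssl commands, not Teleport's actual tooling): any endpoint holding only the CA certificate can verify a client certificate offline, with no call back to the issuer.

```shell
# 1. Create a throwaway certificate authority (the "root of trust").
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=demo-ca" -days 1

# 2. Issue a short-lived client certificate signed by that CA.
openssl req -newkey rsa:2048 -nodes -keyout client.key -out client.csr \
  -subj "/CN=jane"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out client.crt -days 1

# 3. Verify at the destination using only the CA cert -- issuer not needed.
openssl verify -CAfile ca.crt client.crt   # prints: client.crt: OK
```

Revocation falls out of the short lifetime: instead of maintaining revocation lists, you stop issuing new certificates and the old ones age out.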

RE: GRPC

The first version of Teleport also had an HTTP/JSON REST API, but we migrated to gRPC to support event streaming and to have one type system across multiple languages and service boundaries.

Re: Managed clusters

Teleport supports all CNCF-compatible clusters, including AKS, EKS and GKE out of the box.


Great point on gRPC having better support for event streaming! We originally built Infra with a gRPC API, but many users we spoke to didn't yet have load balancers or ingress controllers that supported the gRPC protocol (e.g. one user had to consider upgrading their AWS Load Balancer controller to put Infra behind it).

We wanted to remove as many hurdles as possible for teams to deploy Infra in their environments. Event streaming will invariably become an important part of the API (e.g. for features like audit logs), and we'll consider GRPC again for internal components of Infra.

RE using Teleport without the proxy, how would a target cluster's Kubernetes API server (e.g. an EKS cluster) verify certificates without Teleport's proxy?


> one user had to consider upgrading their AWS Load Balancer controller to put Infra behind it

Huh?

The AWS load balancer for which gRPC is relevant is their Application Load Balancer (ALB), which would require you to terminate TLS at the ALB and does not support mutual TLS (which is how short-lifetime client certificates work in this case). To the best of my knowledge, you can't pass through a client-key-encrypted gRPC session through an ALB (maybe I'm wrong?).

Typically this requires an NLB, which will treat all TCP traffic (REST and gRPC) the same, so gRPC wouldn't require an upgrade?


Re: GRPC

My bet is that you'd migrate to gRPC eventually as you scale :) I like the simplicity of HTTPS/JSON APIs as well, but it just broke down for us at a certain scale point.

Re: Teleport with EKS

True, CNCF clusters support mTLS out of the box, but EKS hides the endpoint and does not let you provision a CA to trust. You will have to run a Teleport proxy inside the EKS cluster to translate mTLS to EKS IAM auth. However, you don't have to have a centralized proxy; you can just deploy a Teleport proxy agent in each cluster and hide your K8s endpoint.

You also don't have to have a single Teleport proxy to do that.


Thanks! Curious, where did HTTP+JSON break down for you? Was it specifically around audit/event streaming? This would be helpful as we consider building out future updates to Infra, especially considering that tools like Kubernetes have put HTTP+JSON APIs to the test (at least in their user-facing APIs).

Indeed! EKS and others don't allow custom authentication methods or let you use an external CA for the cluster. Running a proxy agent in each cluster makes sense and is similar to how Infra approaches it; I hadn't seen that configuration in your architecture pages!

Have you considered distributing certificates signed by the cluster CA itself (to avoid proxies altogether)? From 1.22 onwards there's a new expirationSeconds field when creating a certificate signing request: https://github.com/kubernetes/enhancements/issues/2784 . I imagine this will be supported by all the hosted Kubernetes services - we've been watching this closely.
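For reference, here is roughly what a request using that field looks like (the object name and request body are placeholders):

```yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: jane-short-lived
spec:
  request: <base64-encoded PEM certificate request>
  signerName: kubernetes.io/kube-apiserver-client
  expirationSeconds: 600   # ask the cluster CA for a 10-minute certificate
  usages:
    - client auth
```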


this looks like a centralized proxy to me: https://goteleport.com/docs/architecture/proxy/

are you saying that because you can have multiple proxies, they aren't centralized? or that at least this is one mode you can use, but the standard one is using a proxy?


Teleport consists of a couple of components:

* Proxy is used to handle SSO, Web UI and intercept traffic for session capture. You can have one proxy per your organization, multiple proxies or, if you don't want to intercept traffic, no proxies at all.

* Auth server is used to issue certificates and send audit logs and session recordings to external systems.

* Nodes (end-system agents) are sometimes helpful, but not required. For example, if you want to capture system calls in your SSH session, you can deploy a node. Or you can use OpenSSH with Teleport if you wish.

Because Teleport is based on certificate authorities, the following deployments are possible:

* One, "centralized" HA pair of proxies intercepting all your traffic (K8s, databases, web, etc). This is actually helpful for many cases, as you have just one entry point in your system to protect, vs many.

* Multiple, "decentralized" proxies in multiple datacenters. This is helpful for large organizations with many datacenters all over the world.

* No proxies at all. You can issue certificates with or without Teleport and reach your target clusters directly, as long as they trust the CA. It's a bit harder for managed K8s, but easy to do with self-hosted K8s, SSH, databases, etc. that support mTLS cert auth. This is super helpful for integrations with the larger ecosystem - any system that supports cert auth should work with Teleport out of the box.

* You can have one auth server HA pair managing a single certificate authority.

* You can have multiple, independent auth servers (teleport clusters) with certificate authorities and trust established between them.

* You can use your own CA tooling with Teleport.

The way we think about Teleport is that it's a combination of certificate authority management system, proxies (intercepting traffic and recording sessions) and nodes (for some services, like SSH providing advanced auditing capabilities with BPF).

You can combine those components, or replace them with whatever makes sense.


does the standard deployment use a centralized proxy?

like it's your Basic Architecture in the diagram in your docs. so i feel like i'm being put on.


Sorry you feel that way!

We haven't counted, but my bet is that most smaller deployments just use the single proxy.

I also know that most larger deployments use multi-DC and multi-cluster design with independent CAs for availability and latency.


that answers my question perfectly. thank you!


As someone who is a big fan of Teleport, sorry, I just don't get it.

> Teleport doesn't provide identity provider integrations beyond GitHub (e.g. Okta) in their open source project

Right, and if you're a small team (5-10 people, like you're targeting) you don't really need SSO on the infra layer. It's a nice to have, it's best practice, but the truth is, by the time you really need it (enough engineers that account management is a pain), you typically have the budget for an Enterprise license.

> They have a different architecture that involves deploying a centralized proxy service (whereas Infra verifies credentials at the destination infrastructure vs at a central proxy).

So anyway you need to deploy something central to issue certificates. And anyway, if, to quote you, "We plan to make money by running a managed service version of Infra so teams don’t need to host and upgrade Infra manually.", isn't that the central proxy service? Yet the open-source version avoids it somehow?

> We plan to make money by running a managed service version of Infra so teams don’t need to host and upgrade Infra manually

So you want to sell to teams that a) are too small to afford the license for a product like Teleport Enterprise, b) have enough money that they can afford a premium product above and beyond the free offering provided by their Kubernetes vendor, like https://github.com/kubernetes-sigs/aws-iam-authenticator (for EKS), c) are willing to install and maintain another agent on their cluster (infra), but aren't willing to install and maintain the central proxy point?

> we've designed Infra around an extensible REST API from the start whereas Teleport uses GRPC.

This isn't really important from a product perspective. For what it's worth, Teleport started with a REST API; they moved to gRPC because, if I recall correctly, gRPC helped them scale to support larger infrastructure better.

If you're launching a competing product to Teleport, which is now by far the most mature product in the space, then currently, at least from where I'm sitting, you aren't offering sufficient added value compared to the incumbent offerings, which also include CloudFlare Access, Checkpoint Harmony Connect SASE, Hashicorp Boundary (their offerings aren't quite Kubernetes native, but it's the same idea)...


> you typically have the budget for an enterprise license.

Not all enterprises are the same, and not all companies with more than 100 engineers are ready to dedicate a significant amount of capital to yearly costs for access control, especially when you can "make do" with an open-source solution and spend the cash on a product that is less replaceable or more necessary. I would also add that this is the only open-source solution I've seen that would actually support blanket OIDC integration, and more specifically with Google Workspace etc. Most competitors like Teleport, Cloudflare, etc. have proper OIDC integration for an IdP locked behind a paywall. (Would love to know of any that don't.)

> isn't that the central proxy service?

Teleport offers authentication AND a proxy that will let you connect back to your services. The certificates that get issued for those backend services are usable as long as you can talk to the service, but the proxy acts as an identity-aware proxy locked behind your IdP or whatever authentication you are using with Teleport. From what I can tell, Infra does not offer a proxy to connect you back to your network; you would host it somewhere and expect users to be able to directly route to infra.internal.company and k8s.internal.company.

IMO the fact that they are actually offering a fully open-source product without locking any features behind a paywall makes them worth watching. Obviously they aren't at parity with Teleport, and they don't support SSH or other protocols currently, but I expect they'll have a lot of support in the community.


> Especially when you can "Make do" with an open source solution and spend the cash on a product that is less replaceable or more necessary

Ah, but you're getting to the crux of my (hopefully constructive) criticism. Ultimately the goal here isn't to create a useful open-source project and offer it for free. The goal is to open a business (OP is YC W21). That means having a business model where you a) do expect teams to pay you, and b) the number of teams and the amount of money they are willing to pay, in aggregate, is higher than the costs to develop the product.

If offering SSO as part of the open-source core provides enough value that customers do not need to pay you, then your business will fail. And then the open-source project will, in all likelihood, fail, without commercial backing behind it.

If the revenue plan is to sell a managed SaaS tenant, then the price for that managed SaaS tenant must be competitive with established offerings. Which means that it must be competitive with Teleport's managed offering, Cloudflare Access, cloud vendor tie-ins (e.g. IAM authenticator), etc. This sector has enough offerings that it is competitive and the price is quickly getting commoditized. That is not a good strategy for a startup that is not showing a 10x better product than the competition.


SSO on infrastructure is not a must for everyone but it’s a very nice thing to have. Teleport pricing for small teams doesn’t make sense, it’s more expensive than GitHub enterprise that provides SSO, and Infra is very welcome to provide basic features to everyone and not locked behind a “contact us” price.


hey, thank you for the comments.

We plan to build a managed service to provide a 'centralized experience'. This is where we'd issue certificates/tokens for users & machines. That said, many of our users want to make sure that, should Infra's server go down, their access will continue to work for a configured time interval. This is why we validate credentials on the destination side.

Regarding gRPC, we actually started with that and, based on feedback, added REST API support to work better with users' existing systems.


Or dex (https://dexidp.io/) which seems to connect existing providers to Kubernetes (eg GitHub, LDAP)?


Great question! We looked heavily into Dex before creating Infra, and even spoke with their maintainers.

Dex is a federated OIDC provider. Most managed Kubernetes services (e.g. Azure AKS) don't support using custom OIDC providers for authentication and therefore can't easily be wired up to use Dex. Infra is designed to work with any Kubernetes distribution regardless of where it's hosted.

Even with self-hosted clusters that do support Dex, Dex doesn't manage authorization mappings (i.e. Kubernetes RBAC) for users and groups. Teams still need to manually create & remove RBAC roles for users and groups as they are added and removed from identity providers such as Okta. Infra can be configured to map roles for users and groups to Kubernetes clusters, and we're working to support dynamic provisioning protocols such as SCIM to make sure users are automatically revoked as they are removed from identity providers.
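For context, this is the kind of per-cluster binding that otherwise must be created and removed by hand as users and groups change in the identity provider (the group and binding names here are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: okta-engineering-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: engineering   # must be kept in sync with the Okta group by hand
```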


Congratulations on the launch! Any and all competition in this space is welcome. We've been hoping that something open-source would take off since the pricing for Teleport/StrongDM is a bit out of our league.

With that in mind, can you comment on your strategy for exposing arbitrary infrastructure (i.e. non-kubernetes-native services)? We've been watching Hashicorp Boundary for some time and it seems like they have the right architectural approach.. but the DX is not quite there and the pace of development is slow. It seems like Infra's approach is similar (install an agent on the target infrastructure which communicates with a central API), but it's unclear if this can be used across clusters, vpcs, cloud providers or to connect to non-kubernetes infrastructure.

A few other questions to distinguish it from Boundary:

1. What can infra do to get an organization closer to meeting SOC2 compliance?

2. Is there a plan for a desktop UI?

3. What sort of timeline are we looking at for the custom IdP integration (e.g. Azure OIDC)


Thanks for the note — would love to hear what infrastructure outside of Kubernetes you’d like to connect to, so I can target the answer more specifically.

For expanding to infrastructure outside of Kubernetes, we will ultimately be generating short-lived credentials and distributing them to either human or machine users. This can be done through different means, and we'll select the best method for each kind of infrastructure.

1. With where we want to take Infra, we want to enable teams to build an access system using the principle of least privilege: statically, users have minimal (or no) permissions, and when access is needed, it is provisioned dynamically (with automatic or manual approval depending on configuration). Infra will always be able to show who has access to what at any given time to satisfy audits.

Infra is currently self-hosted, and we will definitely be taking it through compliance audits & penetration tests (including SOC 2) as we mature the project.

2. Yes, we love desktop apps - starting from the early versions of Kite we've built, Kitematic, Docker desktop, and Infra app. We believe good software just works in the background without affecting users' flow.

3. We’ve already tried using both Google Workspace and Azure Active Directory via OIDC. It should work; we’re just not pitching it because we have not done extensive testing with it yet. I should note that Infra is OIDC-compliant, so it should work with any custom identity provider that supports OIDC. (I will asterisk that because one OIDC-compliant provider may differ from another, even from a major vendor.)


Regarding Boundary, it's a great project, but it often requires too much to set up and manage.

For dynamic credentials, you can leverage Vault. For discovery of your infrastructure, you can use Consul.


I wish I had this at my previous job. Our k8s roles were basically everything or unusable. Managing anything fine grained was such a pain that it wasn't worth it.

Just went through the quickstart and that's some top-notch developer experience, great job.


Congrats Jeff, Michael and the whole Infra team!

I'm excited to see this space growing; it definitely needs attention and innovation. I've personally used several of the other systems mentioned in this thread, and I think they all have strengths and weaknesses. I do like the simplicity and thoughtfulness that went into Infra, and I think the team is laser-focused on building a well-thought-out solution with UX and operator overhead in mind.

I've been engaged with the project early on and had access to the team and to the product to try it and provide feedback. I'm very impressed by how quickly the team iterates, their transparency, their overall vision, and how much the product has matured so quickly. Obviously, I'm a huge supporter of them making Infra open source.

They were always open to feedback and took action based on it, which I really appreciate!

Infra today is easy to install, integrate and already works seamlessly across 3 different K8s deployment methods in our sandbox/testing environments. We continue to work closely with Infra and consider expanding the install base soon.

I'm excited about what the team is building and strongly recommend that anyone, regardless of team size, keep an eye on it and try it out.


Love to see projects that focus on the "boring" parts of getting infrastructure up and running. Something like this can save a bundle of time. This is a great project with a great team behind it!


I use Rancher for this purpose alone (it can connect to Okta, Azure, local users...). I might consider using this given its plan to integrate more infrastructure like databases and SSH. For SSH, though, I'm not sure it could easily replace FreeIPA, unless it integrates with it somehow.


Infra's designed to integrate with existing identity & access tooling (vs replace them). While we don't have a built-in integration for FreeIPA, Infra has a REST API and will integrate with any OpenID-compliant identity provider, meaning you shouldn't have to drop or replace your existing tooling to start using Infra.


super interesting


So awesome to see you both at it again!! Cheering on from the bleachers, let me know if there's anything I can do to help.


Thanks Nick! Brings me back to our days together @ Docker :-)


The nostalgia is real, but, even brighter days ahead for you all!


I have spent _years_ trying to build things like this and only recently have made some headway. The problem with most solutions in the space is the inherent SPoF that comes with having them in the access path. As someone that has been woken up in the middle of the night to deal with outages within our access management systems, I _do not_ want the service that is in the path for an incident resolution to be a single point of failure that I then have to debug. Excited to see how this is built, and very excited to see where it goes.


So many teams have told us similar stories: that access is a "tier 1" service (i.e. it can't go down!). Most tools involving a single point of failure end up trading off reliability for security, whereas reliability is of equal (or sometimes even higher) importance to infrastructure & SRE teams. A great book that talks about this is https://sre.google/books/building-secure-reliable-systems/


Can we extend it to our custom IdP? We are building something equivalent to this internally and we don’t use Okta or Google IdP


hey, I'm one of the co-founders of Infra.

Under the hood, we support OIDC, and should be able to support custom IdPs. If you have specific requirements, definitely let me know.


Exciting to see a launch in infra access -- seems like there's a great team behind it, too. Congrats!


What are the benefits of Infra over using Dex to manage OIDC access to Kubernetes?


Many managed Kubernetes services don't work with Dex because their OIDC configuration isn't exposed to users (i.e. https://github.com/dexidp/dex/issues/1268). EKS does support it now, but you'd have to restart the whole control plane.


What configuration is required to set this up for managed kubernetes (AKS/EKS/GKE)? Do you need to make api server configuration changes?


Hey there! No changes to the api server configuration are required. We've designed Infra around this since AKS/EKS/GKE don't expose the ability to edit api server parameters to users.


How does the API server verify the user's token?


Tokens are verified by intercepting API server requests in-cluster against a central root of trust. This is similar to how OpenID tokens from identity providers such as Okta or Active Directory are verified by destination web applications. This works no matter where clusters are hosted (including GKE/AKS/EKS or self-hosted clusters).


How do you avoid configuring the API server to support OIDC?

https://kubernetes.io/docs/reference/access-authn-authz/auth...


Seems like you intercept the request and use an admin service account token then impersonate?


Great question! Most managed Kubernetes services don't support OIDC (and for EKS, which does support custom OIDC providers, it requires restarting the entire control plane to edit the configuration).

Infra runs a lightweight process in-cluster that intercepts requests and verifies them - and yes, this process intercepts requests and then impersonates the correct users and groups.
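For anyone unfamiliar, Kubernetes impersonation works through standard request headers: the proxy authenticates as itself (a service account granted the "impersonate" RBAC verb) and asks the API server to evaluate RBAC as the verified end user. A rough sketch of the header construction (names are illustrative; this is not Infra's actual code):

```python
def impersonation_headers(sa_token: str, user: str, groups: list[str]):
    """Header pairs for a request proxied to the API server as `user`."""
    headers = [
        # The proxy authenticates with its own service-account token...
        ("Authorization", f"Bearer {sa_token}"),
        # ...and asks the API server to evaluate RBAC as the real user.
        ("Impersonate-User", user),
    ]
    # Impersonate-Group may be sent once per group the user belongs to.
    headers += [("Impersonate-Group", g) for g in groups]
    return headers

for name, value in impersonation_headers(
    "sa-token", "jane@example.com", ["engineering"]
):
    print(f"{name}: {value}")
```

kubectl exposes the same mechanism through its --as and --as-group flags.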


Congrats on the launch, Jeff and Michael!


At first I thought this was a relaunch of Infra App[0], because that project has been stale for a year now since they introduced licensing, and it has been superseded by Lens feature-wise anyway.

This looks very interesting, will give it a try.

[0] https://infra.app


Thanks for checking out Infra!

Infra App was Jeff's and my passion project when we started. We still patch it for security and bugs. That said, we've definitely been thinking about how we should maintain it or let the rest of the community take it over if there is such an interest. If anyone wants to chat about that, ping me at michael -at- infrahq.com


Can you login from a "headless" terminal or is a browser required during the workflow?


Yes, if you use Infra's local users, you can sign in from a headless terminal.


The connector.config.skipTLSVerify=true in the quickstart document is a bit scary. A secure setup may be a bit more complex, but your product is not targeting beginners, so a correct TLS configuration shouldn't scare people away.


Thank you for this feedback. We've made several edits to the quickstart to get users started as quickly as possible with proof-of-concept installs, either on a test cluster in the cloud or a local cluster.

For a longer setup: https://infrahq.com/docs/install/install-on-kubernetes

We will definitely better address this in the future. Thank you for pointing this out.


Congrats Jeff and Michael!


Lovely!! Congratulations on the launch! This seems really neat


This is the coolest HN release I've seen in a while!


Congrats on the launch Jeff and Michael! :)


Congrats!



