Launch HN: Infra (YC W21) – Open-source access management for Kubernetes (github.com/infrahq)
159 points by jmorgan on May 17, 2022 | 58 comments
Hey HN! We’re Jeff and Michael, and we’re building Infra (https://github.com/infrahq/infra). Infra is a tool for managing access to cloud infrastructure. We’re starting with Kubernetes and have a roadmap to support Postgres, SSH and much more.

Michael and I were the co-founders of Kitematic, an easy way to run Docker on the desktop. We sold the company to Docker and built Docker Desktop while there. After that, we worked on Infra App, a Kubernetes client for Mac, Windows and Linux. Between what users told us and our time at Docker, it became obvious that managing infrastructure access was becoming increasingly painful.

Many larger teams don’t give access to developers, while smaller teams often just grant admin access to everyone. Teams in between either build extensive tooling in-house (e.g. Segment’s Access Service), or they end up spending a lot of time manually onboarding and offboarding team members with the right permissions. We wanted to help teams securely distribute access using the principles of least privilege to their infrastructure systems without managing certificates, keys or integrations with identity providers.

With Infra, access is granted or revoked via an API or CLI, and in the background, Infra takes care of provisioning users & groups with the right permissions no matter where the cluster is hosted (EKS, GKE, AKS, or other managed/self-hosted Kubernetes clusters). When users need access, Infra distributes short-lived credentials that expire after a short period. For larger teams, Infra integrates with identity providers like Okta to automatically give access via existing accounts.

Credentials are signed and verified by a central root of trust with a short time to live, so they are easily revoked or rotated when necessary. Infra doesn’t rely on a single point of failure. Other tools in this space use a centralized proxy to verify credentials, whereas Infra instead verifies them at the destination infrastructure. Access continues to work should Infra’s API or the configured identity provider go down temporarily. For clusters hosted in different regions, this means users won’t suffer from slow connections from being proxied.
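To make the verify-at-destination idea concrete, here is a minimal sketch of a short-lived signed credential that can be checked entirely at the destination, with no call back to a central API. This is illustrative only and not Infra's actual implementation: it uses a shared HMAC key for brevity, whereas a real deployment would use asymmetric signatures so destinations only need the root's public key.

```python
import base64
import hashlib
import hmac
import json
import time

# Root-of-trust key distributed to each destination ahead of time.
# (Illustrative: a real system would use a public/private key pair.)
ROOT_KEY = b"example-root-of-trust-key"

def issue(user: str, ttl_seconds: int = 300) -> str:
    """Issue a short-lived credential signed by the root of trust."""
    claims = {"user": user, "exp": time.time() + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(ROOT_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify(token: str):
    """Verify locally at the destination -- no central server involved."""
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(ROOT_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered, or signed by an untrusted authority
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < time.time():
        return None  # expired: "revocation" is simply not re-issuing
    return claims

token = issue("jeff")
assert verify(token)["user"] == "jeff"
assert verify(token + "x") is None  # a bad signature is rejected locally
```

Because verification needs only the key and a clock, access keeps working even if the issuer is temporarily unreachable, which is the availability property described above.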

Infra is a lightweight service written in Go, uses <100MB memory at rest, and is deployed by default with SQLite.

There are a few existing tools that solve infrastructure access management, but they are sold directly to directors of engineering or security teams and can’t be easily deployed by smaller teams without an expensive sales contract. We set out to build a product that teams of any size can pick up, self-host, deploy and build custom tooling on top of without fretting about a sales conversation.

Our GitHub repo is at https://github.com/infrahq/infra which contains the full product that can be self-hosted via Docker or Kubernetes. We plan to make money by running a managed service version of Infra so teams don’t need to host and upgrade Infra manually. We don’t have pricing for this yet but will charge in a way that scales with usage, so even smaller teams (<10) can use it.

Our team includes an early VMware employee whose work is used in all VMware ESXi installs, an engineer from Hashicorp who was a large contributor to Consul, the original developer evangelist from Datadog, and the engineer who built 1Password’s cloud service, 1Password for teams.

We started building Infra a year ago and have been quietly iterating on it with a few teams of various sizes, ranging from 5 developers to public companies. We’re so happy to be able to share it with you and can’t wait to hear your feedback and thoughts!




I work at a company that provides a SaaS platform for running Kubernetes clusters across edge, private and public clouds, and IAM for cloud infrastructure is a massive issue. My team and I have spent significant amounts of time digging into this issue and connected with Infra over 12 months ago. This is a two-layer problem: providing consistency across clouds for DevOps and platform teams, and ease of use for end users.

Infra is solving this very complex issue by addressing both and I like the approach the team has taken.

Simplifying IAM for platform teams while maintaining native controls such as RBAC across multiple clusters using a single distribution of Kubernetes is crucial. Any layer on top of Kubernetes RBAC makes it impossible to remain open and portable. Solving this across different clouds, be it the hyperscale providers or any on-prem DIY setup, is even more complex; OIDC support is one example.

Further, issues arise when you want to provide developers self-service access to clusters. Current in-market options are limited or require separate tools for separate clouds, AWS IAM for example, or result in further in-house/DIY development.

Can't wait to see where this goes next.


Why should someone use this instead of Vault, Teleport, or other PAM tools? Does it work with managed Kubernetes offerings like AKS/EKS/GKE?


Thanks for the questions! Vault doesn't have a deep integration to generate credentials for Kubernetes, and Infra plugs in to users' tooling (e.g. kubectl and Kubeconfig) to keep credentials up to date automatically.

Infra differs from Teleport in a few ways. Teleport's open-source project doesn't provide identity provider integrations beyond GitHub (e.g. Okta). Teleport also has a different architecture that involves deploying a centralized proxy service, whereas Infra verifies credentials at the destination infrastructure rather than at a central proxy. Further, we've designed Infra around an extensible REST API from the start, whereas Teleport uses gRPC.

Infra does work with managed Kubernetes services like AKS/EKS/GKE! It will also work with self-hosted Kubernetes clusters.


Sasha, CTO@Teleport here. Congrats on the launch!

RE: Teleport design

Teleport does not require a centralized proxy, because it is based on certificate authorities. You can issue a certificate with or without Teleport proxy and access any cluster that trusts that certificate directly.

Because of this design you can have a completely decentralized system, with cold storage for your CA, HSM or any parallel system issuing certificates. There is also no need to revoke your credentials, because your certs are short-lived and bound to the device and cluster, so there is less opportunity for pivot attacks.
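As a concrete illustration of that trust model (illustrative openssl commands, not Teleport's actual tooling): any endpoint holding only the CA certificate can verify a client certificate offline, with no call back to the issuer.

```shell
# 1. Create a throwaway certificate authority (the "root of trust").
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=demo-ca" -days 1

# 2. Issue a short-lived client certificate signed by that CA.
openssl req -newkey rsa:2048 -nodes -keyout client.key -out client.csr \
  -subj "/CN=jane"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out client.crt -days 1

# 3. Verify at the destination using only the CA cert -- issuer not needed.
openssl verify -CAfile ca.crt client.crt   # prints: client.crt: OK
```

Revocation falls out of the short lifetime: instead of maintaining revocation lists, you stop issuing new certificates and the old ones age out.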

RE: GRPC

The first version of Teleport also had an HTTP/JSON REST API, but we migrated to gRPC to support event streaming and to have one type system across multiple languages and service boundaries.

Re: Managed clusters

Teleport supports all CNCF-compatible clusters, including AKS, EKS and GKE out of the box.


Great point on gRPC having better support for event streaming! We originally built Infra with a gRPC API, but many users we spoke to didn't yet have load balancers or ingress controllers that supported the gRPC protocol (e.g. one user had to consider upgrading their AWS Load Balancer controller to put Infra behind it).

We wanted to remove as many hurdles as possible for teams to deploy Infra in their environments. Event streaming will invariably become an important part of the API (e.g. for features like audit logs), and we'll consider GRPC again for internal components of Infra.

RE using Teleport without the proxy, how would a target cluster's Kubernetes API server (e.g. an EKS cluster) verify certificates without Teleport's proxy?


> one user had to consider upgrading their AWS Load Balancer controller to put Infra behind it

Huh?

The AWS load balancer for which gRPC is relevant is their Application Load Balancer (ALB), which would require you to terminate TLS at the ALB and does not support mutual TLS (which is how short-lifetime client certificates work in this case). To the best of my knowledge, you can't pass through a client-key-encrypted gRPC session through an ALB (maybe I'm wrong?).

Typically this requires an NLB, which will treat all TCP traffic (REST and gRPC) the same, so gRPC wouldn't require an upgrade?


Re: GRPC

My bet is that you'd migrate to gRPC eventually as you scale :) I like the simplicity of HTTPS/JSON APIs as well, but it just broke down for us at a certain scale point.

Re: Teleport with EKS

True, CNCF clusters support mTLS out of the box, but EKS hides the endpoint and does not let you provision a CA to trust. You will have to run a Teleport proxy inside the EKS cluster to translate mTLS to EKS IAM auth. However, you don't have to have a centralized proxy; you can just deploy a Teleport proxy agent in each cluster and hide your K8s endpoint.

You also don't have to have a single Teleport proxy to do that.


Thanks! Curious, where did HTTP+JSON break down for you? Was it specifically around audit/event streaming? This would be helpful as we consider building out future updates to Infra, especially considering that tools like Kubernetes have put HTTP+JSON APIs to the test (at least in their user-facing APIs).

Indeed! EKS and others don't allow custom authentication methods or let you use an external CA for the cluster. Running a proxy agent in each cluster makes sense and is similar to how Infra approaches it; I hadn't seen that configuration in your architecture pages!

Have you considered distributing certificates signed by the cluster CA itself (to avoid proxies altogether)? From 1.22 onwards there's a new expirationSeconds field when creating a certificate signing request: https://github.com/kubernetes/enhancements/issues/2784 . I imagine this will be supported by all the hosted Kubernetes services - we've been watching this closely.
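For reference, here is roughly what a request using that field looks like (the object name and request body are placeholders):

```yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: jane-short-lived
spec:
  request: <base64-encoded PEM certificate request>
  signerName: kubernetes.io/kube-apiserver-client
  expirationSeconds: 600   # ask the cluster CA for a 10-minute certificate
  usages:
    - client auth
```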


this looks like a centralized proxy to me: https://goteleport.com/docs/architecture/proxy/

are you saying that because you can have multiple proxies, they aren't centralized? or that at least this is one mode you can use, but the standard one is using a proxy?


Teleport consists of a couple of components:

* Proxy is used to handle SSO, Web UI and intercept traffic for session capture. You can have one proxy per your organization, multiple proxies or, if you don't want to intercept traffic, no proxies at all.

* Auth server is used to issue certificates and send audit logs and session recordings to external systems.

* Nodes (end-system agents) are sometimes helpful, but not required. For example, if you want to capture system calls in your SSH session, you can deploy a node. Or you can use OpenSSH with Teleport if you wish.

Because Teleport is based on certificate authorities, the following deployments are possible:

* One, "centralized" HA pair of proxies intercepting all your traffic (K8s, databases, web, etc). This is actually helpful for many cases, as you have just one entry point in your system to protect, vs many.

* Multiple, "decentralized" proxies in multiple datacenters. This is helpful for large organizations with many datacenters all over the world.

* No proxies at all. You can issue certificates with or without Teleport and reach your target clusters directly, as long as they trust the CA. It's a bit harder for managed K8s, but easy to do with self-hosted K8s, SSH, databases, etc. that support mTLS cert auth. This is super helpful for integrations with the larger ecosystem - any system that supports cert auth should work with Teleport out of the box.

* You can have one auth server HA pair managing a single certificate authority.

* You can have multiple, independent auth servers (teleport clusters) with certificate authorities and trust established between them.

* You can use your own CA tooling with Teleport.

The way we think about Teleport is that it's a combination of certificate authority management system, proxies (intercepting traffic and recording sessions) and nodes (for some services, like SSH providing advanced auditing capabilities with BPF).

You can combine those components, or replace them with whatever makes sense.


does the standard deployment use a centralized proxy?

like it's your Basic Architecture in the diagram in your docs. so i feel like i'm being put on.


Sorry you feel that way!

We haven't counted, but my bet is that most smaller deployments just use the single proxy.

I also know that most larger deployments use multi-DC and multi-cluster design with independent CAs for availability and latency.


that answers my question perfectly. thank you!


As someone who is a big fan of Teleport, sorry, I just don't get it.

> Teleport doesn't provide identity provider integrations beyond GitHub (e.g. Okta) in their open source project

Right, and if you're a small team (5-10 people, like you're targeting) you don't really need SSO on the infra layer. It's a nice to have, it's best practice, but the truth is, by the time you really need it (enough engineers that account management is a pain), you typically have the budget for an Enterprise license.

> They have a different architecture that involves deploying a centralized proxy service (whereas Infra verifies credentials at the destination infrastructure vs at a central proxy).

So anyway you need to deploy something central to issue certificates. And anyway, if, to quote you, "We plan to make money by running a managed service version of Infra so teams don’t need to host and upgrade Infra manually.", isn't that the central proxy service? Yet the open-source version avoids it somehow?

> We plan to make money by running a managed service version of Infra so teams don’t need to host and upgrade Infra manually

So you want to sell to teams that a) are too small to afford the license for a product like Teleport Enterprise, b) have enough money that they can afford a premium product above and beyond the free offering provided by their Kubernetes vendor, like https://github.com/kubernetes-sigs/aws-iam-authenticator (for EKS), c) are willing to install and maintain another agent on their cluster (infra), but aren't willing to install and maintain the central proxy point?

> we've designed Infra around an extensible REST API from the start whereas Teleport uses GRPC.

This isn't really important from a product perspective. For what it's worth, Teleport started with a REST API; they moved to gRPC because, if I recall correctly, gRPC helped them scale to support larger infrastructure better.

If you're launching a competing product to Teleport, which is now by far the most mature product in the space, then currently, at least from where I'm sitting, you aren't offering sufficient added value compared to the incumbent offerings, which also include CloudFlare Access, Checkpoint Harmony Connect SASE, Hashicorp Boundary (their offerings aren't quite Kubernetes native, but it's the same idea)...


> you typically have the budget for an enterprise license.

Not all enterprises are the same, and not all companies with more than 100 engineers are ready to dedicate a significant amount of capital to yearly costs for access control, especially when you can "make do" with an open-source solution and spend the cash on a product that is less replaceable or more necessary. I would also add that this is the only open-source solution I've seen that would actually support blanket OIDC integration, and more specifically with Google Workspace etc. Most competitors like Teleport, Cloudflare, etc. have proper OIDC integration for an IdP locked behind a paywall. (Would love to know of any that don't.)

> isn't that the central proxy service?

Teleport offers authentication AND a proxy that will let you connect back to your services. The certificates that get issued for those backend services are usable as long as you can talk to the service, but the proxy acts as an identity-aware proxy locked behind your IdP or whatever authentication you are using with Teleport. From what I can tell, Infra does not offer a proxy to connect you back to your network; you would host it somewhere and expect users to be able to directly route to infra.internal.company and k8s.internal.company.

IMO the fact that they are actually offering a fully open-source product without locking any features behind a paywall makes them worth watching. Obviously they aren't at parity with Teleport, and they don't support SSH or other protocols currently, but I expect they'll have a lot of support in the community.


> Especially when you can "Make do" with an open source solution and spend the cash on a product that is less replaceable or more necessary

Ah, but you're getting to the crux of my (hopefully constructive) criticism. Ultimately the goal here isn't to create a useful open-source project and offer it for free. The goal is to open a business (OP is YC W21). That means having a business model where you a) do expect teams to pay you, and b) the number of teams and the amount of money they are willing to pay, in aggregate, is higher than the costs to develop the product.

If offering SSO as part of the open-source core provides enough value that customers do not need to pay you, then your business will fail. And then the open-source project will, in all likelihood, fail, without commercial backing behind it.

If the revenue plan is to sell a managed SaaS tenant, then the price for that managed SaaS tenant must be competitive with established offerings. Which means that it must be competitive with Teleport's managed offering, Cloudflare Access, cloud vendor tie-ins (e.g. IAM authenticator), etc. This sector has enough offerings that it is competitive and the price is quickly getting commoditized. That is not a good strategy for a startup that is not showing a 10x better product than the competition.


SSO on infrastructure is not a must for everyone but it’s a very nice thing to have. Teleport pricing for small teams doesn’t make sense, it’s more expensive than GitHub enterprise that provides SSO, and Infra is very welcome to provide basic features to everyone and not locked behind a “contact us” price.


hey, thank you for the comments.

We plan to build a managed service to provide a 'centralized experience'. This is where we'd issue certificates/tokens for users & machines. That said, many of our users want to make sure that, should Infra's server go down, their access will continue to work for a configured time interval. This is why we validate credentials on the destination side.

Regarding gRPC, we actually started with that and, based on feedback, added REST API support to work better with users' existing systems.


Or dex (https://dexidp.io/) which seems to connect existing providers to Kubernetes (eg GitHub, LDAP)?


Great question! We looked heavily into Dex before creating Infra, and even spoke with their maintainers.

Dex is a federated OIDC provider. Most managed Kubernetes services (e.g. Azure AKS) don't support using custom OIDC providers for authentication and therefore can't easily be wired up to use Dex. Infra is designed to work with any Kubernetes distribution regardless of where it's hosted.

Even with self-hosted clusters that do support Dex, Dex doesn't manage authorization mappings (i.e. Kubernetes RBAC) for users and groups. Teams still need to manually create & remove RBAC roles for users and groups as they are added and removed from identity providers such as Okta. Infra can be configured to map roles for users and groups to Kubernetes clusters, and we're working to support dynamic provisioning protocols such as SCIM to make sure users are automatically revoked as they are removed from identity providers.
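For context, this is the kind of per-cluster binding that otherwise must be created and removed by hand as users and groups change in the identity provider (the group and binding names here are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: okta-engineering-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: engineering   # must be kept in sync with the Okta group by hand
```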


Congratulations on the launch! Any and all competition in this space is welcome. We've been hoping that something open-source would take off since the pricing for Teleport/StrongDM is a bit out of our league.

With that in mind, can you comment on your strategy for exposing arbitrary infrastructure (i.e. non-kubernetes-native services)? We've been watching Hashicorp Boundary for some time and it seems like they have the right architectural approach.. but the DX is not quite there and the pace of development is slow. It seems like Infra's approach is similar (install an agent on the target infrastructure which communicates with a central API), but it's unclear if this can be used across clusters, vpcs, cloud providers or to connect to non-kubernetes infrastructure.

A few other questions to distinguish it from Boundary:

1. What can infra do to get an organization closer to meeting SOC2 compliance?

2. Is there a plan for a desktop UI?

3. What sort of timeline are we looking at for the custom IdP integration (e.g. Azure OIDC)


Thanks for the note — would love to hear what infrastructure outside of Kubernetes you’d like to connect to, so I can target the answer more specifically.

For expanding to infrastructure outside of Kubernetes, we will ultimately be generating short-lived credentials and distributing them to either human or machine users. This can be done through different means, and we'll select the best method for each kind of infrastructure.

1. With where we want to take Infra, we want to enable teams to build an access system using the principle of least privilege: statically, users have minimal (or no) permissions, and when access is needed, it is provisioned dynamically (with automatic or manual approval depending on configuration). Infra will always be able to show who has access to what at any given time to satisfy audits.

Infra is currently self-hosted, and we will definitely be taking it through compliance audits & penetration tests (including SOC 2) as we mature the project.

2. Yes, we love desktop apps - starting from the early versions of Kite we've built, Kitematic, Docker desktop, and Infra app. We believe good software just works in the background without affecting users' flow.

3. We’ve already tried using both Google Workspace and Azure Active Directory via OIDC. It should work; we’re just not pitching it because we have not done extensive testing with it yet. I should note that Infra is OIDC-compliant, so it should work with any custom identity provider that supports OIDC. (I will asterisk that because one OIDC-compliant provider may differ from another, even from a major vendor.)


Regarding Boundary, it's a great project, but it often requires too much to set up and manage.

For dynamic credentials, you can leverage Vault. For discovery of your infrastructure, you can use Consul.


I wish I had this at my previous job. Our k8s roles were basically everything or unusable. Managing anything fine grained was such a pain that it wasn't worth it.

Just went through the quickstart and that's some top-notch developer experience, great job.


Congrats Jeff, Michael and the whole Infra team!

I'm excited to see this space growing; it definitely needs attention and innovation. I've personally used several of the other systems mentioned in this thread, and I think they all have strengths and weaknesses. I do like the simplicity and thoughtfulness that went into Infra, and I think the team is laser-focused on building a well-thought-out solution with UX and operator overhead in mind.

I've been engaged with the project early on and had access to the team and to the product to try it and provide feedback. I'm very impressed by how quickly the team iterates, their transparency, their overall vision, and how much the product has matured so quickly. Obviously, I'm a huge supporter of them making Infra open source.

They were always open to feedback and took action based on it, which I really appreciate!

Infra today is easy to install, integrate and already works seamlessly across 3 different K8s deployment methods in our sandbox/testing environments. We continue to work closely with Infra and consider expanding the install base soon.

I'm excited about what the team is building and strongly recommend that anyone, regardless of team size, keep an eye on it and try it out.


Love to see projects that focus on the "boring" parts of getting infrastructure up and running. Something like this can save a bundle of time. This is a great project with a great team behind it!


I use Rancher for this purpose alone (it can connect to Okta, Azure, local users...). I might consider using this given its plan to integrate more infrastructure like databases and SSH. For SSH, though, I'm not sure it could easily replace FreeIPA, unless it integrates with it somehow.


Infra's designed to integrate with existing identity & access tooling (vs replace them). While we don't have a built-in integration for FreeIPA, Infra has a REST API and will integrate with any OpenID-compliant identity provider, meaning you shouldn't have to drop or replace your existing tooling to start using Infra.


super interesting


So awesome to see you both at it again!! Cheering on from the bleachers, let me know if there's anything I can do to help.


Thanks Nick! Brings me back to our days together @ Docker :-)


The nostalgia is real, but, even brighter days ahead for you all!


I have spent _years_ trying to build things like this and only recently have made some headway. The problem with most solutions in the space is the inherent SPoF that comes with having them in the access path. As someone that has been woken up in the middle of the night to deal with outages within our access management systems, I _do not_ want the service that is in the path for an incident resolution to be a single point of failure that I then have to debug. Excited to see how this is built, and very excited to see where it goes.


So many teams have told us similar stories: that access is a "tier 1" service (i.e. it can't go down!). Most tools involving a single point of failure end up trading off reliability for security, whereas reliability is of equal (or sometimes even higher) importance to infrastructure & SRE teams. A great book that talks about this is https://sre.google/books/building-secure-reliable-systems/


Can we extend it to our custom IdP? We are building something equivalent to this internally and we don’t use Okta or Google IdP


hey, I'm one of the co-founders of Infra.

Under the hood, we support OIDC, and should be able to support custom IdPs. If you have specific requirements, definitely let me know.


Exciting to see a launch in infra access -- seems like there's a great team behind it, too. Congrats!


What are the benefits of Infra over using Dex to manage OIDC access to Kubernetes?


Many managed Kubernetes services don't work with Dex because their OIDC configuration isn't exposed to users (i.e. https://github.com/dexidp/dex/issues/1268). EKS does support it now, but you'd have to restart the whole control plane.


What configuration is required to set this up for managed kubernetes (AKS/EKS/GKE)? Do you need to make api server configuration changes?


Hey there! No changes to the api server configuration are required. We've designed Infra around this since AKS/EKS/GKE don't expose the ability to edit api server parameters to users.


How does the API server verify the user's token?


Tokens are verified by intercepting API server requests in-cluster against a central root of trust. This is similar to how OpenID tokens from identity providers such as Okta or Active Directory are verified by destination web applications. This works no matter where clusters are hosted (including GKE/AKS/EKS or self-hosted clusters).


How do you avoid configuring the API server to support OIDC?

https://kubernetes.io/docs/reference/access-authn-authz/auth...


Seems like you intercept the request and use an admin service account token then impersonate?


Great question! Most managed Kubernetes services don't support OIDC (and for EKS, which does support custom OIDC providers, it requires restarting the entire control plane to edit the configuration).

Infra runs a lightweight process in-cluster that intercepts requests and verifies them - and yes, this process intercepts requests and then impersonates the correct users and groups.
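For anyone unfamiliar, Kubernetes impersonation works through standard request headers: the proxy authenticates as itself (a service account granted the "impersonate" RBAC verb) and asks the API server to evaluate RBAC as the verified end user. A rough sketch of the header construction (names are illustrative; this is not Infra's actual code):

```python
def impersonation_headers(sa_token: str, user: str, groups: list[str]):
    """Header pairs for a request proxied to the API server as `user`."""
    headers = [
        # The proxy authenticates with its own service-account token...
        ("Authorization", f"Bearer {sa_token}"),
        # ...and asks the API server to evaluate RBAC as the real user.
        ("Impersonate-User", user),
    ]
    # Impersonate-Group may be sent once per group the user belongs to.
    headers += [("Impersonate-Group", g) for g in groups]
    return headers

for name, value in impersonation_headers(
    "sa-token", "jane@example.com", ["engineering"]
):
    print(f"{name}: {value}")
```

kubectl exposes the same mechanism through its --as and --as-group flags.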


Congrats on the launch, Jeff and Michael!


At first I thought this was a relaunch of Infra App[0], because that project has been stale for a year now since they introduced licensing, and it has been superseded by Lens feature-wise anyway.

This looks very interesting, will give it a try.

[0] https://infra.app


Thanks for checking out Infra!

Infra App was Jeff's and my passion project when we started. We still patch it for security and bugs. That said, we've definitely been thinking about how we should maintain it or let the rest of the community take it over if there is such an interest. If anyone wants to chat about that, ping me at michael -at- infrahq.com


Can you login from a "headless" terminal or is a browser required during the workflow?


Yes, if you use Infra's local users, you can sign in from a headless terminal.


The connector.config.skipTLSVerify=true in the quickstart document is a bit scary. A secure setup may be a bit more complex, but your product is not targeting beginners, so a correct TLS configuration shouldn't scare people away.


Thank you for this feedback. We've made several edits to the quickstart to get users started as quickly as possible with proof-of-concept installs, either on a test cluster in the cloud or a local cluster.

For a longer setup: https://infrahq.com/docs/install/install-on-kubernetes

We will definitely better address this in the future. Thank you for pointing this out.


Congrats Jeff and Michael!


Lovely!! Congratulations on the launch! This seems really neat


This is the coolest HN release I've seen in a while!


Congrats on the launch Jeff and Michael! :)


Congrats!



