Show HN: Digger – Open Source Terraform automation and collaboration tool (github.com/diggerhq)
91 points by ujnproduct on July 9, 2023 | 23 comments



One of the major security issues with running terraform in your CI/CD pipeline is that it usually needs admin permissions to your entire cloud environment. To avoid this you need the pipeline to pass parameters to an internal process that actually applies the changes.

Digger makes it sound like it might address this:

> Digger runs terraform natively in your CI. This is: Secure, because cloud access secrets aren't shared with a third-party

From the Github+AWS demo:

> 4. Add environment variables into your Github Action Secrets (cloud keys are a requirement since digger needs to connect to your account for coordinating locks) AWS_ACCESS_KEY_ID & AWS_SECRET_ACCESS_KEY

It sure looks like AWS admin credentials are shared with Github, and also available to anything else in the diggerhq/digger action.


Yeah, weird for them to do that. Managing credentials like that sucks even from an ergonomics standpoint.

In practice, it's pretty normal to use OIDC to authenticate Github Actions to AWS:

https://docs.github.com/en/actions/deployment/security-harde...
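
For readers who haven't set this up, here is a minimal sketch of the IAM role trust policy that the OIDC approach relies on. The account ID, repository name, and function name are placeholders invented for illustration; the provider host, the `sts:AssumeRoleWithWebIdentity` action, and the `aud`/`sub` claim conditions follow the pattern in GitHub's and AWS's documentation.

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, branch: str = "main") -> dict:
    """Build a trust policy that lets one GitHub repo/branch assume an IAM role.

    account_id, repo and branch are illustrative placeholders.
    """
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The OIDC identity provider registered in the AWS account.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Token must be minted for AWS STS...
                        f"{provider}:aud": "sts.amazonaws.com",
                        # ...and come from this exact repo and branch.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = github_oidc_trust_policy("123456789012", "example-org/infra")
    print(json.dumps(policy, indent=2))
```

With a role like this there is no long-lived key stored in GitHub; the workflow needs `id-token: write` permission so it can request the short-lived token.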


Ok yeah, looks like they recently added OIDC support: https://docs.digger.dev/cloud-providers/authenticating-with-...

They should update the main readme to include this under Features, and also call it out in the demo files.


One of the founders here. You could also use OIDC, so there's no need to share keys.

https://docs.digger.dev/cloud-providers/authenticating-with-...


> It sure looks like AWS admin credentials are shared with Github, and also available to anything else in the diggerhq/digger action

I am a co-founder of Terrateam[0], which is a Terraform CI/CD as well. At the end of the day, you need to execute something to do these operations, and having this component open source is important for auditing purposes. For Terrateam, we lean heavily into GitHub Actions, so GitHub is at least managing any secrets and runs. One challenge is that users could pin the Action we publish to a specific version, but we also update it regularly, and getting customers to update is hard.

[0] https://terrateam.io


The only IAM-safe way is to run context-aware Terraform plans so that an environment can never create, read, update, or delete anything outside its scope. For example, an application-centric approach might use an ABAC constraint and temporary credentials (perhaps via OIDC, but most OIDC integrations lack local privilege separation; instance roles are far more secure), and make sure events are bound to the environment they are allowed to execute in.

This does require something that should essentially be embedded in your environment or account vending machine, otherwise it becomes very cumbersome to maintain.
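
One way to picture the ABAC constraint: a policy statement that only allows an action when the resource carries the same tag value as the calling principal. This is a sketch under assumptions, not anyone's production setup. The `environment` tag key, the function name, and the chosen EC2 actions are illustrative; `aws:ResourceTag/...` and `aws:PrincipalTag/...` are standard IAM condition keys.

```python
import json


def abac_statement(actions: list[str], tag_key: str = "environment") -> dict:
    """Allow `actions` only on resources whose tag matches the principal's tag.

    tag_key is a hypothetical tagging convention chosen for illustration.
    """
    return {
        "Effect": "Allow",
        "Action": actions,
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                # IAM substitutes the policy variable with the caller's tag value.
                f"aws:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
            }
        },
    }


if __name__ == "__main__":
    stmt = abac_statement(["ec2:StartInstances", "ec2:StopInstances"])
    print(json.dumps({"Version": "2012-10-17", "Statement": [stmt]}, indent=2))
```

A role tagged `environment=staging` would then be unable to touch resources tagged `environment=prod`, without a separate policy per environment.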


Any CD is going to require some kind of authentication key. To minimize the surface area of a potential leak, create a user in AWS for the tool, only grant it access to the resources needed, and then create a key for that user to place in your CI. You should also enable audit trails in your AWS account so you can monitor for unusual activity.
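
As a rough illustration of "only grant it access to the resources needed", the sketch below builds a policy scoped to a single S3 bucket. The bucket name and function name are hypothetical, and a real deployment would list whatever resources the pipeline actually manages.

```python
import json


def scoped_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy limited to one bucket (bucket name is a placeholder)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Listing is granted on the bucket itself.
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": bucket_arn},
            # Object reads and writes are granted on keys inside it.
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(scoped_bucket_policy("example-deploy-artifacts"), indent=2))
```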


I do something similar with K8s and RBAC. The most common action in a repo is going to be updating a deployment with a new image or resource config, so that's all it can do.

Still need a more permissive role to manage the cluster in other ways but you can isolate that and limit access to its repo.
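
A sketch of what such a narrowly scoped role could look like, assuming the pipeline only needs to read and patch Deployments in one namespace. The role name and namespace are made up; the manifest fields are the standard `rbac.authorization.k8s.io/v1` Role schema, emitted as JSON, which `kubectl apply -f` accepts.

```python
import json


def deploy_only_role(namespace: str, name: str = "ci-deployer") -> dict:
    """Build a Role that can only get and patch Deployments in one namespace.

    namespace and name are illustrative placeholders.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["apps"],         # Deployments live in the apps group
                "resources": ["deployments"],
                "verbs": ["get", "patch"],     # no create, delete, or list
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deploy_only_role("my-app"), indent=2))
```

The role still has to be bound to the pipeline's service account with a RoleBinding, which is omitted here.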


You should create a role, not a user IMO. Also how do you manage that role/user? Via terraform?


I'm surprised nobody has mentioned Atlantis yet. Running bare terraform in CI is a bad idea (to the extent that running an 'expect' script for an interactive tool is a bad idea), and when you consider the impact it can have (both on resources and on escalation) it should be out-of-band anyway.


Atlantis was a great tool back in the day and still works well in most scenarios. The main issue with it is that it also takes on running the jobs (as in, the Terraform binary runs on the same VM as Atlantis itself), which makes it similar to Jenkins and other first-generation CI systems.

Companies that use Atlantis at scale (eg Lyft) felt the need to fork it and use a scalable compute backend instead, eg Temporal. At which point you've basically got a DIY in-house CI.

Our view is that it's best to keep matters separate. The CI part with compute, jobs, logs etc. is a solved problem. What's unsolved for Terraform is the state-aware logic for when and how to run those jobs. It's all about the orchestrator really.


I think in my case (and almost everyone else's case) we'll never go Lyft-scale, but with about 100 AWS accounts (and a bunch of Cloudflare, On-Prem [Compute, Networking], GitHub and other providers) and 300 terraform environments we haven't had the problem you described yet.

To us, CI is about integration while Terraform is about reconciliation. Technically both could be categorised as 'jobs', but by that metric, a CD event is also just a job, and so is a migration for an RDBMS and adding and removing products from inventory. But we don't call them jobs, because their specialisation warrants specialised handling. To be fair, we aren't based in the US so perhaps it's more of a localised thing.


They migrated from Python to Golang

More details here: https://old.reddit.com/r/golang/comments/14rduec/we_rewrote_...



I was initially very interested in Digger, because I need something like Atlantis, but the thought of a web-accessible server with owner-level access to my project seemed scary. Having everything in CI/CD seemed like a great solution. However, when I read the Digger docs, I discovered that it too has a publicly accessible server, which gets auto-deployed when you first run Digger.

1. I don't like the idea of the tool creating resources I didn't explicitly tell it to create

2. I don't like the idea of a public endpoint for someone to pwn and get owner-level access of all my stuff.

It would be nice if the docs explained what the serverless backend thing does (besides the vague comment about handling webhooks), and it would be nice if there was an option that didn't require the public backend even if it means slightly degraded functionality. (github actions can be triggered by PR opened, PR updated, comment created, comment edited, merge to main, and many other things. Seems to me like that should be enough?)

https://docs.digger.dev/readme/how-it-works
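
To make the "triggers should be enough" idea concrete, here is a sketch of a backend-less dispatcher that maps GitHub webhook events to a Terraform command. It is not how Digger works. The function name and the `digger plan` / `digger apply` comment strings are placeholders for illustration; the event names, the `action` values, and the `issue.pull_request` field are from GitHub's webhook payloads.

```python
from typing import Optional


def terraform_command(event_name: str, payload: dict) -> Optional[str]:
    """Return "plan", "apply", or None for a GitHub Actions event (illustrative only)."""
    action = payload.get("action")

    # PR opened or updated with new commits: show a plan.
    if event_name == "pull_request" and action in {"opened", "synchronize", "reopened"}:
        return "plan"

    # Comment created or edited: only react to comments on pull requests.
    if event_name == "issue_comment" and action in {"created", "edited"}:
        if "pull_request" not in payload.get("issue", {}):
            return None
        body = payload.get("comment", {}).get("body", "").strip()
        # Hypothetical comment commands.
        if body == "digger plan":
            return "plan"
        if body == "digger apply":
            return "apply"
        return None

    # Merge to main arrives as a push to the main branch.
    if event_name == "push" and payload.get("ref") == "refs/heads/main":
        return "apply"

    return None
```

This sketch ignores locking, queueing, and ordering between concurrent pull requests.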


All valid points! Thank you!

We were initially completely backend-less; but then it increasingly became apparent that a central orchestrator is unavoidable.

Rationale here: https://diggerdev.notion.site/Why-digger-introduces-an-orche...

In hindsight, it makes sense that literally every single other tool in the space has a central backend that orchestrates jobs. There's a good reason for that.

To address security and access concerns, you can self-host the orchestrator, use OIDC, or both.


I misread it as Dagger, the CI/CD tool (https://dagger.io)


Yeah naming is fun

The most fun thing is - Digger + Dagger could be a great combo! We haven't yet explored properly but in theory it shouldn't be anything different from adding another CI provider; we already support GitHub Actions, Gitlab CI and Azure DevOps


I saw Digger and got excited for a second...

https://en.wikipedia.org/wiki/Digger_(video_game)


I don't know how, but in the seconds it took me to read the title and click the link, my brain went down this crazy hole of a 3D tool that could take some point cloud data or image scans and allow you to "dig" through virtual earth and shape the land or something... oh boy.


One of the main reasons for us to use a terraform collaboration tool is to easily manage state files.

Would be awesome if they find a way to integrate state management.


Thanks!! Great point; for now we're relying on S3+dynamo which many people prefer anyways; but state management is on the roadmap, we'll get to it soon

Tracking here: https://github.com/diggerhq/digger/issues/206

And btw contributions very welcome, we're a small team so every bit helps, even if it's just filing or labeling an issue
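
For context on the S3+dynamo setup: state lives in S3, and a DynamoDB table keyed by `LockID` is used for locking. The sketch below is a simplified in-memory model of that locking idea, assuming the lock is taken by a write that succeeds only if no item with that ID exists. It is not Terraform's or Digger's actual code, and the class and method names are invented for illustration.

```python
class LockHeld(Exception):
    """Raised when another owner already holds the lock."""


class InMemoryLockTable:
    """Toy stand-in for a lock table keyed by LockID (illustrative only)."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def acquire(self, lock_id: str, owner: str) -> None:
        # Models a conditional write: succeed only if the item does not exist.
        if lock_id in self._items:
            raise LockHeld(f"{lock_id} is held by {self._items[lock_id]}")
        self._items[lock_id] = owner

    def release(self, lock_id: str, owner: str) -> None:
        # Only the current holder may remove the lock item.
        if self._items.get(lock_id) == owner:
            del self._items[lock_id]

    def holder(self, lock_id: str):
        return self._items.get(lock_id)


if __name__ == "__main__":
    table = InMemoryLockTable()
    state_key = "example-bucket/prod/terraform.tfstate"  # hypothetical state path
    table.acquire(state_key, "pr-1")
    try:
        table.acquire(state_key, "pr-2")
    except LockHeld as err:
        print("second run blocked:", err)
    table.release(state_key, "pr-1")
```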


> S3+dynamo which many people prefer anyways

Those people have not experienced the :heart_eyes_cat: of GitLab's TF state store, which I find just a bazillion times superior to creating TWO separate AWS resources only for storing a bunch of JSON to make TF work: https://docs.gitlab.com/ee/user/infrastructure/iac/terraform...



