Cue is a project originally started by Marcel van Lohuizen who previously was part of BCL (Borg Config Lang) at Google. The main use is to generate config files.
A very interesting development is that Grafana appears to be adopting Cue as a first-class configuration option. See: "Bring new CUE-based config schema system to release-readiness" https://github.com/grafana/grafana/issues/33139
This could mean that Grafana dashboards will eventually be able to two-way sync with a git repo.
----
Other tools with some industry adoption in the "Infrastructure as Code" space include
- Dhall
- Jsonnet (from BCL)
- kustomize
- Helm
- kubecfg
- Tanka
- SkyCfg
- jkcfg
- Krane
- HCL (Terraform)
And two tools that fall into a separate class of enabling "Infrastructure as Software"
re: Grafana (i'm the author of the linked issue) - i'm quite excited, i do think there's a world of possibilities here.
Two-way sync with a git repo is one possible path, and we've talked a lot internally at GL about how to best support it. My sense is that we can do it with relatively little friction and likely will - but if you're just syncing with a git repo, there's still a lot of arbitrary, opaque repo layout decisions that still have to be made (how do you map a filesystem position for a dashboard to a position in Grafana? In a way that places the dashboards next to the systems they're intended to observe? With many teams? With many Grafana instances?) which induce new kinds of friction at scale.
Fortunately - and not mutually exclusively with the above - by building the system for schema in CUE, we've made a composable thing that we can make into larger systems. That's what we're starting to do with Polly: https://github.com/pollypkg/polly
This seems really exciting. I haven't had the chance to use Grafana yet; from the linked issue, am I understanding correctly that you'll be able to serialize dashboards to Cue schema, and hence get all the niceties of a structured representation - versioning, non-visual editing, and reproducibility?
I recall seeing another project on HN which created dashboards out of a yaml description. This seems like a fantastic idea, given that a lot of business panels and dashboard apps can be implemented with a limited set of UI interactions.
Yep, this was the one! It looks great though I hadn't got around to trying it.
Do you think something like Cue makes for a better representation for lowdefy apps than YAML, since it seems to offer better abstraction ability and hence easier to compose?
It's an interesting thought. I'll definitely spend some time to consider how this could work.
Although, writing apps in yaml or json works really well currently. We mostly express logic through Lowdefy operators. We like supporting yaml / json since it is easy to write code to create or update such apps in any language.
That said, it's not like we're planning on replacing all the "export JSON" buttons in the Grafana UI with "export CUE." One of the interesting properties of defining schemas in CUE is how it allows us to remove schema-defined default values from a dashboard's JSON. The JSON representation can actually look a lot more like a concise CUE representation.
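A rough sketch of the default-stripping idea in Python. This is not Grafana's implementation: `strip_defaults`, `apply_defaults`, the panel fields, and the default values are all made up for illustration. The point is only that anything matching a schema default can be dropped on export and restored on load.

```python
def strip_defaults(doc: dict, defaults: dict) -> dict:
    """Return a copy of doc without the keys whose values equal the schema defaults."""
    out = {}
    for key, value in doc.items():
        default = defaults.get(key)
        if isinstance(value, dict) and isinstance(default, dict):
            nested = strip_defaults(value, default)
            if nested:  # keep a nested object only if something non-default remains
                out[key] = nested
        elif key not in defaults or value != default:
            out[key] = value
    return out


def apply_defaults(doc: dict, defaults: dict) -> dict:
    """Inverse of strip_defaults: fill in every default the doc leaves out."""
    out = dict(defaults)
    for key, value in doc.items():
        default = defaults.get(key)
        if isinstance(value, dict) and isinstance(default, dict):
            out[key] = apply_defaults(value, default)
        else:
            out[key] = value
    return out


# Hypothetical schema defaults and a fully expanded panel
DEFAULTS = {"type": "graph", "transparent": False, "gridPos": {"w": 12, "h": 8}}
panel = {"title": "CPU", "type": "graph", "transparent": False, "gridPos": {"w": 24, "h": 8}}

concise = strip_defaults(panel, DEFAULTS)    # only the non-default parts survive
restored = apply_defaults(concise, DEFAULTS)  # round-trips back to the full panel
```

Here `concise` keeps only the title and the non-default grid width, and `restored` is equal to the original `panel`.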
> versioning
Versioning of the Grafana schema is the essential design goal of the "scuemata" system that is under discussion in that epic issue.
> non-visual editing
Like, editing something other than raw code in an editor? Yes, this is also something directly enabled by the schema (again, see the Polly doc, the "Produce" heading). For data-intensive tasks, such editing experiences are the only way to see your logic in the context of data, and therefore IMO a prerequisite for confidence.
> reproducibility
Yup. This already isn't "hard" to do today, but reproducibility gets more complicated at scale - those questions about how to map what's on disk to what's in your Grafana (or whatever app) in my parent comment become more complicated, leading to friction, leading to staleness.
It might sound a bit pedantic, but kustomize strictly avoids the "Infrastructure as Code" space and stays in the "Infrastructure as Data" space. The main difference is that since it just deals with "data", you can build any higher level tooling on this. One of the major proponents of this idea is Brian Grant from Google. He tweets about this from time to time. Here is a recent one: https://twitter.com/bgrant0607/status/1404461906186833927
Is this distinction really about whether the customisation language is declarative? It seems to me that Dhall has the advantages Brian Grant attributes to "Infrastructure as Data", although it is an executable specification.
Thanks for the "why cue" posts. The two key points appear to be inheritance vs. unification and nothing vs. typed. Somehow I'm unable to grok why unification is better than inheritance. Going a bit deeper:
* "Inheritance, is not commutative and idempotent in the general case"
* "A value is always final in CUE, it can only be made more specific."
From an engineering perspective, the latter is definitely more appealing. But I lack well-articulated stories to understand how inheritance falls short, and how graph unification fares better. I wonder if there is somewhere a simple concrete example to contrast the not-idempotent inheritance approach vs. the graph unification approach.
I believe they're discussing commutation and idempotency in the sense of types, rather than the sense of values.
Inheritance allows you to override properties/attributes. If you inherit from 2 classes that both specify the same attribute/property, but with different types for the same attribute, one of them takes precedence and overrides the other. A inherits from B inherits from C is not the same as A inherits from C inherits from B if C says attribute X is a string and B says attribute X is an int.
From my understanding, the equivalent graph unification is invalid. If type A is a unification of type B and C, then B and C cannot disagree about any property: where both define x, the definitions have to be compatible, or the unification fails. It's commutative because A = B & C (A is the unification of B and C) is the same as A = C & B (A is the unification of C and B). If I access A.x, I get the same thing either way. With inheritance, there can be a conflicting B.x and C.x. Which one I end up accessing depends on which one is A's parent.
Inheritance is not idempotent because if A inherits from B inherits from C, then A is implicitly also B and C. However, A can override B's and C's behavior, so I can't trust that calling C.x will always return the same value. It might return the type C has for that attribute, it might return the type B has for that attribute or it might return the type A has for that attribute. You can prevent overriding the types in children, but at that point you've basically built graph unification.
To give a concrete example, Python allows inheritance. If we are provided with these classes and a function that takes a MyCar:
  import datetime
  import time

  class MyCar:
      # Epoch time for when the car was made
      created_at: int

  class MyCarV2(MyCar):
      # Time it was created in RFC3339 format
      created_at: str

  class MyCarV3(MyCarV2):
      # Using an actual datetime object
      created_at: datetime.datetime

  def time_since_created(car: MyCar) -> float:
      # Illustrative body: only works while created_at is still an epoch int
      return time.time() - car.created_at
That function has no idea what the type of car.created_at will be. Mypy will complain at you because it's bad practice, but it's valid inheritance. Even if they all start with the same conceptual time, MyCar.created_at, MyCarV2.created_at and MyCarV3.created_at return different types, despite all supposedly being valid instances of MyCar.
Graph unification forces you to pick a single type for each attribute of a single type. Rather than having 3 types that behave differently, graph unification forces you to condense them into one:
  import datetime
  import typing

  class MyCar:
      created_at: typing.Union[int, str, datetime.datetime]
That time_since_created function now knows exactly what type created_at is. Nothing else can change the type of created_at. If you need to add another possible type you have to either add it to the typing.Union, or create a new class. You can't create a subclass of MyCar with a different type for created_at.
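For the commutativity point, here is a small sketch with plain dicts standing in for configs. It is a simplified model, not CUE's actual algorithm: `override` and `unify` are hypothetical helpers, and the type names are just strings.

```python
def override(base: dict, child: dict) -> dict:
    """Inheritance-style merge: the child silently replaces the base's values."""
    return {**base, **child}


def unify(a: dict, b: dict) -> dict:
    """Unification-style merge: shared keys must agree, otherwise it is an error."""
    out = dict(a)
    for key, value in b.items():
        if key in out and out[key] != value:
            raise ValueError(f"conflict on {key!r}: {out[key]!r} vs {value!r}")
        out[key] = value
    return out


b = {"created_at": "int", "wheels": 4}
c = {"created_at": "str"}
d = {"color": "red"}

# Overriding depends on who comes last, so it is not commutative
assert override(b, c) != override(c, b)

# Unification gives the same answer in either order, and repeating it changes nothing
assert unify(b, d) == unify(d, b)
assert unify(b, b) == b

# Conflicting definitions are rejected instead of one of them silently winning
try:
    unify(b, c)
    conflict_detected = False
except ValueError:
    conflict_detected = True
```

With `override` you have to know the whole chain to know which `created_at` you get; with `unify` the conflict surfaces as an error at the point where the two definitions meet.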
This is very interesting! Working through the docs now and I'm enjoying the schema, and I've come up with similar ideas regarding data validation / generation in the past. It's nice to find a project like this! Thanks!
In most projects data validation becomes problematic. In most cases the schema could be a lot more precisely defined than what a type def offers. This allows for test cases to make sure data fits the model.
We've also been creating a DSL to build web apps. Check out Lowdefy [0] - I'm trying to come up with an "Infrastructure as Code" word for Lowdefy. "UI as config" is the closest fit, but not sure...
If you’ve ever had to wrangle yaml configuration files… do yourself a favor and learn Cue. It’s still young and the website can seem intimidating; but it’s simpler than it looks, and the language is unbelievably powerful. There simply isn’t anything else like it. In my opinion it’s in a league of its own compared to other configuration languages like HCL, Jsonnet, Dhall, Starlark etc. Marcel, the creator of Cue, is basically the godfather of configuration languages - most of the state of the art can be traced back to his work at Google. Despite his deep knowledge of the subject and unparalleled experience, he is modest, pragmatic and responsive to questions and feedback. The momentum behind Cue reminds me of Go in its early days.
I’ve been using Cue for over a year now, using it as the foundation for a new project; and will gladly answer questions about our experience.
I tried dabbling with Cue, but it doesn't seem to solve the problem that I care about, which is that I have a whole bunch of configs that vary only slightly and I want to DRY them up.
For example, for any given application we have several fixed environments--dev, staging, prod--as well as "on demand" environments for things like pull requests or individual developer environments. The configs for these environments are almost the same, but they vary based on a handful of parameters. I want to be able to write a "generic environment" module for each application and then parameterize it accordingly for each environment.
Cue doesn't seem to care much about this problem, but rather it's just trying to make sure your data is type checked. It seems more like an advanced JSONSchema rather than a typed Starlark. I think the latter would be more powerful (albeit Cue's type system is more powerful than an ordinary generic type system with things like range types).
Cue almost has an answer to the DRY problem, but you can't quite emulate functions as far as I can tell (due, I think, to shadowing problems). I wonder what people who are convinced that Cue is the future would say to this? Am I just thinking about the problem wrong?
I'd say all of these problems have answers in CUE.
> I want to be able to write a "generic environment" module for each application and then parameterize it accordingly for each environment.
This is pretty solidly in the target use case range, i'd say - managing variations of the same "object type" over some dimension is a lot of what's targeted by the way that CUE treats directory hierarchies when loading files: https://cuelang.org/docs/concepts/packages/#instances
The main thing you have to consider in designing a layout is that you have to take a compositional approach to how you define individual config instances. That is, you can't start from prod's config, then override a value or two for staging.
If i were to do it - i have not, this is not how i currently use CUE - my first approach would probably be by defining defaults at the "policy" level (per the above link), which effectively allows you to get exactly one "override"-ish behavior.
Lots of possible approaches to this, though.
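One way to picture the "defaults at the policy level" idea above is the Python sketch below. It is only a simplified model of the behaviour described, not CUE code or CUE's exact semantics; `Field`, `unify`, `resolve`, and the replica counts are invented for illustration. A default can be replaced by exactly one concrete value, and two different concrete values are a conflict rather than a second override.

```python
from dataclasses import dataclass
from typing import Optional


def _agree(a: Optional[int], b: Optional[int], what: str) -> Optional[int]:
    """Combine two optional settings; they may not disagree."""
    if a is None:
        return b
    if b is None or a == b:
        return a
    raise ValueError(f"conflicting {what}s: {a} vs {b}")


@dataclass(frozen=True)
class Field:
    """A config field with an optional policy-level default and an optional concrete value."""
    default: Optional[int] = None
    value: Optional[int] = None

    def unify(self, other: "Field") -> "Field":
        # Simplification for this sketch: conflicting defaults are rejected too
        return Field(
            default=_agree(self.default, other.default, "default"),
            value=_agree(self.value, other.value, "value"),
        )

    def resolve(self) -> int:
        # A concrete value wins over the default; that is the one allowed "override"
        if self.value is not None:
            return self.value
        if self.default is not None:
            return self.default
        raise ValueError("field is incomplete: no value and no default")


policy = Field(default=1)                # hypothetical policy: replicas default to 1
dev = policy                             # dev sets nothing and gets the default
staging = policy.unify(Field(value=3))   # staging pins a concrete value

try:
    staging.unify(Field(value=5))        # a second, different value is a conflict
    second_override_allowed = True
except ValueError:
    second_override_allowed = False
```

In this model `dev.resolve()` gives the policy default and `staging.resolve()` gives the pinned value, while trying to pin a different value on top of staging fails.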
> but you can't quite emulate functions as far as I can tell
How would you solve this with directory structure and "function structs" respectively? I'm having trouble wrapping my head around the former and ran into shadowing problems with the latter.
> You can set up a generic config with default values for everything, and then have more specific configs that override the defaults.
I was going to say the same thing - this pattern of having an "override" file in addition to defaults is something I've seen in multiple systems and liked. For example, it's used for JSON configuration in .NET, and with Docker Compose's YAML service configuration files.
I met Marcel van Lohuizen when he was on the board of my previous company. He's a passionate techie and a down-to-earth guy. He was working on cuelang at the time and had not released it yet. After he gave us a presentation on Cue, one of my thoughts was that it is not easy for beginners to grasp, but then the language is not meant for beginners. My second thought, which was completely whack, was that maybe you could use it as an add-on for Protobufs, as the schema definitions in Cuelang have validations built into them, which might remove boilerplate validation code in gRPC services.
I'm one of the contributors. We created a DSL in the language to describe the data and create tests. You can then use that data description to validate against json, csv, avro... One of the neat things we came up with was the concept of a data trace which is like a stack trace but is a path through the data to a particular error.
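To make the "data trace" idea concrete, here is a minimal Python sketch. It is not the project's implementation; the schema format (dicts of types, one-element lists for sequences) and the names `validate` and `format_trace` are assumptions made up for this example. Each error carries the path through the data that leads to it.

```python
def validate(data, schema, path=()):
    """Yield (path, message) pairs; path is the route through the data to each error."""
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            yield path, f"expected object, got {type(data).__name__}"
            return
        for key, sub in schema.items():
            if key not in data:
                yield path + (key,), "missing field"
            else:
                yield from validate(data[key], sub, path + (key,))
    elif isinstance(schema, list):
        if not isinstance(data, list):
            yield path, f"expected list, got {type(data).__name__}"
            return
        for i, item in enumerate(data):
            # A one-element list schema describes every item of the sequence
            yield from validate(item, schema[0], path + (i,))
    elif not isinstance(data, schema):
        yield path, f"expected {schema.__name__}, got {type(data).__name__}"


def format_trace(path) -> str:
    """Render a path such as ('servers', 1, 'port') as 'servers[1].port'."""
    return "".join(f"[{p}]" if isinstance(p, int) else f".{p}" for p in path).lstrip(".")


# Hypothetical schema and data with one bad value buried in a list
schema = {"name": str, "servers": [{"host": str, "port": int}]}
data = {
    "name": "prod",
    "servers": [{"host": "a", "port": 80}, {"host": "b", "port": "443"}],
}

errors = [(format_trace(p), msg) for p, msg in validate(data, schema)]
```

For the sample data, `errors` contains a single entry pointing at `servers[1].port`.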
At this point one might consider using a real language and common software practices for type checking, extending, modularization, testing, etc... Instead of building an ecosystem just to keep Infrastructure as Yaml sane.
My experience with Pulumi and AWS CDK is absolutely brilliant in this regard, hopefully good DevOps/SRE/WhateverNewTerm practices and patterns will resemble good software development practices in the future.
Cue has a unique lattice type system that allows you to refine a property from type->constraint->value, but does not allow you to override an existing value (or change it in any way that conflicts with the existing type/constraints).
In my view, this is the insight and value proposition that sets cue apart from everything else, including general programming languages.
Inheritance + property overriding is the source of most problems in configuration because you can never know if a value is the source of truth.
I think parent means a General Purpose language, i.e. capable of computations.
Personally, on one hand I know allowing computations into configuration immediately destroys any hope of having a tidy, rational schema in real-world projects.
On the other hand though, i do believe configuration and code should be built with related tools, possibly the same tool, or at least tools using the same syntax!
(a bit like the json syntax is the same as a Python dict syntax, except this is the terrible example that is so poorly thought out that does more harm than good)
This unlocks a much greater degree of freedom and power than all the gluing together technologies that we have to do...
If this is the sense in which the comment is intended - CUE is capable of some kinds of computation. It's just not Turing complete.
My 2c - thus far, i've found the language features enabled by this constraint much more useful than the expressiveness lost - at least, for the purposes i've chosen.
It looks quite cool. I think it would be really useful if you have a lot of integrations into different programming languages, frameworks, and maybe even SQL servers.
So you could do data validation on the frontend, backend and the database server based on the same definitions.
It would save us a lot of bugs caused by different opinions of valid data in different layers of software.
Started working on a JS version a few weeks ago [1]. Even with 20% of the features it’s already so useful we’re building systems with it. And not just config - model all the things!
Overrides and inheritance are a world of pain. Unification and commutative operations restore sanity to the actual work of coding with a domain representation language because WYSIWYG. And you get type safety for your domain model.
The project is still at the “Read the Source, Luke” stage so caveat emptor until we get a respectable release out.
Very excited to see someone doing this! Right now, Grafana is [planning on] relying on an anemic CUE->Typescript translator for getting its schema to the frontend - https://github.com/sdboyer/cuetsy. (Somebody also pointed me to Project Cambria recently, which could be an interesting compilation target for what we have https://www.inkandswitch.com/cambria.html)
Being able to work with CUE natively in TS, though, would be a huge gamechanger for what we can do with CUE in Grafana
Our implementation is in TS. However that's just an API. Do you have some ideas on how you'd like the TS type system to work with the Cue type system? It's an open question for us.
Nice, I was looking for a Javascript version. I will check it out. Regarding "Read the Source, Luke" you are in good company with Apple and its Swift ABI :)
Related hobby project - we've been building cueblox (cueblox.com) as a way to create, consume, and validate data in YAML and Markdown using cue. Cue is very powerful, and it's been fun working on this project.
Philosophically they seem very similar, but at a glance[0] it seems like clojure.spec is quite a bit more expressive. Also spec lives in your REPL session and in your source code, while Cue is meant to be used as a CLI, so there is a completely different approach.
CUE is currently centered around its CLI, but AIUI, that's not the long-term goal. i read in some CUE issue somewhere that the goal is shifting towards enabling frameworks rather than driving people to the CLI, though i don't have a link.
Our use of CUE in Grafana is an example of framework-style usage. It is a hard requirement that users never have to install the CUE CLI to perform any of our planned CUE-related tasks; rather, the needed tooling is baked into Go packages we export, and things like grafana-cli. (Avoiding a dependency on the CUE CLI also gives us a defense mechanism against breaking changes)
The Cue integrated Kubernetes project I'm most excited about is KubeVela[0]. Effectively, you can create an "operator" for just the YAML bits to narrow your Kubernetes API and provide best practices via the Components and Trait overrides, and it should allow platform teams to standardize how their teams are deploying software on large Kubernetes installations.
Just dropping this here https://www.w3.org/TR/shacl/ SHACL is a language for defining constraints on data. While it is focused on RDF data, it is also possible to use it with JSON/CSV data using an RML mapper (so far only a PoC on this side).
This looks interesting, and I’d love to know more, for example what is Cue’s approach to validating sequences?
I’d spend more time trying to answer that question myself but this site wants me to read an awful lot of philosophy before showing me any code. I dislike homepages like this. I’d rather you assume I buy into your philosophy, otherwise I’d leave, so you’re free to just show me what the language is like. When you do that, I’m more interested in examples than EBNF.
Brilliant, ta! I guess I was hoping for something more like linear temporal logic perhaps. What we’ve found with lots of validation libraries is that doing “if this then expect that” kind of rules on sequences is quite difficult.
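For reference, the simplest form of an "if this then expect that" rule on a sequence can be sketched in plain Python as below. This says nothing about how Cue handles it; `check_followed_by` and the event names are hypothetical. The rule is "every trigger must eventually be followed by the expected event", and the function reports the positions where that fails.

```python
def check_followed_by(events, trigger, expected):
    """Return the indexes of `trigger` events that are never followed by an `expected` event."""
    pending = []
    for i, event in enumerate(events):
        if event == trigger:
            pending.append(i)
        elif event == expected:
            pending.clear()  # every earlier trigger is now satisfied
    return pending


# Hypothetical event log: the second "open" is never closed
log = ["open", "write", "close", "open", "write"]
violations = check_followed_by(log, trigger="open", expected="close")
```

Here `violations` holds the index of the unclosed "open". Richer temporal rules (ordering constraints, "until", bounded windows) need more state than this, which is presumably where the difficulty described above comes from.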
Are there any plans to support model generation in the future, as can be done with JSON Schema through something like https://github.com/quicktype/quicktype ?
See the Kubernetes examples at: https://cuelang.org/docs/tutorials/
Here are two posts discussing the motivations for Cue over BCL/Jsonnet:
- https://github.com/cuelang/cue/issues/33#issuecomment-483615...
- https://github.com/cuelang/cue/discussions/669
And two tools that fall into a separate class of enabling "Infrastructure as Software"
- Pulumi (TypeScript/Go/Python/.NET)
- CDK