Hacker News
Dhall: a programmable configuration language that is not Turing-complete (github.com/dhall-lang)
157 points by happy-go-lucky on Sept 6, 2017 | 61 comments



Every piece of configuration documentation is full of vaguely defined or undefined combinations of options and "this-flag-cannot-be-used-together-with-that-flag" clauses. Often these things are not even documented but are implicit.

e.g. in systemd service definitions, depending on your service "Type" the meaning and the validity of the other properties change (e.g. Type=oneshot is the only Type that allows ExecStart to be omitted or given more than once).

Statically typed languages allow you to create very elegant configuration DSLs so the user cannot create invalid programs (where invalid can mean whatever the DSL creator wants). But most software is written to read shitty (untyped) configuration languages.

I always wished config files were not a soup of magic keywords but typed data structures (talking to you, every YAML configuration file ever). It looks like Dhall is just a statically typed, functional language that lets you create nice typed DSLs to generate always-valid config files. It's at least worth a closer look.
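
To make the idea concrete, here's a minimal sketch in Dhall (field names invented for illustration; +80 is a Natural literal in Dhall's syntax of the time). The type annotation acts as a schema, so a missing field or a wrongly typed value is rejected before the consuming program ever sees it:

      { host = "example.com", port = +80, gzip = True }
    : { host : Text, port : Natural, gzip : Bool }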


Talking about statically typed languages and configuration parameters, something coming from the Ada/SparkAda world is the use of type predicates and invariants, in addition to expressive static typing.

Here's an interesting paper on the subject: http://acg-solutions.fr/acg/wp-content/uploads/2017/04/PDI-5...

And here's some demo code: https://github.com/AdaCore/SPARK_PDI_Demo/blob/master/README...

The other interesting part is that you can then prove (or use static analysis to get a high level of confidence) that your application will never be run with inconsistent parameters. Another nice thing is that, with the help of the prover, you can let the internal constraints of your implementation surface up to the application parameters. Say you didn't handle some corner case: you can either implement it, or use a predicate to ensure this case is never run. Your pre-conditions and the type invariants/predicates of your parameters are part of your interface.


Subtraction:

   let pred = λ(n : Natural) → (Natural/fold n { prev : Natural, next : Natural } (λ(p : { prev : Natural, next : Natural }) → { prev = p.next, next = p.next + +1 }) { prev = +0, next = +0 }).prev
   in λ(x : Natural) → λ(y : Natural) → Natural/fold y Natural pred x
The Ackermann function:

   let iter = λ(f : Natural → Natural) → λ(n : Natural) → Natural/fold n Natural f (f +1)
   in λ(m : Natural) → Natural/fold m (Natural → Natural) iter (λ(n : Natural) → n + +1)
Save the above to 'ack' and run the following: $ dhall <<< './ack 10 10' (it's going to take a while before it finishes).

This language is close to Turing complete.


Wait, so you can write programs in this thing that aren't primitive recursive? That's quite interesting; requiring primitive recursiveness is the usual way to guarantee termination. What class of programs does this language allow you to write? (It of course can't allow all programs that terminate, unless it also allows some that don't, which it claims not to.)


The type system appears to be System Fω[1], the “higher-order polymorphic lambda calculus”. I’m not really familiar with what class of programs it accepts, other than “more than simply typed lambda calculus”, but typechecking is decidable and evaluation is strongly normalising (i.e. both always terminate).

[1]: https://en.wikipedia.org/wiki/System_F#System_F.CF.89


Can somebody comment on what is missing for System F(ω) to be Turing complete?

I recall that it can typecheck self-application, so is it not possible to define and use the fixed-point combinator?


System Fw cannot type-check self application

The type system is what prevents System Fw from being Turing complete. Specifically, the key bit is that you cannot unify the type "a -> b" with "a" (which is what prevents self application)


System F can check self-application with an explicit type annotation, something like λx:(∀a. a → a). x [∀a. a → a] x, but Fω can’t.


I thought Fω was an extension of System F with type operators, so I figured if System F can check self-application, System Fω can too.

Am I missing something else?


Sorry, that comment was worded badly. Both System F and System Fω can express self-application at a particular type, but they can’t give a general type for self-application, e.g., Ω = ω ω where ω = λx.xx.
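
As a concrete illustration (a rough sketch; I believe it type checks but haven't verified it): Dhall's type system is in this family, and "type application" in Dhall is just ordinary application of a polymorphic value to a type, so the self-application above at a particular type would look like:

    λ(x : ∀(a : Type) → a → a) → x (∀(a : Type) → a → a) x

What still cannot be given a type is the annotation-free ω = λx. x x, because, as noted above, "a" cannot be unified with "a → b".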


I see, thanks for the clarification!


It's only guaranteed to terminate. That doesn't mean it's guaranteed to terminate before the heat death of the universe.


Can you limit evaluation to N reductions?


Author here: you could in principle, but in practice neither the library nor the executable compiler provides out-of-the-box support for limiting the number of reductions.

Dhall's basic approach to safety is:

* make it impossible to intentionally do something malicious (such as destroying something important)

* make it hard to unintentionally do something incorrect or inconvenient (such as running for a long time)

For example, you can intentionally write a program that takes a very long time to run if you want to, but you will almost certainly know you are doing so; it's hard to write such a program by accident. In general, programming in a non-Turing-complete language forces you to structure code in such a way that potentially long-running code is more "obvious" than in Turing-complete languages.


Perhaps I should've added a :P to the end of that. See Gabriel's response here: https://news.ycombinator.com/item?id=15187733


For what it's worth, Dhall lets you check if an expression is already in normal form: http://hackage.haskell.org/package/dhall-1.6.0/docs/Dhall-Co... so one could use Dhall as a "data-only" language without bothering with evaluation.


Yes? Do you foresee this being a typical thing to write in configuration files?

If we're being pedantic, as you were in your comment, "close to Turing complete" is not really a very precise term, is it?

When you have an issue, at least be explicit about it rather than just snarky. (See also Gabriel's other comments in this thread; not that I know or have any affiliation with him.)


Maybe I'm just not the target audience, but I found this intriguing yet confusing. It might benefit from an explanation of typical use cases -- what do people do now that Dhall would be a better tool for?

I read the tagline: "Dhall is a programmable configuration language that is not Turing-complete," so I figured "programmable configuration language" was a term I just wasn't familiar with, and it might bring up some typical use cases if I searched for it -- but it seems like that term was invented for Dhall.

So maybe the kind of example I'm looking for is: if you're now using a non-programmable configuration language, here's what that would look like, and here's the improvement if you switch to Dhall. Or if you're now using a programmable configuration language that is Turing complete, here's what that would look like, and here's the improvement if you switch to Dhall.

(A more specific question: in what kind of scenarios is JSON+arbitrary terminating functions useful?)


Author here: there is a reason why "programmable configuration language" is a fairly new term. Most of the time, when people want a programmable configuration they use whatever language they are already programming in. So, for example, if you are writing a program in Scala you might write the program's configuration in Scala, too. For example, this is how Scala's SBT works: the SBT configuration is written in Scala

However, there are some down-sides with writing your configuration within a language like Scala (or any other mainstream language). For example, your program "configuration" can hang, crash, throw exceptions, or destroy your system if it is running with sufficiently elevated privileges. Dhall has none of those problems.

Similarly, you can go overboard and use excessive abstraction in other languages, which is inappropriate for a configuration file (which needs to be easy to read and understand). Dhall also avoids this issue by supporting automatic simplification of configuration files to remove abstraction, imports, and indirection (which is made possible by the fact that Dhall is not Turing complete)

I'll try to rework the documentation to answer this sort of question better and also incorporate some of the other excellent replies to your question


I use Dhall extensively now. In fact, all my TypeScript programs for my current employer use JSON configuration that Dhall generates (and I've tried so many times to get this to the front page on YC, I guess I'm just not good at timing).

Dhall's draw as a configuration language is that it allows you to express logic in your configuration and even safely pull in data from remote services and files. People have a gut negative reaction to this phrase even as they essentially encode non-trivial logic directly into their Chef, CircleCI, Docker-Compose and Kubernetes configs, so it's hardly a new idea.

Dhall lets you write logic into your configs that otherwise would be implicit. As an example, you might define a new configuration environment with Dhall and use the types to make sure you NEVER provide production credentials, or even shared credentials.

Dhall also allows you to safely call out to web services (with authentication and SSL, of course) to fill in holes or even provide non-trivial logic (it's perfectly valid to move functions around in dhall over the wire). I use this feature to map a tiny microservice that resolves Hashicorp Vault calls to AWS credentials just-in-time as I ship, and I feel very confident that it's safe because I can encode the sorts of permissions and their resolution directly into the types I'm handling.

You could start using Dhall instead of JSON today for configs, and the main advantage is that you can assert values have specific types (and also provide sane loops over list fields).

Dhall is actually a pretty amazing invention and I keep trying to work up the guts and time to write a TypeScript interpreter for it.


You've also failed to link to it here.


Interesting, can you share the link you were submitting? I looked through your submissions but didn't see it.


I agree with you there, but for those of you already reading this, here's what it means (what I think it means, anyway ;-):

Dhall is like a templating language, but statically typed, and it will always terminate. Typing is good because your configuration will always have correct structure -- the language itself allows you to verify this; and guaranteed termination is good because you don't have to worry about your configuration accidentally looping forever.

JSON, YAML, .ini, etc. are non-programmable configuration languages, and, I guess, m4 would be an example of a Turing-complete one.


For starters, consider any kind of configuration that might get really repetitive. Say you're running a web server that's serving several subdomains, but almost all of them are serving static files from a pretty standard directory hierarchy. You could write out each server configuration individually, like

  { "servers":
    [ { "domain": "foo.example.com", "root": "/srv/example/foo", "port": 80, "gzip": true, ...},
      { "domain": "bar.example.com", "root": "/srv/example/bar", "port": 80, "gzip": true, ...},
      { "domain": "baz.mydomain.com", "root": "/srv/mydomain/baz", "port": 80, "gzip": true, ...},
    ]
  }
but in a proper programming language, that'd be kind of a code smell: it's repetitive, a bunch of fields are shared between each entry, &c &c. Well, a configuration language that adds functions would let you abstract over that repetition. (This isn't Dhall syntax, because I don't know Dhall, but rather a kind of strawman syntax.)

  let typical_server(domain, subdomain) =
    { "domain": concat(subdomain, ".", domain, ".com"),
      "root": concat("/srv/", domain, "/", subdomain),
      "port" 80,
      "gzip": true,
      ...
    }
  in { "servers": [ typical_server("example", "foo"),
                    typical_server("example", "bar"),
                    typical_server("mydomain", "baz"),
                    ...
                  ] }
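
(As an aside, here's a rough sketch of how that strawman might look in actual Dhall, reusing the invented field names and the Text-concatenation operator ++ seen elsewhere in this thread; don't treat it as canonical Dhall style:)

      let typicalServer =
            λ(domain : Text)
        →   λ(subdomain : Text)
        →   { domain = subdomain ++ "." ++ domain ++ ".com"
            , root   = "/srv/" ++ domain ++ "/" ++ subdomain
            , port   = +80
            , gzip   = True
            }
  in  { servers =
          [ typicalServer "example"  "foo"
          , typicalServer "example"  "bar"
          , typicalServer "mydomain" "baz"
          ]
      }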
More than that, though, it would also let you include values in your configuration fields that are themselves functions. For example, say I'm writing an email client, and I want users to be able to define labels that allow automatic sorting of emails based on various criteria: does it contain this substring, is it from this address, does it have an attachment, &c.

  { "labels": [ { "name": "from_joe",
                  "criteria": { "email_address": "joe@example.com" }
              ] }
But this only works with simple tests. Let's say a user wants to apply a label if an email is from this address or from that address or contains certain text in the body as well as a particular attachment: I would have to add all kinds of complicated edge cases to this JSON configuration, or just throw my hands up and say that the user can't do that.

But with JSON + arbitrary terminating functions, I can have my cake and eat it too, because now I can include functions as values in my configuration. A user could write something like (although again, pseudocode and not Dhall):

  { "labels": [ { "name": "from_joe",
                  "criteria": fun(email) => email.addr == "joe@example.com"
                                || email.addr == "joseph@example.com"
                                || (email.body.contains("joseph") && is_key(email.attachments[0])) }
               ] }
That doesn't mean you'd necessarily want this for every kind of configuration, of course! But features like this (the ability to use program-like abstraction in your configuration files, as well as the ability to express arbitrary terminating functions as values) are what make a language like this worth considering.


What you're describing here is remarkably similar to Jsonnet in syntax and style. Jsonnet is quite nice, imho!

http://jsonnet.org/

https://github.com/google/jsonnet


It's less about "programmable", it's more about "Turing-complete". Why this is important - see my other comment here for details: https://news.ycombinator.com/item?id=15190110. TL;DR: langsec.


More commonly called a DSL. See: Varnish's configuration language, which is also not Turing-complete: https://varnish-cache.org/docs/3.0/tutorial/vcl.html


I understand that Turing completeness is not always a desirable quality for your programming language of choice, but since it is a rather obscure thing for those of us who haven't studied CS: when is a non-Turing-complete language a favorable choice compared to a more traditional Turing-complete one?


Here's a practical example of why it's beneficial to not be Turing complete. In Dhall, you can normalize programs, even if they are functions. For example, the interpreter can automatically simplify this Dhall function:

        let replicate = https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/List/replicate

    in  let exclaim = λ(t : Text) → t ++ "!"

    in  λ(x : Text) → replicate +3 Text (exclaim x)
... to this one:

    λ(x : Text) → [x ++ "!", x ++ "!", x ++ "!"]
... even though we haven't applied the function to any arguments yet. You can't perform this sort of simplification (in general) if the language is Turing-complete

This sort of automatic simplification comes in handy a lot when authoring configuration files. For example, if somebody objects to the import of the remote `replicate` function, you can just simplify the file and (voila!) all the imports are gone because they've all been inlined and reduced. Similarly, if somebody objects to excessive use of abstraction and functions you can similarly simplify them to remove all indirection
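
(Concretely, that just means running the file through the interpreter in the same style as the 'ack' example earlier in the thread, e.g. something like $ dhall <<< './servers' > servers-resolved, with made-up file names here; the output should be the fully normalized, import-free expression.)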


This is inlining a function and unrolling a loop. It's done in many languages, no?

More generally, many automatic refactorings are not entirely safe but we do them anyway, assuming that the code will terminate (since intentional infinite loops are rare) and relying on tests and human reviewers to check for mistakes.

Depending on the program, substituting equals for equals might have to be rolled back for performance reasons; computing a mathematically equal value is no guarantee.


> assuming that the code will terminate (since intentional infinite loops are rare) and relying on tests and human reviewers to check for mistakes.

This is a pretty labor-intensive, mechanical guarantee. If I had to run config changes through code review (like SBT configs, for example), then that slows down velocity considerably. Having a language for your config offer these guarantees means that any committer can safely make edits, without having to go through process.


In a Turing complete language you can't safely inline all functions to completion without risking an infinite loop. In such a language there is no decidable way to know when to stop inlining things


It doesn't matter in practice, because we don't need to inline every function.


The key word in "automatic simplification" is "automatic". The feature loses value if a human has to intervene to specify which functions to inline or to continue inlining. Imagine how worthless `go fmt` would be if it prompted the user to confirm every change to the source code


Compilers like LLVM and GCC use heuristics not human intervention. For inlining a common heuristic is the size of the function. So we inline (even recursive functions) until the function becomes larger than a certain threshold. The threshold can be specified by a human (-finline-limit), but I believe that is rarely done.


This isn't compiling though; it's normalising. A heuristic like function size is useful when optimising a Turing-complete language since (a) we have to rely on some heuristic and (b) smaller binaries are generally more efficient, all else being equal, so we should avoid a size blow up regardless of what we're optimising for (speed, size, memory, etc.).

In the case of normalising a non-Turing-complete language, we (a) don't need any heuristics, beta-reduction is a complete and correct strategy and (b) things like the size of a function are useless at telling us whether we've reached a normal form. In fact, I would imagine that normal forms of real Dhall programs are generally much bigger than the programs themselves, since one of the main reasons to use a language like Dhall is to reduce repetition. Also, your heuristic is heavily dependent on the evaluation order: if we have a program like this:

    (\x -> (\y -> x)) small-thing (duplicate 1000000 big-thing)
Then an evaluation strategy like call-by-name will never look at big-thing, since it evaluates the functions first and they discard it:

    (\x -> (\y -> x)) small-thing (duplicate 1000000 big-thing)

    (\y -> small-thing) (duplicate 1000000 big-thing)

    small-thing

    small-value
On the other hand, an evaluation strategy like call-by-value will evaluate big-thing, resulting in some arbitrarily large value (which may cause your heuristic to halt); then it will create 1000000 duplicates of that value (again, causing a size-based heuristic to halt); then finally it will evaluate the functions and discard the big, duplicate expression:

    (\x -> (\y -> x)) small-thing (duplicate 1000000 big-thing)

    (\x -> (\y -> x)) small-value (duplicate 1000000 big-thing)

    (\x -> (\y -> x)) small-value (duplicate 1000000 big-value)

    (\x -> (\y -> x)) small-value [big-value, big-value, ...]

    (\y -> small-value) [big-value, big-value, ...]

    small-value


You do if you want to reach a normal form. Note that the author isn't talking about compiling or optimising, they're talking about normalising. I don't know of a Turing-complete language where we can ask to normalise some arbitrary expression including going 'under a lambda' (i.e. performing partial application, etc. inside function invocations; what you refer to as 'inlining'). The closest I can think of is expanding Lisp macros, but they're already a "compile time" transformation so I'm not sure how comparable they are (although of course Lisp can interleave compilation with evaluation!)


The place where using a non-Turing complete language actually matters is in constructive proofs where it's common to write a function without running it. This can be used to show that a value can be constructed in principle, even if it's not a practical program because it would take billions of years.

But practical programming is about writing programs that we actually want to run. The distinction between "takes far too long" and "hangs forever" is unimportant because real-world tasks have at least an informal deadline: how long the user is willing to wait. And most performance testing is done by actually running programs, not via static analysis alone, because we want to know how fast the code is on real-world hardware.

This particular language is something you'd use to generate large, repetitive configurations. It makes sense for that use case that you'd want to make sure all macros can be expanded. But you don't have to prove this statically, because you can actually run the program and look at what it generates. Doing config file generation using a Turing-complete language would also work fine; if you accidentally create an infinite loop (or just very slow code), you can hit control-C and fix the bug.


For the purposes of evaluation, it's still an improvement to have a total language. As I mentioned in another thread, using a totality checker is like wearing seatbelts: it doesn't protect against everything, but it's still a huge improvement.


Isn't this a result of referential transparency, not Turing-incompleteness?


Both. In a Turing complete language there are programs without normal forms. For example, in the untyped lambda calculus you will loop forever if you try to normalize the following expression:

    (\x -> x x) (\x -> x x)


You would always prefer to not have Turing completeness in your language, all other things being equal. Turing completeness isn't ever necessary to write the logic you want to write.

But all other things are not equal. There is a cost to giving up Turing completeness. The language must now ensure additional properties about the code, mostly related to termination and productivity. The two strategies for this are reducing functionality offered by the language until you can prove all programs are good and introducing tools into the language to write compiler-verified proofs about your code.

Dhall is an example of the former approach. It removes enough pieces that it can only express computations that terminate.

Language like Idris, Agda, and Coq are examples of the latter. They provide rich systems that allow you to express proofs of various properties as needed to avoid Turing completeness.

Both categories of language are currently difficult to use for general purposes software. Turing completeness is a sacrifice made to ease writing software at the cost of making certain bugs possible.

At the moment, the best times to use a non-Turing complete language are when you have a problem domain where you can be satisfied with a reduced power language, or when you want to go all the way in the opposite direction and write machine-checked proofs about your software.

As we study these things more, we find ways of easing the burden of using languages that aren't Turing complete. Maybe in the far future, systems will be good enough that most software can comfortably be written in languages that aren't Turing complete. It's a worthy aspiration.

It's not now, though.


Security.

A lot of security vulnerabilities in programs essentially boil down to your code being an ad-hoc parser for an (often implicit) language that's more powerful than you wanted. Turing-completeness is especially dangerous, and you want to avoid it if you don't actually need it.

This is a very insightful way of looking at things, and is a research area called langsec.

See:

- http://langsec.org/

- Papers and public talks of Meredith L. Patterson

- https://www.gwern.net/Turing-complete for some examples of unexpected Turing-completeness


> Turing-completeness is especially dangerous, and you want to avoid it if you don't actually need it.

Can I ask why Turing-completeness is "especially dangerous"?

Let's say we have a pure language for e.g. generating a JSON value (nested lists/maps of strings, floats, nulls and booleans). It has no primitives other than null, true/false, string, float, list, map and function literals, and corresponding "arithmetic" functions for each type.

What would the safety and security differences be between such a language being Turing-complete and not being Turing-complete?

Programs in these languages can't "do" anything, other than calculate. The only problem I can think of is making denial-of-service easier, by e.g. generating an infinitely large list and running out of memory:

    (\x -> [ null ] ++ (x x)) (\x -> [ null ] ++ (x x))
This will generate `[ null, null, null, ... ]` until it runs out of memory. However, we could still blow the memory without Turing-completeness, e.g. by growing an exponentially large datastructure for a large number of steps (basically like a zip bomb):

    (\f xs -> case xs of
                   nil         -> [ null ]
                   (cons y ys) -> (f f ys) ++ (f f ys))
    (\f xs -> case xs of
                   nil         -> [ null ]
                   (cons y ys) -> (f f ys) ++ (f f ys))
    [ null, null, null, null, null, ...and so on 1000 times ]
This recursion is well-founded, and structurally-decreasing since we only call `f` with a strictly-decreasing value of `xs`, until we hit the base-case for `xs = nil` where we return `[ null ]`. In the non-base-case we recurse twice to generate two identical lists, then append them together. This results in a list with 2^1000 elements, easily consuming all available memory.

Hence for security, we need to kill programs which go over some predefined time/memory bounds regardless of whether or not they're Turing-complete. Hence my question about whether or not Turing-completeness is "especially dangerous", in comparison to e.g. I/O primitives.


The latter example you gave will still not type-check in Dhall, although I'll still answer the spirit of your question

You can write expressions in Dhall that will take a long time to evaluate (see the Ackermann example elsewhere in this thread), but it's harder to do so by accident. Totality checking is worthwhile for the same reason that seatbelts are worthwhile: they don't protect against all accidents, but they do protect against a lot of them


> Totality checking is worthwhile for the same reason that seatbelts are worthwhile: they don't protect against all accidents, but they do protect against a lot of them

Don't get me wrong, I'm all in favour of totality! My question was mostly about the "especially dangerous" wording. In my opinion, the major advantage of using Dhall for config instead of say, Scheme, is that it's pure/referentially-transparent/has-no-IO/etc., which everyone seems to be ignoring in favour of debating whether or not totality is desirable

(Of course, Dhall does have "I" (but not "O") in the form of imports, but these are "morally pure" as long as URL contents are stable, similar to Nix's imports being "morally pure" as long as builds are deterministic).

> The latter example you gave will still not type-check in Dhall

Incidentally, how does Dhall check for totality? I know it's based on the calculus of constructions; does it use syntactic guardedness checks like Coq?

Whilst I like totality, I'm still undecided about whether it's worth enforcing in general-purpose languages (e.g. Agda, rather than Dhall), since totality checking is still quite restrictive. The usability balance for static typing was crossed by Hindley-Milner, but I'm yet to see such an unobtrusive approach to totality checking (whenever my Coq or Agda projects become more than toys, they tend to get wrapped up in a monadic `Delay` to avoid fighting things like syntactic guardedness!)


Dhall does the dumbest thing possible to enforce totality: Dhall doesn't support recursion! Note that Dhall does support lists and folds on lists (which are total), but not user-defined recursion

For example, if you write:

    let x = x in x
... then the second `x` will be flagged as an unbound variable because `let`-bound variables are not in scope for their own definitions. Similarly, there is no `let rec` in Dhall
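
The built-in folds are what you reach for instead. For example, a quick sketch: summing a list with List/fold rather than a hand-written recursive function:

    List/fold Natural [+1, +2, +3] Natural (λ(x : Natural) → λ(acc : Natural) → x + acc) +0

... which normalizes to +6.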

That implies that if you want a user-defined recursive type that was not a list then you have to explicitly Boehm-Berarducci-encode the recursive type and its inhabitants. For example, if you wanted to encode the following Haskell data type in your configuration:

    data Tree = Node Integer | Branch Tree Tree

    example :: Tree
    example = Branch (Node 1) (Branch (Node 2) (Node 3))
... you would encode it like this:

        λ(Tree : Type)
    →   λ(Branch : Tree → Tree → Tree)
    →   λ(Node : Integer → Tree)
    →   Branch (Node 1) (Branch (Node 2) (Node 3))
You can think of Boehm-Berarducci encoding as "anonymous recursion" when you write it this way because every inhabitant of such a "recursive" type includes the datatype definition in the value (if you view the lambda-bound type and constructors as the datatype definition)
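
To see the "anonymous recursion" in action, here's a rough sketch of consuming the encoded value by supplying a result type and handlers, in this case counting the leaves (normalizing it should yield +3):

        let example =
                λ(Tree : Type)
            →   λ(Branch : Tree → Tree → Tree)
            →   λ(Node : Integer → Tree)
            →   Branch (Node 1) (Branch (Node 2) (Node 3))
    in  example Natural (λ(l : Natural) → λ(r : Natural) → l + r) (λ(n : Integer) → +1)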

Or to put it another way, if you ignore Dhall's support for builtin types, builtin functions, and imports then Dhall is literally the exact same thing as System Fw . You're used to thinking of System Fw or the calculus of constructions as underlying core languages for a higher-level language semantically defined in terms of the core language like Coq, Agda or Haskell. However, in Dhall there is no underlying core language: you are assembling raw System Fw expressions. Since the most pedantic definition of System Fw requires explicit polymorphism and doesn't support recursion then Dhall behaves the same way, too.

I agree, though, that the lack of (unsafe) IO is an even bigger deal. Like you mentioned, if you use immutable URLs (such as IPFS) then they are basically pure. Even if you do use mutable URLs, you can always use the ability to normalize programs to purify them of imports (and you can also use Dhall's API to safely and statically verify that an expression is devoid of imports).


Another popular non-Turing-complete language is SQL (at least without recursive common table expressions).

In practice, the fact that the program cannot hang is probably the biggest point. Every programmer has experienced bugs where the program runs into an infinite loop or deadlocks. Those are the nastier bugs. Removing Turing-completeness removes that whole class of bugs, similar to how Rust removes the whole class of data race bugs.


In layman's terms, it is impossible to determine certain properties of an arbitrary program written in a Turing-complete language. Properties like "will this program finish within a reasonable amount of time?" or "does program X do the same thing as program Y?"

For a non-Turing-complete language, it may be possible to test such properties.

It's crucial to understand that this distinction is not one of how practical it is to determine properties; once you open the door to Turing-completeness, all ability to reason about a program effectively flies out the window. You might be able to come up with rules-of-thumb that determine an interesting property of a program in a Turing-complete language some of the time, but there will always exist programs for which your rules-of-thumb are insufficient.


For configuration you very much do not want a Turing-complete language -- or any programming language, for that matter. Configuration quite often is processed by more than just the original program it was intended for, so a plain data dump is the easiest form from this angle. Also, it's much easier for a human operator to just read the values than to interpret a de facto script that produces them.

There were several attempts at using the programming language a tool was written in for storing its configuration. None worked well enough to catch on as a general trend, and not because the languages used were Turing-complete.

My opinion is that a programmer saw for a brief moment what sysadmins do and decided to build something to help with the task, without really understanding what's needed, what works, what doesn't, and why.


If your language is not Turing Complete, then it might be possible to always prove certain properties of programs. Canonically, you might be able to prove that the program will always complete and exit in finite time on any input. Other properties include guaranteeing that an error will not be thrown (or will be thrown), or bounding the amount of time or memory consumed, or guaranteeing that the output of a function will be within a certain numeric range.


For instance, if you want to make guarantees about program termination (as the author does here). This is not possible for Turing complete languages.


Interesting. I'm pondering non-Turing-complete languages myself. I believe they are underestimated and should be used more often for anything that looks like configuration.

Configurations usually start simple, like we only need "x = y" statements. Oh wait, sections would be nice, so use the ini format. Oh wait, nesting stuff would be nice, so use JSON/XML/Lisp. Oh wait, we want to reduce duplicated stuff, so use a preprocessor. Oh wait, more abstractions would be nice, so use some scripting language already. Stop: we just skipped one level before scripting languages, the non-Turing-complete languages!

Concerning Dhall, I had never come across the "Oh wait, we want to annotate types" idea. Is that really desirable for configurations?


We recently did a similar thing, phrased as json-expressions, i.e. a JSON structure that evals to another...

https://taskcluster.github.io/json-e/


I see you allude to Terraform, but will you ever write a dhall specific compiler for Terraform configs?


Yes, however standardizing the language is higher priority at the moment


What they're attempting here is interesting, but for configuration, I like a well-known real language, or a well-known, simple configuration language (ini, shell/env vars, JSON...). This to me is an awkward middle ground.


The whole point is that if this (or something similar) were well known, then you wouldn't have to use shitty typeless stuff like JSON...


It's not typeless if you validate it using a schema. (For example, there are JSON encodings of protobufs, where validation will be done when the file is read.)


Totally off topic, but I liked the name and the reference to Planescape: Torment. (My favourite RPG)


The author has a bunch of similarly named projects (Morte, Annah).



