The main risks in executing potentially malicious Dhall code that is not protected by a semantic integrity check are:
* Using more computer resources than you expected (i.e. network/CPU/RAM)
* Unintentional DDos (as you mentioned)
* The malicious import returning a value which changes the behavior of your program
If you protect the import with a semantic integrity check then the malicious import can no longer return an unexpected value, which eliminates the third issue (changing program behavior). Also, upcoming versions will cache imports based on the semantic integrity check, which would mitigate the second issue (DDos) for all but the first time you interpret the program. There is also a `dhall freeze` subcommand which takes a program and automatically pins imports to their most recent value using semantic integrity check.
Regarding exfiltration, the import system guarantees that only local imports can access sensitive information such as file contents or environment variables. See:
The only way that a remote import can obtain that information is if a local import supplied that information via Dhall's support for custom headers. In fact, this is actually an intended use of that feature (i.e. a local import fetching a Dhall expression from a private GitHub repository using an access token retrieved from an environment variable).
So in other words the threat model is that as long as you can trust local imports then you can transitively trust remote imports because they cannot access your local filesystem or environment variables unless you explicitly opt into that via a local import. I think that's a reasonable threat model because if can't trust the contents of your local filesystem then you can't even trust the Dhall interpreter that you are using :)
Imports are not computed and the set of imports that you retrieve is static (i.e. does not change in response to program state or input), so the set of imports or their paths cannot be used as an exfiltration vector.
I had missed that wiki section on safety guarantees, my bad. It seems all of my proposed attacks wouldn't work. There's just one other that I thought of, and looking through your documentation I'm not sure if it would work or not. What if the host executing the script had access to some intranet site with sensitive data? Would I be able to do a network import of such a URL, load it as raw text, and provide that as a header to another import?
Seriously great work on this, by the way. A total configuration language that allows some form of network access while still being secure against malicious input is a really really impressive tool!
> What if the host executing the script had access to some intranet site with sensitive data? Would I be able to do a network import of such a URL, load it as raw text, and provide that as a header to another import?
Yes, a local import would be able to access an intranet site and re-export that information via custom headers supplied to another import. This is allowed because it falls under trusting local imports. Local imports have access to environment variables and your local filesystem, too, which are equally sensitive, which is why they need to be trusted.
This rule is called the "referential transparency" check, which can be summed up as:
* Only environment variables, absolute paths, and home-anchored paths classify as "local" imports
* Only local imports can retrieve other local imports
* URLs can import relative paths, but they are relative to the URL, not relative to your local filesystem
The reason it's called the "referential transparency" check is because this security restriction also leads to the nice property that import system is referentially transparent. That means that every import evaluates to the same result no matter you import it from. For example, if you have a directory of Dhall expressions that refer to each other and you rehost them on a file server the language guarantees that they still behave the same whether you import them locally or you import them via their hosted URLs.
https://github.com/dhall-lang/dhall-lang/wiki/Safety-guarant...
The main risks in executing potentially malicious Dhall code that is not protected by a semantic integrity check are:
* Using more computer resources than you expected (i.e. network/CPU/RAM)
* Unintentional DDos (as you mentioned)
* The malicious import returning a value which changes the behavior of your program
If you protect the import with a semantic integrity check then the malicious import can no longer return an unexpected value, which eliminates the third issue (changing program behavior). Also, upcoming versions will cache imports based on the semantic integrity check, which would mitigate the second issue (DDos) for all but the first time you interpret the program. There is also a `dhall freeze` subcommand which takes a program and automatically pins imports to their most recent value using semantic integrity check.
Regarding exfiltration, the import system guarantees that only local imports can access sensitive information such as file contents or environment variables. See:
https://github.com/dhall-lang/dhall-lang/wiki/Safety-guarant...
The only way that a remote import can obtain that information is if a local import supplied that information via Dhall's support for custom headers. In fact, this is actually an intended use of that feature (i.e. a local import fetching a Dhall expression from a private GitHub repository using an access token retrieved from an environment variable).
So in other words the threat model is that as long as you can trust local imports then you can transitively trust remote imports because they cannot access your local filesystem or environment variables unless you explicitly opt into that via a local import. I think that's a reasonable threat model because if can't trust the contents of your local filesystem then you can't even trust the Dhall interpreter that you are using :)
Imports are not computed and the set of imports that you retrieve is static (i.e. does not change in response to program state or input), so the set of imports or their paths cannot be used as an exfiltration vector.