Hacker News new | past | comments | ask | show | jobs | submit | simonrepp's comments login

tl;dr: Another alternative to YAML (among many great others), this one designed and developed by me:

https://eno-lang.org/

I've been doing a lot of research and development on language design for file-based content (e.g. for static site generators). I've found that YAML - although established as the go-to format for statically generated blogs, etc. - was never designed for these things: by its nature it does not support simple, essential features for this use case, for instance unindented blocks of verbatim text (the limitation that YAML frontmatter was invented to work around, as a very limited hack).

The result of all this R&D is a language called "eno notation", designed especially for file-based content use cases, around which I've also built an entire ecosystem for many languages and editors - if you're working in that field, it might be worth taking a look!


I find it surprising that your format doesn’t distinguish strings and numbers, or other types of scalar values in general. For example, in your demo “eno's javascript benchmark suite data” on https://eno-lang.org/eno/demos/, both of these lines:

  iterations: 100000
  evaluated: Fri Jul 06 2018 09:46:48 GMT+0200 (Central European Summer Time)
are tagged below as just a “Field”. Do client programs that read an Eno file need to run `int()`/`float()` or `.to_i`/`.to_f` on the field values they know should be numbers? That seems unergonomic.


You are correct! The thinking behind this is that for the majority of file-based configuration and content use cases the expected types are fixed and known beforehand anyway. It therefore makes more sense for a developer to specify once which type a field is (gaining in return 100% type safety, validation, localized validation messages, ...) than for all users to later have to, e.g., explicitly write quotes a million times when writing configuration/content, just to tell the application something about a type it already knows (and wouldn't expect/accept any other way). I think this is really more ergonomic, even in the short run.


Awesome, very happy to see this.

Now I'm curious - are you putting this to use somewhere already? Or is it for the time being "just" an experiment for the joy of implementing in itself? :)

Thanks for sharing!


It's in use as a prototype/proof-of-concept configuration language by in-house software that is still in development.


Very cool! If it succeeds and you'd like to publish any part of it at some point, I'll gladly pick it up for a collection of case studies on eno-lang.org, just let me know then.

Meanwhile, I've added a community projects section at https://eno-lang.org/libraries to raise awareness of your Tcl implementation there as well.

Thanks again, looking forward to further developments!


The design principle for types in eno is that the language itself has no notion of types; there are only plain textual representations. Types are never inferred from a document. Instead, when parsing an eno document the application explicitly requests the type it expects for each field, thereby validating the document during parsing. From the perspective of the parsing libraries, types are defined as simple functions/closures (I call them "loaders") that take the string value of a field, validate/transform it, and return it as the proper type. Some of these loaders currently ship with the libraries for convenience (int, float, datetime, etc. - not a hard specification, just a sensible predefined toolbox), but there can be any number of types: whichever additional types an application needs can be defined and used on the fly.
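To illustrate the loader idea in isolation (the names and shapes here are hypothetical sketches, not the actual enojs API):

```javascript
// A loader is just a function: it takes the raw string value of a
// field, validates it, and returns the properly typed result.
const integer = value => {
  if (!/^-?\d+$/.test(value)) {
    throw new Error(`'${value}' is not a valid integer`);
  }
  return parseInt(value, 10);
};

// Minimal stand-in for a parsed document: every field value stays a
// plain string until the application requests a type for it.
const field = (doc, name, loader) => loader(doc[name]);

const doc = { iterations: '100000', color: 'blue' };

console.log(field(doc, 'iterations', integer)); // → 100000

try {
  field(doc, 'color', integer);
} catch (error) {
  // the mismatch surfaces as a validation error at parse time,
  // not deep inside application code later on
  console.log(error.message);
}
```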

This is the main point that differentiates eno from almost all TOML/YAML/JSON-like formats. The implication is that eno is extremely simple and very accessible for non-technical users (so far all case studies have verified this to be true) and a highly reliable data source for developers, but it also comes at the cost of losing generic de/serializability - so it depends on the use case whether eno should be chosen over other formats. :)

There are also some other things that are unique to eno (e.g. all parser and validation error messages are fully localized and in user-friendly humanized language) or painfully missing in some other formats (e.g. having any number of "multiline string" fields that don't hang indented against an invisible left margin somewhere in the document), feel free to explore further on the website if you're curious!
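For instance, a multiline block in eno starts at column zero and is taken verbatim (syntax reproduced from memory here, so best double-check it against the website):

```eno
-- abstract
This text starts at column zero and is taken verbatim,
    including its own internal indentation -
no invisible left margin to align against.
-- abstract
```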

Thanks for your interest and the feedback on the ABNF (much appreciated!), hope I could clarify things a bit. :)


On npm that's in fact already covered, scrutinize the list https://www.npmjs.com/search?q=eno :)


That road (a C or Rust parsing core accessed through bindings) will likely be taken, but for the initial development and for jump-starting the ecosystem it was important for me to start with implementations that can be quickly experimented with and iterated on, rather than spending a lot of extra time dealing with segfaults, memory leaks, the different binding mechanisms on different platforms, etc. As things stand now, people are provided with multiple, fully functioning, pure implementations that are already faster than the majority of YAML/TOML parsers. In the coming months and years there will be plenty of time to make things even faster. :)

For me, caring about performance also means caring about performance on all platforms - why not, after all? You can take the tabular benchmark data I provide and paste it together, or use the raw data (also available as eno files in the repository) to compare language against language (which I initially did as well, but later dropped because same-language comparison between libraries made more sense to me). If you want the quick rundown as far as I remember it: mostly the javascript parsers lead the ranking, with the ruby parsers a bit behind and just slightly ahead of the Python ones.


Associated fifties right away too, although not that colorful :)


Well spotted, good question!

It's also been asked in another thread on HN, I'm quoting myself here: "eno has neither indentation nor closing tags of any sort, that means if you use a section to group some values, you need to start another section to end the previous one (no closing tags!), that's why there are fieldsets, which allow short groupings that automatically end with the next field/list/fieldset."
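A small sketch of what that looks like (syntax from memory, so best verified against the introduction on the website):

```eno
# palette

ratings:
green = 3
red = 1

title: Some colors
```

The fieldset `ratings:` ends automatically when the field `title:` begins - no closing tag needed.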

Hope this explains it :)


Yes, through sections! See https://eno-lang.org/introduction/.

You can nest as deeply as you want and multiple sections on the same level automatically turn into a list of sections. For just a list of flat dictionaries you can also use fieldsets, see https://eno-lang.org/advanced/. :)
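Something like this, as I understand the syntax (best verified against the website):

```eno
# author
name: Jane Doe

# author
name: John Doe
```

Two same-named sections at the same level read back as a list of sections; deeper nesting works via `## subsection`, `### ...`, and so on.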


ruamel.yaml in python does that to a certain degree from what I've read, you might want to check it out if yaml is ok for your use case! (https://yaml.readthedocs.io/en/latest/)

I've given this some thought as well, and given that the eno libraries hold their own representation of the data in memory, this might actually be plausible to implement in some way. Still, I fear it will turn out to be a hard, hard problem (eno is not even generically serializable, by design), which is why I haven't explored it further. For the moment I can only say: maybe sometime in the near future - check back every once in a while! :)


Sure!

Cultural research is notoriously underfunded, so although they have and rely on a relational database that holds their data (previously Postgres), the cost and effort associated with maintaining and extending the system are pretty high.

With the new setup the thousands of eno files are both the place of storage and the interface for editing the data. That eliminated the development effort of providing and maintaining a full web frontend to the database, as well as the effort of hosting the actively deployed technology somewhere and keeping it at least patched for security reasons.

All that remains now, technology-wise, is an Atom plugin that is installed locally on each client at the institute. It takes care of validation, provides relational autocomplete helpers as demonstrated in [4], and offers a few hooks to kick off local builds for multiple deployment targets and deploy them to production as well.

Hope this clarifies things! :)

