The primary intended use case of this, in contrast to, say, Extensible Data Notation (EDN), seems to be faster machine processing. The requirement that atoms be prefixed with their lengths (Pascal-string style) is the clue. An advantage here is that it's much easier to place a hard bound on memory and CPU when reading this format, which confers security properties like a systematically reduced possibility of buffer overflows. That's good for hard real-time, for example guidance systems that do all allocation only at startup.
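As an illustrative (untested) sketch of that property in Python: a reader can enforce its resource bounds before touching a single payload byte. MAX_ATOM_LEN and the 20-byte scan window for the length digits are arbitrary choices for this example, not anything from the spec:

    # Sketch: reading one length-prefixed csexp atom ("<len>:<bytes>")
    # with a hard cap, so worst-case memory is known up front.
    MAX_ATOM_LEN = 1 << 16  # arbitrary illustrative bound

    def read_atom(buf: bytes, pos: int) -> tuple[bytes, int]:
        colon = buf.index(b":", pos, pos + 20)  # length digits are bounded too
        length = int(buf[pos:colon])
        if length > MAX_ATOM_LEN:
            raise ValueError("atom exceeds bound")  # reject before allocating
        end = colon + 1 + length
        if end > len(buf):
            raise ValueError("truncated atom")
        return buf[colon + 1:end], end

    read_atom(b"5:hello", 0)  # -> (b'hello', 7)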
Beyond lists and byte-string atoms (or whatever the exact primitive set is), this format also makes an affordance for custom types, but as TFA points out, you still have to roll your own other / higher-order data types. Data types you almost certainly have on hand. Now we are talking about needing to do additional processing on the decoded output just to interpret common data structures like associative arrays and sets. And as a machine-first serialization format, if you are interchanging with other people, or with yourself in the future, you had better hope you have full agreement on those custom types.
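To make that concrete, here's a rough Python sketch of what "rolling your own" associative array might look like. The (3:map ...) tag is a convention invented for this example, not anything the format defines, which is exactly the out-of-band agreement problem:

    # Sketch: encoding a dict as a csexp list under an invented "map" tag.
    # Both ends of an interchange must agree on this convention out of band.
    def atom(b: bytes) -> bytes:
        return str(len(b)).encode() + b":" + b

    def encode_map(d: dict) -> bytes:
        # Sorting keys makes equal dicts serialize identically
        # (our rule for this example, not csexp's).
        pairs = b"".join(b"(" + atom(k) + atom(v) + b")"
                         for k, v in sorted(d.items()))
        return b"(" + atom(b"map") + pairs + b")"

    encode_map({b"b": b"y", b"a": b"x"})
    # -> b'(3:map(1:a1:x)(1:b1:y))'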
So what do you do: add libs? Roll your own? Well, competing alternatives already offer that complete picture as mature, battle-tested solutions. So I'm inclined to view Canonical S-Expressions merely as a waypoint on our path of technological evolution, worthy of fleeting, mild curiosity.
I would suggest the author also look at Amazon Ion:
* it can be used schema-less,
* it allows attaching metadata tags to values (which can serve as type hints[1]), and
* it encodes blobs efficiently.
I have not used it, but in the space of flexible formats it appears to have other interesting properties. For instance, it can encode a symbol table, making symbols really compact in the rest of the message. Symbol tables can be shared out of band.
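For a flavor of what that looks like, here's a small (untested) snippet in Ion's text syntax, going off the Ion docs; "point" is an arbitrary annotation name:

    // An annotation acting as a type hint on a struct, per [1]
    point::{ x: 1, y: 2 }

    // A blob, carried as base64 in the text encoding ("hello")
    {{ aGVsbG8= }}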
Canonical S-expressions seem remarkably similar to bencoding as used in BitTorrent files. They both use length prefixes written in ASCII digits followed by a colon.
Bencoding also manages to specify dictionaries, and yet still have a canonical encoding, by requiring dictionaries be sorted by key (and keys be unique).
It doesn't have the option for arbitrary type names; it just has actual types: integer, bytestring, list, and dictionary.
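A minimal sketch of that canonical property in Python (the four type prefixes and the sorted-keys rule are from the BitTorrent spec; the function itself is just illustrative):

    # Sketch: bencoding's four types; dict keys are sorted so a given
    # value has exactly one encoding.
    def bencode(v) -> bytes:
        if isinstance(v, int):
            return b"i%de" % v              # integer: i<digits>e
        if isinstance(v, bytes):
            return b"%d:%s" % (len(v), v)   # bytestring: <len>:<bytes>
        if isinstance(v, list):
            return b"l" + b"".join(map(bencode, v)) + b"e"
        if isinstance(v, dict):             # keys: unique bytestrings
            return b"d" + b"".join(bencode(k) + bencode(val)
                                   for k, val in sorted(v.items())) + b"e"
        raise TypeError("no arbitrary types, just these four")

    bencode({b"name": b"alice", b"age": 30})
    # -> b'd3:agei30e4:name5:alicee'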
FTA:
> Bencoding offers many of the same benefits of CSEXP, but because it also supports types, is a bit easier to work with.
We changed the url from https://en.wikipedia.org/wiki/Canonical_S-expressions to a non-Wikipedia article. (Wikipedia submissions are fine but if there's a good third-party source, those are usually preferred because they're less generic.)
Canonical S-Expressions aren't meant to replace something like C or assembly; they're for data serialization. They're meant to be compared against things like JSON, XML, ASN.1, or any other serialization format.