Interesting post. I haven't used YAML outputs yet. When using GPT-3.5 for JSON, I found that requesting minified JSON reduces the token count significantly. In the example you mention, the minified month object is 28 tokens vs 96 tokens formatted. That actually beats the 50 tokens returned from YAML.
It seems like the main issue is whitespace and indentation, which YAML requires and JSON does not.
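To make the whitespace point concrete, here is a minimal Python sketch. The `month` dict is a hypothetical stand-in, not the object from the article, and it compares character counts only. Actual token counts depend on the model's tokenizer, so treat the numbers as a rough indication.

```python
import json

# Hypothetical stand-in for the "month" object (an assumption for illustration).
month = {
    "name": "January",
    "days": 31,
    "holidays": [{"day": 1, "name": "New Year's Day"}],
}

# Pretty-printed: every nesting level adds newlines and indentation.
formatted = json.dumps(month, indent=4)

# Minified: no spaces after "," or ":", no newlines.
minified = json.dumps(month, separators=(",", ":"))

# Both strings decode to the same data; only the whitespace differs.
assert json.loads(formatted) == json.loads(minified) == month

print(len(formatted), "characters formatted")
print(len(minified), "characters minified")
```

The `separators=(",", ":")` argument is what strips the optional whitespace. YAML has no equivalent for its block style, since the indentation there carries the structure.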
Yes, minified JSON would use even fewer tokens than YAML.
But:
1- LLMs tend to have a very hard time producing minified (compacted) JSON output consistently.
2- As for compacted JSON input: empirically, LLMs seem to process it quite well for basic cognitive tasks (information retrieval, basic Q&A, etc.), but on somewhat more sophisticated tasks they do worse than with exactly the same input uncompressed. I've provided examples of this in the comments of this article.