Interesting that the examples use XML for structuring/annotating the prompts. Is there any available comparison of using XML, JSON, or even Markdown for prompts and structured output? Markdown would seem like the prompt format with the least friction/verbosity, but I wonder if it would have a qualitative effect on the model output.
I think we need to differentiate between a few things:
1. The specific way different models were trained.
Claude was specifically trained with these XML tags in the input and output. For Claude it works well to say things along the lines of "use a scratchpad for thinking out loud in XML tags first". While you can force GPT to do the same, it may not come as "naturally"; for example, there's a high chance of GPT-3.5 not doing it.
2. XML tags vs full XML
Using XML tags well doesn't mean the model would be amazing at outputting full, valid XML. XML tags are great because there's no further formatting needed within them. If you only want to separate sections in your prompt or response, XML tags are perfect for clearly marking start and end, and they feature syntax (<>) that's rarely seen in natural text.
When it comes to generating a full object that is valid against a data schema, my experience is that JSON works way better than XML. It just seems to be a good middle ground: enough structure with little overhead. I remember a recent AirBnB post saying that YAML works even better than JSON. That's not my personal experience. Plus, YAML is a bit weird and slow to read.
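To make the "tags, not full XML" point concrete, here is a minimal sketch of wrapping prompt sections in tags and pulling a tagged section back out of a response. The helper names `wrap` and `extract`, the tag names, and the hard-coded response are all hypothetical; no real model API is called, and a plain regex stands in for an XML parser precisely because the output is not expected to be a valid XML document.

```python
import re
from typing import Optional


def wrap(tag: str, text: str) -> str:
    """Wrap text in a bare open/close tag pair (no attributes, no escaping)."""
    return f"<{tag}>\n{text}\n</{tag}>"


def extract(tag: str, response: str) -> Optional[str]:
    """Return the content of the first <tag>...</tag> pair, or None if absent."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, response, re.DOTALL)
    return match.group(1).strip() if match else None


# Build a prompt whose sections are clearly delimited by tags.
prompt = "\n\n".join([
    wrap("document", "Quarterly revenue rose 12% while costs stayed flat."),
    wrap("instructions",
         "Think out loud in <scratchpad> tags first, "
         "then give the final answer in <answer> tags."),
])

# Hypothetical model response, hard-coded for illustration only.
response = """<scratchpad>
Revenue is up and costs are flat, so margins improved.
</scratchpad>
<answer>Margins improved this quarter.</answer>"""

answer = extract("answer", response)
print(answer)
```

The scratchpad can simply be discarded; only the `<answer>` section needs to be read by downstream code.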
It is neither XML nor JSON, but a DSL built especially for writing prompts. We do not have published benchmarks, but running a few examples, we see consistent outputs from the LLM, and it supplements RAG by separating out the context and using it to enrich the prompt.
Looks like you still have to make your own template to stringify the prompt, which could use JSON/XML/whatever, so this just stores variations of prompts. Doesn't seem relevant.
It is not just for storing variations of a prompt. It separates context to make intentions clear. Here is an XML prompt serialized from a PromptML program:
> Is there any available comparison of using XML, JSON, or even Markdown for prompts and structured output?
I've done a lot of testing with this and found that XML is the best input AND output if you want to produce machine readable data. Markdown is okay as input, but testing shows better accuracy if input components are wrapped in XML tags. While it would be ideal for extracting data, JSON syntax is too sensitive as output. Often an offending character will be unescaped no matter how good your prompt is.
If you stay within the very basic syntax of XML (just tags) it is also the best output. I suspect it is to do with the descriptive nature of the tag structure. Something wrapped in the tag <summary> will hint to the model that it should produce a summary, reinforcing the input prompt as the output is being constructed.
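The unescaped-character failure mode described above can be reproduced with a small sketch. Both output strings below are made up for illustration (they are not real model outputs), and the function names are hypothetical. The same summary text, containing literal double quotes, breaks a strict JSON parser but survives simple tag extraction.

```python
import json
import re
from typing import Optional

# Hypothetical outputs carrying the same text; the inner quotes are unescaped.
json_output = '{"summary": "The CEO called the quarter "solid" overall."}'
xml_output = '<summary>The CEO called the quarter "solid" overall.</summary>'


def parse_json_summary(text: str) -> Optional[str]:
    """Parse strictly as JSON; return None if the text is not valid JSON."""
    try:
        return json.loads(text)["summary"]
    except json.JSONDecodeError:
        return None


def parse_tag_summary(text: str) -> Optional[str]:
    """Take whatever sits between <summary> and </summary>, quotes and all."""
    match = re.search(r"<summary>(.*?)</summary>", text, re.DOTALL)
    return match.group(1).strip() if match else None


print(parse_json_summary(json_output))  # the stray quotes make this invalid JSON
print(parse_tag_summary(xml_output))
```

The tag approach has its own weak spot: if the content itself contains the closing tag, extraction will cut off early. That is why this only holds for the "just tags" subset of XML.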
I really don’t understand why not all LLMs use schema validation on their output to guarantee 100% that it is actually JSON and that it matches the expected format.
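For reference, a check like this can at least be done after the fact on the client side. The sketch below uses only the standard library and a hypothetical expected format (`EXPECTED`, with made-up field names). It validates a finished response and reports problems; it does not guarantee anything at generation time, which would require constraining the model's decoding itself.

```python
import json
from typing import Dict, List

# Hypothetical expected format: field name -> required Python type.
EXPECTED: Dict[str, type] = {"title": str, "tags": list, "score": float}


def validate(raw: str, expected: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the output is acceptable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    for field, typ in expected.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], typ):
            problems.append(f"wrong type for {field}: expected {typ.__name__}")
    return problems


good = '{"title": "Prompt formats", "tags": ["xml", "json"], "score": 0.87}'
bad = '{"title": "Prompt formats", "tags": "xml"}'

print(validate(good, EXPECTED))
print(validate(bad, EXPECTED))
```

A caller could retry the request, or feed the problem list back to the model, whenever the returned list is non-empty.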