
Interesting that the examples use XML for structuring/annotating the prompts. Is there any available comparison of using XML, JSON, or even Markdown for prompts and structured output? Markdown would seem like the prompt format with the least friction/verbosity, but I wonder if it would have a qualitative effect on the model output.



I think we need to differentiate between a few things:

1. The specific way different models were trained.

Claude was specifically trained with these XML tags in the input and output. For Claude it works well to say things along the lines of "use a scratchpad for thinking out loud in XML tags first". While you can force GPT to do the same, it may not be as "natural", e.g. there's a high chance of GPT-3.5 not doing it.
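
For illustration, a scratchpad-style prompt might look something like this (the tag names are just a convention the model picks up on, not anything it enforces):

    Answer the question below. First reason step by step inside
    <scratchpad> tags, then give your final answer inside
    <answer> tags.

    <question>
    How many weekdays are there between 2024-03-01 and 2024-03-15?
    </question>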

2. XML tags vs full XML

Using XML tags doesn't mean it would be amazing at outputting full valid XML. XML tags are great because there's no further formatting needed within. If you only want to separate sections in your prompt or response, XML tags are perfect for clearly marking start and end, as well as featuring syntax (<>) that's rarely seen in natural text.
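
As a concrete example of that separation, wrapping each section of the input in its own tag keeps instructions and data unambiguous:

    <instructions>
    Summarize the document below in two sentences.
    </instructions>

    <document>
    ...pasted article text...
    </document>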

When it comes to generating a full schema-valid object, my experience is that JSON works much better than XML. It seems to be a good middle ground between having structure and having little overhead. I remember a recent Airbnb post saying that YAML works even better than JSON; that's not my personal experience. Plus, YAML is a bit weird and slow to read.
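
To make the overhead comparison concrete, here's the same toy record in each format (my own example, not from any benchmark):

    XML:   <user><name>Ada</name><age>36</age></user>
    JSON:  {"name": "Ada", "age": 36}
    YAML:  name: Ada
           age: 36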


Hi there. Your question is valid. That's why we built Prompt Markup Language:

https://github.com/narenaryan/promptml/ https://www.promptml.org/

It isn't XML or JSON, but a DSL built especially for writing prompts. We don't have published benchmarks yet, but running a few examples, we see consistent outputs from LLMs. It also supplements RAG by separating out the context and using it to enrich the prompt.


Looks like you still have to make your own template to stringify the prompt, which could use JSON/XML/whatever, so this just stores variations of prompts. Doesn't seem relevant.


It is not just for storing variations of a prompt. It separates out the context to make intentions clear. Here is an XML prompt serialized from a PromptML program:

https://gist.github.com/narenaryan/651d8081eaaffa846e05da7a3...

You can test it with Claude and GPT-3.5 or GPT-4o, and they will all produce strikingly similar results, with differences in detail.

YAML & XML serializations are coming in v0.6.0 of PromptML.


Looks like Ruby without the @


> Is there any available comparison of using XML, JSON, or even Markdown for prompts and structured output?

I've done a lot of testing with this and found that XML is the best input AND output format if you want to produce machine-readable data. Markdown is okay as input, but testing shows better accuracy when input components are wrapped in XML tags. JSON would be ideal for extracting data, but its syntax is too sensitive as output: often an offending character will be left unescaped no matter how good your prompt is.

If you stay within the very basic syntax of XML (just tags), it is also the best output. I suspect this is due to the descriptive nature of the tag structure: something wrapped in a <summary> tag hints to the model that it should produce a summary, reinforcing the input prompt as the output is being constructed.
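
That tag-only output is also trivial to extract downstream. A minimal sketch in Python (assuming the model's reply is in a string called reply):

    import re

    reply = "<summary>The paper introduces a new method.</summary>"

    # Grab everything between the opening and closing tags.
    # re.DOTALL lets the captured text span multiple lines.
    match = re.search(r"<summary>(.*?)</summary>", reply, re.DOTALL)
    summary = match.group(1).strip() if match else None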


I really don't understand why not all LLMs use schema validation on their output to 100% guarantee that it is actually JSON and that it matches the expected format.


Function calling in OpenAI models is essentially this. You define the functions using JSON Schema and it is guaranteed to match.

See also https://python.useinstructor.com/
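
A rough sketch of function calling with the OpenAI Python SDK (the model name and function schema here are placeholders, not anything prescribed by the API):

    from openai import OpenAI

    client = OpenAI()

    # JSON Schema describing the structure we want back.
    tools = [{
        "type": "function",
        "function": {
            "name": "extract_user",
            "description": "Extract a user record from the text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Ada is 36 years old."}],
        tools=tools,
        # Force the model to call this specific function.
        tool_choice={"type": "function", "function": {"name": "extract_user"}},
    )

    # A JSON string whose shape follows the schema above.
    args = resp.choices[0].message.tool_calls[0].function.arguments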


This is really good to know, thanks! Was this for every LLM or only Claude?



