I'm not a proompter, but why can't LLMs have an initial step that generates an effective prompt from the user's input, asks for clarification if the intent isn't clear enough, and then feeds that prompt back to the LLM to better fulfill the user's request? People have clearly been busy generating training data on what an effective prompt looks like, and it seems silly to require people to learn verbal gymnastics when the LLM itself is specialized at writing in specific styles.
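Something like a two-pass setup. A very rough, untested sketch, assuming the OpenAI Python SDK; the model name and the rewrite instruction are just placeholders:

    from openai import OpenAI

    client = OpenAI()

    def refine_and_answer(user_input):
        # Pass 1: rewrite the raw input as a clearer prompt, or ask a
        # clarifying question if the intent is ambiguous.
        rewrite = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[
                {"role": "system", "content": (
                    "Rewrite the user's request as a clear, self-contained prompt. "
                    "If the intent is ambiguous, reply with a single clarifying "
                    "question prefixed with 'QUESTION:'.")},
                {"role": "user", "content": user_input},
            ],
        ).choices[0].message.content

        if rewrite.startswith("QUESTION:"):
            return rewrite  # surface the clarifying question to the user

        # Pass 2: answer the refined prompt.
        return client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": rewrite}],
        ).choices[0].message.content

The obvious cost is an extra round trip (latency and tokens) per request.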
It is already done by some platforms (for example ChatGPT generates a prompt based on your query when generating images).
The downside is that you might lose information (the improved prompt might not be an exact semantic match for the original), and it becomes less and less useful as models' ability to understand instructions improves.
Basically, if you express yourself clearly, betting on model improvement is enough; there is no need to add a fallible extra step.
These guys came up with a "preprompt" for ChatGPT that does a decent job of refining your initial prompt and asking questions for clarification. Seems they mix in some of the best practices like chain of thought, etc.
Interesting that the examples use XML for structuring/annotating the prompts. Is there any available comparison of using XML, JSON, or even Markdown for prompts and structured output? Markdown would seem like the prompt format with the least friction/verbosity, but I wonder if it would have a qualitative effect on the model output.
I think we need to differentiate between a few things:
1. The specific way different models were trained.
Claude was specifically trained with these XML tags in the input and output. For Claude it works well to say things along the lines of "use a scratchpad for thinking out loud in XML tags first" (sketch at the end of this comment). While you can force GPT to do the same, it may not be as "natural", e.g. there's a high chance of GPT-3.5 not doing it.
2. XML tags vs full XML
Using XML tags doesn't mean it would be amazing at outputting full valid XML. XML tags are great because there's no further formatting needed within. If you only want to separate sections in your prompt or response, XML tags are perfect for clearly marking start and end, as well as featuring syntax (<>) that's rarely seen in natural text.
When it comes to generating a full, schema-valid object, my experience is that JSON works much better than XML. It seems to be a good middle ground: structured, but with little overhead. I remember a recent Airbnb post saying that YAML works even better than JSON; that's not my personal experience, plus YAML is a bit awkward and slow to read.
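For point 1, this is roughly what I mean. A rough, untested sketch assuming the Anthropic Python SDK; the model name and tag names are just illustrative:

    import anthropic

    client = anthropic.Anthropic()

    document_text = "...the long input you want processed..."

    prompt = (
        "<document>\n" + document_text + "\n</document>\n\n"
        "Think out loud in <scratchpad> tags first, then give the final "
        "answer in <answer> tags."
    )

    message = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # placeholder model name
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )

    text = message.content[0].text
    # Crude extraction: take whatever sits between the answer tags.
    answer = text.split("<answer>")[1].split("</answer>")[0].strip()

The tags only mark boundaries; nothing here requires the response to be valid XML overall.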
It isn't XML or JSON, but a DSL built specifically for writing prompts. We don't have published benchmarks, but across the few examples we've run we see consistent outputs from the LLM, and it supplements RAG by separating out the context and using it to enrich the prompt.
Looks like you still have to make your own template to stringify the prompt (which could use JSON/XML/whatever), so this just stores variations of prompts. Doesn't seem relevant.
It is not just for storing variations of a prompt. It separates the context to make intentions clear. Here is an XML prompt serialized from a PromptML program:
> Is there any available comparison of using XML, JSON, or even Markdown for prompts and structured output?
I've done a lot of testing with this and found that XML is the best input AND output if you want to produce machine-readable data. Markdown is okay as input, but testing shows better accuracy when input components are wrapped in XML tags. JSON would be ideal for extracting data, but its syntax is too sensitive as output: often some offending character goes unescaped no matter how good your prompt is.
If you stay within the very basic syntax of XML (just tags), it is also the best output. I suspect this is down to the descriptive nature of the tag structure: something wrapped in a <summary> tag hints to the model that it should produce a summary, reinforcing the input prompt as the output is being constructed.
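To make the tag extraction concrete, a toy sketch (the model call itself is omitted; the raw response here is made up):

    import re

    def extract_tag(text, tag):
        # Pull the content of the first <tag>...</tag> pair, tolerating
        # any extra prose the model wraps around it.
        m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        return m.group(1).strip() if m else None

    # Pretend this came back after prompting: "Write a two-sentence
    # summary inside <summary> tags."
    raw = "Sure, here you go:\n<summary>First sentence. Second sentence.</summary>"

    print(extract_tag(raw, "summary"))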
I really don't understand why not all LLMs use schema validation on their output to 100% guarantee that it is actually JSON and that it matches the expected format.
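Some inference stacks can constrain decoding to a grammar or schema, but even without that you can approximate the guarantee client-side with validate-and-retry. A sketch using the jsonschema package; call_model and the schema are hypothetical stand-ins for whatever API and structure you actually use:

    import json
    from jsonschema import validate
    from jsonschema.exceptions import ValidationError

    # Hypothetical schema for the structure you expect back.
    SCHEMA = {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "tags"],
    }

    def get_structured(prompt, call_model, retries=3):
        # call_model is a stand-in for whatever chat-completion API you use.
        for attempt in range(retries):
            raw = call_model(prompt)
            try:
                data = json.loads(raw)
                validate(instance=data, schema=SCHEMA)
                return data
            except (json.JSONDecodeError, ValidationError):
                continue  # malformed or non-conforming output: ask again
        raise RuntimeError("no schema-valid JSON after %d attempts" % retries)

It's not a 100% guarantee the way constrained decoding is, but it turns silent format drift into an explicit failure.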
that's right, i've mentioned this on the first page of the tutorial. if you don't enable interactive mode, the experience is the same as reading the answer sheet.
i converted the content to a web-friendly format as a personal learning exercise. hopefully it improves the accessibility as well.
Sorry, I had missed that. I read the intro up to "When you are ready to begin, click on the Basic Prompt Structure to proceed.", and then I clicked. So I didn't see the explanatory text that was below the fold.