> You can tell it to output in a JSON structure (or some other format) of your choice and it will, with high reliability.
I mean, this is provably false. Have you tried to use LLMs to generate structured JSON output? Not only do all LLMs suck at reliably following a schema, you need to use all kinds of "forcing" to make sure the output is actually JSON anyway. By "forcing" I mean either (1) multi-shot prompting: "no, not like that," if the output isn't valid-ish JSON; or (2) literally stripping out—or rejecting—illegal tokens (which is what llama.cpp does[1][2]). And even with all of that, you still won't really have a production-ready pipeline in the general case.
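Concretely, approach (1) ends up looking something like the loop below (with `call_llm` as a hypothetical stand-in for whatever chat-completion client you're using), and even when it converges you've only got syntactically valid JSON, not necessarily the content you wanted:

```python
import json

def call_llm(messages):
    """Hypothetical stand-in for whatever chat-completion client you're using."""
    raise NotImplementedError

def get_json(prompt, max_retries=3):
    """Approach (1): reprompt until the reply parses as JSON, or give up."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_retries):
        reply = call_llm(messages)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as err:
            # "No, not like that" -- feed the parse error back and retry.
            messages += [
                {"role": "assistant", "content": reply},
                {"role": "user",
                 "content": f"That was not valid JSON ({err}). Reply with only the corrected JSON."},
            ]
    raise ValueError("model never produced parseable JSON")
```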
Beyond this, an LLM can easily become confused even if outputting JSON with a valid schema. For instance, we've had mixed results trying to get an LLM to report structured discrepancies between two multi-paragraph pieces of text, each of which might be using flowery language that "reminds" the LLM of marketing language in its training set. The LLM often gets as confused as a human would, if the human were quickly skimming the text and forgetting which text they're thinking about - or whether they're inventing details from memory that are in line with the tone of the language they're reading. These are very reasonable mistakes to make, and there are ways to mitigate the difficulties with multiple passes, but I wouldn't describe the outputs as highly reliable!
I would have agreed with you six months ago, but the latest models - Claude 3, GPT-4o, maybe Llama 3 as well - are much more proficient at outputting JSON correctly.
Seems logical that they will always implement specialized pathways for the most critical and demanding user base. At some point they might even do it all by hand and we wouldn’t know /s
Yes, I'm using them quite extensively in my day-to-day work for extracting numerical data from unstructured documents. I've been manually verifying the JSON structure and the numerical outputs, and it's highly accurate for the corpus I'm processing.
FWIW I'm using GPT-4o, not Llama; I've tried Llama for local tasks and found it pretty lacking in comparison to GPT.
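A rough sketch of what that verification amounts to: checking that every extracted number actually appears in the source document (the names here are made up, and it assumes the extraction returns a flat JSON object of numeric fields):

```python
import json
import re

def numbers_in(text):
    """Every numeric literal that appears in the text, as floats."""
    return {float(m.replace(",", "")) for m in re.findall(r"-?\d[\d,]*\.?\d*", text)}

def suspect_values(document, llm_json):
    """Return extracted numeric fields that never appear verbatim in the source document."""
    extracted = json.loads(llm_json)  # raises if the structure itself is broken
    source_numbers = numbers_in(document)
    return {k: v for k, v in extracted.items()
            if isinstance(v, (int, float)) and float(v) not in source_numbers}
```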
Your comment has an unnecessarily negative tone that doesn't do this tech justice. These approaches are totally valid and can get you great results. An LLM is just one component in a pipeline; I've deployed many of these in production without a hiccup.
Guidance (the industry term for "constraining" the model output) only ensures the output follows a particular grammar. If you need the JSON to fit a particular schema or format, you can always validate it; if validation fails, pass the JSON and the validation errors back to the LLM for it to correct.
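A minimal sketch of that validate-and-correct step, using the `jsonschema` package (the schema fields here are just placeholders):

```python
import json
from jsonschema import Draft202012Validator  # pip install jsonschema

SCHEMA = {  # placeholder schema; substitute whatever shape you actually need
    "type": "object",
    "required": ["invoice_id", "total"],
    "properties": {"invoice_id": {"type": "string"}, "total": {"type": "number"}},
}

def validate_or_feedback(reply):
    """Return (data, None) if the reply fits the schema, else (None, correction_prompt)."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as err:
        return None, f"The output was not valid JSON ({err}). Reply with corrected JSON only."
    errors = [e.message for e in Draft202012Validator(SCHEMA).iter_errors(data)]
    if errors:
        return None, ("The JSON did not match the schema: " + "; ".join(errors) +
                      ". Reply with corrected JSON only.")
    return data, None
```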
> Have you tried to use LLMs to generate structured JSON output? Not only do all LLMs suck at reliably following a schema, you need to use all kinds of "forcing" to make sure the output is actually JSON anyway.
Yeah, it's worked about fifty thousand times for me without issues over the past few months, across several NLP production pipelines.
[1] https://github.com/ggerganov/llama.cpp/issues/1300
[2] this is cutely called "constraining" a decoder; what it actually does is correct a very clear stochastic deficiency in LLMs
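A toy sketch of what that constraining amounts to: reject any candidate token that can't extend a valid JSON prefix. Here `model_step` is a hypothetical stand-in for the decoder, and the prefix test is deliberately crude; real implementations (llama.cpp's GBNF grammars, for instance) track grammar state instead of brute-forcing completions.

```python
import json

def is_json_prefix(text):
    """Crude test: can the text be completed into valid JSON with one of a few suffixes?
    Real grammar-constrained decoders track parser state instead of guessing suffixes."""
    for suffix in ("", "}", "0}", '": 0}', "]", "0]", '"', '"}'):
        try:
            json.loads(text + suffix)
            return True
        except json.JSONDecodeError:
            pass
    return False

def is_complete_json(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def constrained_decode(model_step, max_tokens=64):
    """model_step(text) -> candidate next tokens, best first (hypothetical decoder API).
    Candidates that would break the JSON are simply thrown away."""
    out = ""
    for _ in range(max_tokens):
        for tok in model_step(out):
            if is_json_prefix(out + tok):
                out += tok
                break
        else:
            break  # no legal candidate left
        if is_complete_json(out):
            break
    return out

# Toy demo: the "model" keeps proposing prose first, legal JSON tokens second.
if __name__ == "__main__":
    canned = iter([["Sure! ", '{"'], ["total", "total"], ['": ', '": '], ["42", "42"], ["}", "}"]])
    print(constrained_decode(lambda _: next(canned)))  # prints {"total": 42}
```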