> If the SQL query fails to execute (due to a syntax error of some kind) it
> passes that error back to the model for corrections and retries up to three
> times before giving up.
Funnily enough, syntax errors are the one thing you can completely eliminate from LLM output, simply by masking the token probability vector so that only tokens which are valid continuations under the grammar can be sampled.
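A minimal sketch of what that masking looks like at a single decoding step (the toy vocabulary and logits here are invented for illustration; real implementations operate over the model's full token vocabulary, with a grammar engine computing the valid-token set at each step):

```python
import math

def constrained_sample(logits, vocab, valid_tokens):
    # Mask every token that isn't a valid continuation by setting its
    # logit to -inf, then softmax and take the argmax.
    masked = [l if tok in valid_tokens else float("-inf")
              for l, tok in zip(logits, vocab)]
    exps = [math.exp(l) for l in masked]   # exp(-inf) == 0.0
    total = sum(exps)
    probs = [e / total for e in exps]
    return max(range(len(vocab)), key=lambda i: probs[i])

vocab = ["SELECT", "DROP", "FROM", ";"]
logits = [1.2, 3.5, 0.1, -0.4]   # the model would prefer "DROP" unmasked
valid = {"SELECT"}               # but the grammar only allows SELECT here
idx = constrained_sample(logits, vocab, valid)
print(vocab[idx])  # prints "SELECT"
```

The masked tokens get probability exactly zero, so no sampling strategy (greedy, top-k, temperature) can ever emit an invalid token.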
Yeah, that's a pretty solid approach if the LLM you're using (and the third party that hosts it, if you're not self-hosting) supports that.
One minor footgun I've seen with that approach: while the model is guaranteed to produce syntactically valid output, a badly designed schema can still trap it into emitting something that is both wrong and low-probability. Contrived example: suppose you're doing sentiment analysis of reviews and you have the model pick from the enumeration ["very negative", "negative", "slightly negative", "neutral", "positive"]. The model might encounter a glowing review and write "very", intending to follow it up with " positive", but since " positive" isn't a valid continuation of "very" in that enumeration, it ends up writing "very negative" instead.
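Here's a toy reproduction of that trap, assuming greedy, word-by-word decoding over the enumeration's prefix tree (the per-step "model preferences" below are invented to mimic a model reacting to a glowing review):

```python
OPTIONS = ["very negative", "negative", "slightly negative",
           "neutral", "positive"]

def valid_next(prefix, options):
    """Words that keep the output a prefix of at least one option."""
    partial = prefix.split()
    out = set()
    for opt in options:
        words = opt.split()
        if words[:len(partial)] == partial and len(words) > len(partial):
            out.add(words[len(partial)])
    return out

def greedy_decode(preferences, options):
    """preferences: a ranked word list per step, standing in for logits."""
    prefix = ""
    for ranked in preferences:
        allowed = valid_next(prefix, options)
        if not allowed:
            break
        # Pick the model's most-preferred word that the mask still allows.
        choice = next(w for w in ranked if w in allowed)
        prefix = (prefix + " " + choice).strip()
    return prefix

# Step 1: the model wants "very" (thinking "very positive").
# Step 2: it wants "positive", but after "very" the mask only
# allows "negative", so it is forced into the wrong label.
preferences = [["very", "positive", "negative"],
               ["positive", "negative"]]
print(greedy_decode(preferences, OPTIONS))  # prints "very negative"
```

The usual fix is to design the schema so that no option is a "trap prefix" of a differently-valenced option, e.g. by making the labels single tokens or by adding "very positive" to the enumeration.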