This is a terrible idea; I can't think of a less efficient method with worse correctness guarantees. What invariants does the LLM enforce? How do you make sure it always does the right thing? How do you debug it when it fails? What kind of error messages will you get? And how will it react to bad inputs: will it detect them (unlikely), or will it hallucinate an interpretation (most likely)?
I used to focus on the potential pitfalls and be overly negative, but I've come to see that these tradeoffs are situational. After using LLMs myself, I can definitely see upsides that outweigh the downsides.
Developers make mistakes too, so there are no guarantees either way. Each of your questions can equally be asked of handwritten code.
You can ask those questions, but you won't get the same answers.
It's not a question of "is the output always correct". Nothing is so binary in the real world. A well-maintained, hand-written solution will trend further towards correctness as bugs are caught, reported, fixed, and regression-tested.
Conversely, you could parse an IP address by rolling 4d256 and praying. It, too, will sometimes be correct and sometimes be incorrect. Does that make it an equally valid solution?
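To labour the point, here's that "solution" as a deliberately absurd Python sketch (the function name and sample input are made up for illustration):

    import random

    def parse_ip_by_dice(_text: str) -> str:
        """'Parse' an IPv4 address by rolling 4d256 and praying.

        The input is ignored entirely, yet every so often the roll will
        happen to match the address actually in the text. That's
        coincidence, not correctness, and when it's wrong there is
        nothing to debug.
        """
        return ".".join(str(random.randint(0, 255)) for _ in range(4))

    print(parse_ip_by_dice("connected to 192.168.0.1"))  # e.g. '41.7.203.99'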
We are talking about maintaining a software project written in a programming language, not some theoretical strawman argument you’ve just dreamt up because others have rightly pointed out that you don’t need an LLM to parse the output of a 20KB command line program.
As I said before, I maintain a project like this. I also happen to work for a company that specialises in the use of generative AI. So I’m well aware of the power of LLMs as well as the problems of this very specific domain. The ideas you’ve expressed here are, at best, optimistic.
By the time you’ve solved all the little quirks of ML, you’ll likely have invested far more time in your LLM than you would have if you’d just written a simple parser, and, ironically, you’d have needed someone far more specialised than your average developer to do it.
This simply isn’t a problem that needs an LLM chucked at it.
You don’t even need to write lexers and grammars to parse 99% of application output. Again, I know this because I’ve written such software.
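For the overwhelming majority of tools, the whole "parser" is str.split plus a couple of sanity checks. A minimal Python sketch, assuming POSIX df -P style output (a header line, then six whitespace-separated columns per filesystem; adjust for whatever tool you're actually wrapping):

    from dataclasses import dataclass

    @dataclass
    class MountUsage:
        filesystem: str
        blocks: int
        used: int
        available: int
        mount_point: str

    def parse_df(output: str) -> list[MountUsage]:
        """Parse POSIX `df -P` output without any lexer or grammar."""
        rows = []
        for line in output.strip().splitlines()[1:]:  # skip the header row
            fields = line.split()
            if len(fields) < 6:
                # Bad input is detected and reported, not guessed at.
                raise ValueError(f"unexpected df line: {line!r}")
            rows.append(MountUsage(
                filesystem=fields[0],
                blocks=int(fields[1]),
                used=int(fields[2]),
                available=int(fields[3]),
                # fields[4] is the capacity percentage; mount points can contain spaces
                mount_point=" ".join(fields[5:]),
            ))
        return rows

And when it does hit something unexpected, you get a ValueError naming the exact offending line, which answers the debugging and error-message questions upthread.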