I'll stay out of the inevitable "You're just adding a band-aid! What are you really trying to do?" discussion, since I kind of see the author's point and I'm generally excited about applying LLMs and ML to more tasks.
One thing I've been thinking about is whether an agent (or a collection of agents) can solve a problem initially in a non-scalable way through raw inference, and then write code to make parts of the solution cheaper to run.
For example, say I want to scrape a collection of sites. The agent would at first feed the whole HTML into the context to extract the data (expensive, but it works). Then another agent looks at this pipeline and says "hey, we can write a parser for this site so each scrape is cheaper", and iteratively replaces that segment in a way that doesn't disrupt the overall task.
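Roughly, I'm imagining something like this sketch (the llm_extract call, the parser registry, and promote_parser are all names I made up for illustration, not any real API):

    from typing import Callable

    # Registry of site-specific parsers produced (and validated) by the
    # "optimizer" agent. Keyed by domain; each takes raw HTML, returns data.
    compiled_parsers: dict[str, Callable[[str], dict]] = {}

    def llm_extract(html: str) -> dict:
        """Placeholder for the expensive path: put the whole page into an
        LLM context and ask it to pull out the fields we care about."""
        raise NotImplementedError("call your LLM of choice here")

    def extract(domain: str, html: str) -> dict:
        """Use the cheap compiled parser when one exists, otherwise fall
        back to raw inference."""
        parser = compiled_parsers.get(domain)
        if parser is not None:
            return parser(html)
        return llm_extract(html)

    def promote_parser(domain: str, candidate: Callable[[str], dict],
                       sample_pages: list[str]) -> bool:
        """The second agent's job: only swap in the generated parser if it
        agrees with the LLM output on a validation sample, so the overall
        task isn't disrupted."""
        for html in sample_pages:
            if candidate(html) != llm_extract(html):
                return False
        compiled_parsers[domain] = candidate
        return True

The point is just that the expensive inference path stays in place as the fallback and the ground truth to validate against, while the generated parser only takes over once it matches on real pages.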
Well, the standard advice for getting off the ground with most endeavours is “Do things that don’t scale”. Obviously scaling is nice, but sometimes it’s cheaper and faster to brute force it and worry about the rest later.
The unscalable thing is often like "buy it cheap, buy it twice", but it's also often like "buy it cheap, only fix it if you use it enough that it becomes unsuitable". Makers endorse both attitudes. Knowing which applies when is the challenging bit.
Cool idea. It's a bit like what happens in human brains when we develop expertise at something: we start by applying general-purpose behaviors/thinking to the new specialized task, but if that task is repeated or important enough, we end up developing "specialized circuitry" for it, so we can perform it more efficiently, often without requiring conscious thought.