Again, one of the few advantages of having been round the sun a few more times than most is that this isn't the first time this has happened.
Packages were supposed to replace programming. They got you 70% of the way there as well.
Same with 4GLs, Visual Coding, CASE tools, even Rails and the rest of the opinionated web tools.
Every generation has to learn “There is no silver bullet”.
Even though Fred Brooks explained why in 1986. There are essential tasks and there are accidental tasks. The tools really only help with the accidental tasks.
AI is a fabulous tool that is way more flexible than previous attempts because I can just talk to it in English and it covers every accidental issue you can imagine. But it can’t do the essential work of complexity management for the same reason it can’t prove an unproven maths problem.
As it stands we still need human brains to do those things.
This seems to apply to all areas of AI in its current form and in my experience 70% may be a bit generous.
AI is great at getting you started or setting up scaffolds that are common to all tasks of a similar kind. Essentially anything with an identifiable pattern. It’s yet another abstraction layer sitting on an abstraction layer.
I suspect this is the reason we are really only seeing AI agents being used in call centers, essentially as stand-ins for chatbots, because chatbots are designed to automate highly repetitive, predictable tasks like changing an address or initiating a dispute. But for things like “I have a question about why I was charged $24.38 on my last statement” you will still be escalated to an agent, because inquiries like that require a human to investigate and interpret an unpredictable pattern.
But creative tasks are designed to model the real world, which is inherently analog and ever changing. Closing that gap, identifying what's missing between what you have and the real world and coming up with creative solutions, is what humans excel at.
Self-driving, writing emails, generating applications: AI gets you a decent starting point. It doesn't solve problems fully, even with extensive training. Being able to fill that gap is true AI imo, and probably still quite a ways off.
> But for things like “I have a question about why I was charged $24.38 on my last statement” you will still be escalated to an agent because inquiries like that require a human to investigate and interpret an unpredictable pattern.
Wishful thinking? You'll just get kicked out of the chat because all the agents have been fired.
> You know what is even cheaper, more scalable, more efficient, and more user-friendly than a chatbot for those use cases?
> A run of the mill form on a web page. Oh, and it's also more reliable.
Web-accessible forms are great for asynchronous communication and queries but are not as effective in situations where the reporter doesn't have a firm grasp on the problem domain.
For example, a user may know printing does not work but may be unable to determine if the issue is caused by networking, drivers, firmware, printing hardware, etc.
A decision tree built from the combinations of even a few models of printer and their supported computers could be massive.
In such cases, hiring people might be more effective, efficient, and scalable than creating and maintaining a web form.
> but are not as effective in situations where the reporter doesn't have a firm grasp on the problem domain
Hum... Your point is that LLMs are more effective?
Because, of course people are, but that's not the point. Oh, and if you do create that decision tree, do you know how you communicate it better than with a chatbot? You do that by writing it down, as static text, with text-anchors on each step.
> Because, of course people are, but that's not the point.
Are they?
If an LLM could talk to grandma for 40 minutes until it figures out what her problem actually is, as opposed to what she thinks it is, and then transfer her over to a person with the correct context to resolve it, I think that's probably better than most humans in a customer service role. Chatting with grandma while she rambles for an extended amount of time is not something that very many customer service people can put up with day in and day out.
The problem is that companies will use the LLMs to eliminate customer service roles rather than make them better.
Great analysis, and I agree it's Fred Brooks' point all over again.
None of these tools hurt, but you still need to comprehend the problem domain and the tools -- not least because you have to validate proposed solutions -- and AI cannot (yet) do that for you. In my experience, generating code is a relatively small part of the process.
Yeah, it's more like it can generate 70% of the code by volume, rather than get you 70% of the way to a complete solution. 12-week projects don't become 4-week projects; at best they become 9-10 week projects.
It's an old one, but I think Joel Spolsky's take on leaky abstractions is relevant again in this discussion as we add another abstraction layer with LLM-assisted coding.
Agree.
So far "the progress" implied understanding (discovering) previously unknown things. AI is exactly the opposite: "I don't understand how, but it sorta works!"
Where AI really really shines is to help an engineer get proficient in a language they don't know well. Simon Willison says this somewhere, and in my experience it's very true.
If you can code, and you understand the problem (or are well on your way to understanding it), but you're not familiar with the exact syntax of Go or whatever, then working with AI will save you hundreds of hours.
If you can't code, or do not (yet) understand the problem, AI won't save you. It will probably hurt.
I used to agree, but as an experienced engineer asking about Rust and y-crdt, it sent me down so many wrong rabbit holes with half-valid information.
I used Claude recently to refresh my knowledge on the browser history API and it said that it gets cleared when the user navigates to a new page because the “JavaScript context has changed”
I have the experience and know how to verify this stuff, but a new engineer may not, and that would be awful.
Things like these made me cancel all my AI subscriptions and just wait for whatever comes after transformers.
But actually, whenever this happens you get a rich signal about the hidden didactic assumptions in how you prompt it about things you aren't sure you can verify yourself, and about how you thought the tool worked. This is a good meta-skill to hone.
A visitor to physicist Niels Bohr's country cottage, noticing a horseshoe hanging on the wall, teased the eminent scientist about this ancient superstition. "Can it be true that you, of all people, believe it will bring you luck?"
"Of course not," replied Bohr, "but I understand it brings you luck whether you believe it or not."
s/luck/lower negative log likelihood/. Besides, a participant can still think about and glean truths from their reflections about a conversation they had which contained only false statements.
The internal mechanisms by which a model achieves a low negative log likelihood become irrelevant as it approaches perfect simulation of the true data distribution.
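For reference, the standard decomposition behind that claim, writing p for the true data distribution and q for the model:

    E_{x~p}[ -log q(x) ] = H(p) + KL(p || q)

The loss bottoms out at the entropy of the data exactly when q matches p, regardless of what internal mechanism produces q.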
I haven’t seen this demonstrated in gpt-4 or Claude sonnet when asking anything beyond the most extreme basics.
I consistently get subtly wrong answers and whenever I ask “oh okay, so it works like this” I always get “Yes! Exactly. You show a deep understanding of…” even though I was wrong based on the wrong info from the LLM.
Useless for knowledge work beyond RAG, it seems.
Search engines that I need to double check are worse than documentation. It’s why so many of us moved beyond stack overflow. Documentation has gotten so good.
That’s true. But if you are already an experienced developer who’s been around the block enough to call bullshit when you see it, these LLM thingies can be pretty useful for unfamiliar languages. But you need to be constantly vigilant, ask it the right questions (eg: “is this thing you wrote really best practice for this language? Cause it doesn’t seem that way”), and call bullshit on obvious bullshit…
…which sometimes feels like it is more work than just fucking doing it yourself. So yeah. I dunno!
> Where AI really really shines is to help an engineer get proficient in a language they don't know well.
I used GitHub Copilot in a project I started mostly to learn Go. It was amazing. I spent not so much time fiddling around with syntax, and much more time thinking about design.
I guess it depends on how you define "proficiency". For me, proficiency implies a fundamental understanding of something. You're not proficient in Spanish if you have to constantly make use of Google Translate.
Could code assistants be used to help actually learn a programming language?
- Absolutely.
Will the majority of people that use an LLM to write a Swift app actually do this?
- Probably not, they'll hammer the LLM until it produces code that hobbles along and call it a day.
Also, LEARNING is aided by being more active, but relying on an LLM inherently encourages you to adopt a significantly more passive behavior (reading rather than writing).
Not sure I get how that would work. It seems to me that to do my job I will have to validate the semantics of the program, and that means I will have to become familiar with the syntax of Go or whatever, at a fairly sophisticated level. If I am glossing over the syntax, I am inevitably glossing over the fine points of how the program works.
It depends on the language and the libraries you will use. Python with a well known library? Sure no problem. Almost any model will crank out fairly error free boilerplate to get you started.
Terraform? Hah. 4o and even o1 both absolutely sucked at it. You could copy & paste the documentation for a resource provider, examples and all, and it would still produce almost unusable code. Which was not at all helpful given I didn’t know the language or its design patterns and best practices at all. Sonnet 3.5 did significantly better but still required a little hand holding. And while I got my cloud architecture up and running now I question if I followed “best practices” at all. (Note: I don’t really care if I did though… I have other more important parts of my project to work on, like the actual product itself).
To me one of the big issues with these LLMs is that they have zero ability to do reflection and explain their “thought process”. And even if they could, you cannot trust what they say, because they could be spouting off whatever random training data they hoovered up, or they could be “aligned” to agree with whatever you tell them.
And that is the thing about LLMs. They are remarkably good bullshitters. They’ll say exactly what you want them to and be right just enough that they fool you into thinking they are something more than an incredibly sophisticated next-token generator.
They are both incredibly overrated and underrated at the same time. And it will take us humans a little while to fully map out what they are actually good at and what they only pretend to be good at.
Yes! Reading some basic documentation on the language or framework, then starting to build in Cursor with AI suggestions works so well. The AI suggests using functions you didn't even know about yet, then you can go read documentation on them to flesh out your knowledge. Learned basic web dev with Django and Tailwind this way and it accelerated the process greatly. Related to the article, this relies on being curious and taking the time to learn any concepts the AI is using, since you can't trust it completely. But it's a wonderfully organic way to learn by doing.
LLMs are a great help with Terraform and devops configuration; they often invent things, but at least they point at the documentation I need to look up.
Of course everything needs double-checking, but just asking the LLM "how do I do X" will usually at least output all the names of the Terraform resources and most of the configuration attributes I need to look up.
They are great for any kind of work that requires "magical incantations" as I like to call them.
So very much this. As I was learning Rust, I'd ask what the equivalent was for a snippet I could create in Java. It is funny: I look at the Java code provided by prompts and go meh, while the Rust code looks great. I realize this is probably due to 1) me being at that junior level in Rust or 2) less legacy crap in the training model. I'm sure it is both, with more of the former, as I work from working code toward beautiful code.
Software development is absolutely a fractal. In the 1960s we were solving complexity by using high-level languages that compile to machine code, enabling more people to write simple code. This has happened again and again and again.
But different generations face different problems, which require another level of thinking and abstraction, pushing the boundaries until we reach the next generation. None of this is solved by a single solution, but by combinations built on basic principles that never change, and those things, at least for now, only humans can do.
Interestingly, it seems like we are investing orders of magnitude more capital for smaller and smaller gains.
For example, the jump in productivity from adding an operating system to a computer is orders of magnitude larger than adding an LLM to a web development process despite the LLM requiring infrastructure that cost tens of billions to create.
It seems that while tools are getting more and more sophisticated, they aren't really resulting in much greater productivity. It all still seems to result in software that solves the same problems as before. Whereas when HTML came around, it opened up use cases that had never been seen before, despite being a very simple abstraction layer by today's standards.
Perhaps the opportunities are greatest when you are abstracting the layer that the fewest understand, whereas LLMs seem to assume the opposite.
The real gains in software are still to be had by aggressively destroying incidental complexity. Most of the gunk in a web app doesn't absolutely need to exist, but we write it anyway. (Look at fasthtml for an alternate vision of building web apps.)
The issue with LLMs is they enshrine the status quo. I don't want ossified crappy software that's hard to work with. Frameworks and libraries should have to fight to justify their existence in the marketplace of ideas. Subverting this mechanism is how you ruin software construction.
You mentioned a great point: LLMs are hitting the point of diminishing marginal gains, at least I think so. Many applications are struggling to provide real benefits instead of just entertaining people.
Another funny thing is that we are using LLMs to replace creative professionals, but real creativity comes from human experience, perception and our connections, which are exactly what LLMs are missing.
As someone who is not an artist, I want AI to do art so I can restore my antique tractor. Of course we all have different hobbies, but there are also hobbies we don't want to get into but may need to.
I think the parent comment means "art" as "having fun", like playing a guitar; it's definitely no fun to watch the robot play it and not even let you touch it.
AI generated art/music/etc is the answer to people having creative vision and lacking technical expertise or resources to execute it. There are lots of stories waiting to be told if only the teller had technical ability/time/equipment to tell it. AI will help those stories be told in a palatable way.
Curation of content is also a problem, but if we can come up with better solutions there, generative AI will absolutely result in more and better content for everyone while enabling a new generation of creators.
The AI will also take over your work of restoring antique tractors, much faster and cheaper. It won't be historically accurate, and it may end up with the fuel pump connected to the radio but it'll look mostly Good Enough. The price of broken tractors will temporarily surge as they need them for training data.
If it can create some decal close enough when nobody knows the original other than the fragments that remain, that helps. For common tractors we know what they looked like, but I'm interested in things where exactly one is known to exist in the world.
I see it very differently. We are just at the very dawn of how to apply LLMs to change how we work.
Writing dumb scripts that can call out to sophisticated LLMs to automate parts of processes is utterly game changing. I saved at least 200 hours of mundane work this week and it was trivial.
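To make that concrete, here's a minimal sketch of the kind of dumb script I mean, assuming the openai Python package and an inbox of text files to triage (the model name, folders, and prompt are just illustrative):

    # triage a folder of text files by asking an LLM for a one-word label
    from pathlib import Path
    from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

    client = OpenAI()

    for path in Path("inbox").glob("*.txt"):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[
                {"role": "system", "content": "Reply with one word: bug, billing, or other."},
                {"role": "user", "content": path.read_text()[:4000]},
            ],
        )
        label = resp.choices[0].message.content.strip().lower()
        path.rename(Path("inbox") / label / path.name)  # assumes the label folders already exist

The script is dumb on purpose: all the judgment lives in the model, and the glue is a dozen lines of Python.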
My favorite example of this is grep vs method references in IDEs. Method references are more exact, but grep is much simpler (to implement and to understand for the user).
I think you're also right about LLMs. I think the path forward in programming is embracing more formal tools. Incidentally, searching for method references is more formal than grepping, and that's probably why people prefer it.
> The software development is absolutely a fractal.
I think this analogy is more apt than you may realize. Just like a fractal, the iterated patterns get repeated on a much smaller scale. The jump to higher-level languages was probably a greater leap than the rest of software innovation will provide. And with each iterative gain we approach some asymptote, but never get there. And this frustration of never reaching our desired outcome results in ever louder hype cycles.
Too bad most of society is accidental as well. With which I mean to say that there are a lot of nonsensical projects being done out there, that still make a living for many people. Modern AI may well change things, similar to how computers changed things previously.
I get your sentiment, I've been through a few hype cycles as well, but besides learning that history repeats itself, there is no saying how it will repeat itself.
> With which I mean to say that there is a lot of nonsensical projects being done out there, that still make a living for many people.
I don't know why this is a bad thing. I don't think projects that you believe are nonsensical shouldn't exist just because of your opinion, especially if they're helping people survive in this world. I'm sure the people working on them don't think they're nonsensical.
The arts have a place in society. Tackling real problems like hunger or health do too, arguably more so - they create the space for society to tolerate, if not enjoy art.
But the downside is we have a huge smear of jobs that either don't really matter or only matter for the smallest of moments, existing in this middle ground. I like to think of the travel agent of yesteryear as the perfect example: someone who made a profession of organising your leisure so you don't have to, using questionable industry deals. This individual does not have your consumer interests at heart, because being nice to you is generally not where the profit is.
The only role they actually play is rent seeking.
Efficiency threatens the rent seeking models of right now, but at the same time leads to a Cambrian explosion of new ones.
Yeah, when you take two steps back, ignore IT for a second and look at mankind as a whole, there are hundreds of millions of jobs that could be called nonsensical from certain points of view. We are not above this in any meaningful way; maybe it's just a bit more obvious to the keen eye.
Yet society and the economy keep going and nobody, apart from some academic discussions, really cares. I mean, companies have a 100% incentive to trim fat to raise income, yet they only do the bare minimum.
At this point, I don't think that (truly) AI-informed people believe that AI will replace engineers. But AI tools will likely bring a deep transformation to the workflow of engineers (in a positive and collaborative way).
It may not be tab-tab-tab all the way, but a whole lot more tabs will sneak in.
I think you have that backwards (sort of). The high-tier programmers who can write things AI can't will be worth more, since they'll be more productive, while the programmers below the AI skill floor will see their value drop, since they've been commoditized. We already have a bimodal distribution of salaries for programmers between FAANG and not; this will just exacerbate that.
As somebody who makes extensive use of LLMs, I very much disagree. Large language models are completely incapable of replacing the kind of stuff you pay a developer $200k for. If anything they make that $200k developer even more of a golden goose.
I suspect you're right, but I think it'll follow the COBOL engineer salary cycle, engineers that have a deeper understanding of the whole widget will be in demand when companies remember they need them.
No, I don’t believe you truly know where AI is right now. Tools like Bolt and v0 are essentially end to end development AIs that actually require very little knowledge to get value out of.
If I could sketch out the architecture I wanted as a flow chart annotated with types and structures, implementable by an AI, that would be a revolutionary leap.
I design top-down, component by component, and sometimes the parts don't fit together as I envisioned, so I have to write adapters or - worst case - return to the drawing board. If the AI could predict these mismatches, that would also be helpful.
Unfortunately, I don't think AI is great with only the accidental tasks either.
AI is really good at goldfish programming. It's incredibly smart within its myopic window, but falls apart as it is asked to look farther. The key is to ask for bite-sized things where that myopia doesn't really become a factor. Additionally, you as the user have to consider whether the model has seen similar things in the past, as it's really good at regurgitating variations but struggles with novelty.
Maybe we need better terminology. But AI right now is more like pattern-matching than anything I would label as "understanding", even when it works well.
> Even though Fred Brooks explained why in 1986. There are essential tasks and there are accidental tasks. The tools really only help with the accidental tasks.
I don't know this reference, so I have to ask: Was "accidental" supposed to be "incidental"? Because I don't see how "accidental" makes any sense.
Chapter 16 is named "No Silver Bullet—Essence and Accident in Software Engineering."
I'll type out the beginning of the abstract at the beginning of the chapter here:
"All software construction involves essential tasks, the fashioning of the complex conceptual structures that compose the abstract software entity, and accidental tasks, the representation of these abstract entities in programming languages and the mapping of these onto machine languages within space and speed constraints. Most of the big past gains in software productivity have come from removing artificial barriers that have made the accidental tasks inordinately hard, such as severe hardware constraints, awkward programming languages, lack of machine time. How much of what software engineers now do is still devoted to the accidental, as opposed to the essential? Unless it is more than 9/10 of all effort, shrinking all the accidental activities to zero time will not give an order of magnitude improvement."
From the abstract that definitely sounds like he meant "incidental": Something that's a necessary consequence of previous work and / or the necessary but simpler part of the work.
Brooks makes reference to this at some point in a later edition of the book, and about the confusion the word choice caused.
By accidental, he means "non-fundamental complexity". If you express a simple idea in a complex way, the accidental complexity of what you said will be high, because what you said was complex. But the essential complexity is low, because the idea is simple.
Anniversary edition, p182.
"... let us examine its difficulties. Following Aristotle, I divide them into essence - the difficulties inherent in the nature of the software - and accidents - those difficulties that today attends its production but that are not inherent"
I wonder why people no longer write technical books with this level of erudition and insight; all I see is "React for dummies" and "Mastering AI in Python" stuff (which are useful things, but not timeless)
I'm actually writing a book right now, Effective Visualization, and I'll explain why. It is a book focused on Matplotlib and Pandas.
I have almost a dozen viz books. Some written over 50 years ago.
While they impart knowledge, I want the knowledge but also the application. I'm going to go out and paint that bike shed. You can go read Tufte or "Show me the Numbers" but I will show you how to get the results.
Right there is your problem. Read the Mythical Man-Month and Design of Design. They are not long books and it's material that's hard to find elsewhere. Old rat tacit knowledge.
Buy and read the book. There is a reason the 25th anniversary edition has been in print for more than 30 years. It is a timeless computer book that everyone should read and keep on their bookshelf.
Same with 4GLs, Visual Coding, CASE tools, even Rails and the rest of the opinionated web tools.
How many of those things were envisioned by futurists or great authors? This AI stuff is the stuff of dreams, and I think it’s unwise to consider it another go around the sun.
Until it's actually AI, and not machine learning masquerading as AI because AI is the sector's marketing pitch, I would strongly hesitate to consider it anything other than a tool.
Yes, a powerful tool, and as powerful tools go, they can re-shape how things get done, but a tool nonetheless, and therefore we must consider what its limits are, which is all OP is getting at; the current and known near-future state suggests we aren't evolving past the tool state.
This AI stuff? No, not really. The stuff of dreams is an AI that you can talk to and interact infinitely and trust that it doesn’t make mistakes. LLMs ain’t it.
Better tech often lowers the barrier for people to do things, but raises the bar of users' (and, for contract projects, stakeholders') expectations. It is plainly visible in web development, where the amount of tooling (both frontend and backend) has grown dramatically.
Like, for example, all the big-data stuff we do today was unthinkable 10 years ago, today every mid-sized company has a data team. 15 years ago all data in a single monolithic relational database was the norm, all you needed to know was SQL and some Java/C#/PHP and some HTML to get some data wired up into queries.
The most valuable thing I want AI to do with regards to coding is to have it write all the unit tests and get me to 100% code coverage. The data variance and combinatorics needed to construct all the meaningful tests is sometimes laborious, which means it doesn't get done (us coders are lazy...). That is what I want AI to do: all the mind-numbing, draining work, so I can focus more on the system.
Not necessarily. I have used LLMs to write unit tests based on the intent of the code and have it catch bugs. This is for relatively simple cases of course, but there's no reason why this can't scale up in the future.
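A toy illustration of what that looks like in practice (hypothetical function, not from any real codebase): the name and docstring carry the intent, and a test written from that intent, rather than from the implementation, exposes the bug.

    # hypothetical buggy function an LLM might be handed
    def median(values):
        """Return the median of a non-empty list of numbers."""
        ordered = sorted(values)
        return ordered[len(ordered) // 2]  # bug: ignores the even-length case

    # test derived from the stated intent, not from the code above
    def test_median_of_even_length_list():
        assert median([1, 2, 3, 4]) == 2.5  # fails, surfacing the bug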
LLMs absolutely can "detect intent" and correct buggy code. e.g., "this code appears to be trying to foo a bar, but it has a bug..."
How do you expect AI to write unit tests if it doesn't know the precise desired semantics (specification)?
What I personally would like AI to do would be to refactor the program so it would be shorter/clearer, without changing its semantics. Then, I (human) could easily review what it does, whether it conforms to the specification. (For example, rewrite a C program to give exactly the same output, but as Python code.)
In cases where there is a peculiar difference between the desired semantics and real semantics, this would become apparent as additional complexity in the refactored program. For example, there might be a subtle semantic differences between C and Python library functions. If the refactored program would use a custom reimplementation of C function instead of the Python function, it would indicate that the difference matters for the program semantics, and needs to be somehow further specified, or it can be a bug in one of the implementations.
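A small made-up example of that last point: C's atoi skips leading whitespace, stops at the first non-digit, and returns 0 on garbage, while Python's int() raises, so a faithful rewrite drags in a custom helper, and that extra code is exactly the signal that the semantic difference matters.

    # faithful port of C's atoi semantics rather than a call to Python's int()
    def c_atoi(s: str) -> int:
        i, sign, total = 0, 1, 0
        while i < len(s) and s[i].isspace():      # atoi skips leading whitespace
            i += 1
        if i < len(s) and s[i] in "+-":
            sign = -1 if s[i] == "-" else 1
            i += 1
        while i < len(s) and s[i].isdigit():      # stops at the first non-digit
            total = total * 10 + int(s[i])
            i += 1
        return sign * total                       # garbage yields 0, no exception

    assert c_atoi("42abc") == 42    # int("42abc") would raise ValueError
    assert c_atoi("junk") == 0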
I've been having good results having AI "color in" the areas that I might otherwise skimp on like that, at least in a first pass at a project: really robust fixtures and mocks in tests (that I'm no longer concerned will be dead weight as the system changes because they can pretty effectively be automatically updated), graceful error handling and messaging for edgier edge cases, admin views for things that might have only had a cli, etc.
Are we not already more or less there? It is not perfect, to be sure, but LLMs will get you pretty close if you have the documentation to validate what it produces. However, I'm not sure that removes the tedium the parent speaks of when writing tests. Testing is not widely done because it is not particularly fun having to think through the solution up front. As the parent alludes to, many developers want to noodle around with their ideas in the implementation, having no particular focus on what they want to accomplish until they are already in the thick of it.
Mind you, when you treat the implementation as the documentation, it questions what you need testing for?
>AI is a fabulous tool that is way more flexible than previous attempts because I can just talk to it in English
In an era when UIs become ever more Hieroglyphic(tm), Aesthetical(tm), and Nouveau(tm), "AI" revolutionizing and redefining the whole concept of interacting with computers as "Just speak Human." is a wild breath of fresh air.
Programming and interacting with computers in general is just translation to a more restricted and precise language. And that's what makes them more efficient. Speaking human is just going the other way and losing productivity.
It's akin to how everyone can build a shelter, but building a house requires more specialized knowledge. The cost of the latter is training time to understand stuff. The cost of programming is also training time to understand how stuff works and how to manipulate it.
An inefficient computer you can use is more productive than an efficient computer you can't use.
Most people can't use mice or keyboards with speed, touchscreens are marginally better except all the "gestures" are unnatural as hell, and programming is pig latin.
Mice and keyboards and programming languages and all the esoteric ways of communicating with computers came about simply because we couldn't just talk Human to them. Democratizing access to computers is a very good and very productive thing.
That's the thing. You don't communicate with computers. You use them. You have a task to do that the computer has been programmed for, and what you want is to get the parameters of that task to the computer. And you learn how to use the computer because the task is worth it, just like you learn how to play a game because you enjoy the time doing it. The task supersedes the tool.
Generative AI can be thought of as an interface to the tool, but it's been proven that it is unreliable. And as the article outlines, if it can get to 70% of the task but you don't have the knowledge required to complete it, that's pretty much the same as 0%. And if you have the knowledge, more often than not you realize that it just goes faster on a zigzag instead of the straight route you would have taken with more conventional tools.
The first lead I worked with inoculated me to this. He taught me about hype trains long before the idea was formalized. He’d been around for the previous AI hype cycle and told me to expect this one to go the same. Which it did, and rather spectacularly. That was three cycles ago now and while I have promised myself I will check out the next cycle, because I actually do feel like maybe next time they’ll build systems that can answer why not just how, this one is a snooze fest I don’t need to get myself involved in.
Just be careful you don't let your pendulum swing too much in the other direction, where you turn into an old curmudgeon that doesn't get excited by anything and thinks nothing is novel or groundbreaking anymore.
AI is a potential silver bullet since it can address the "essential complexity" that Fred Brooks said regular programming improvements couldn't address. It may not yet have caused an "order of magnitude" improvement in overall software development, but it has caused that improvement in certain areas, and that will spread over time.
> The tools really only help with the accidental tasks
I don't think that's really the problem with using LLMs for coding, although it depends on how you define "accidental". I suppose if we take the opposite of "essential" (the core architecture, planned to solve the problem) to be boilerplate (stuff that needs to be done as part of a solution, but doesn't itself really define the solution), then it does apply.
It's interesting/amusing that on the surface a coding assistant is one of the things that LLMs appear better suited for, and they are suited for, as far as boilerplate generation goes (essentially automated stack overflow, and similar-project, cut and pasting)... But, in reality, it is one of the things LLMs are LEAST suited for, given that once you move beyond boilerplate/accidental code, the key skills needed for software design/development are reasoning/planning, as well as experienced-based ("inference time") learning to progress at the craft, which are two of the most fundamental shortcomings of LLMs that no amount of scale can fix.
So, yeah, maybe they can sometimes generate 70% of the code, but it's the easy/boilerplate 70% of the code, not the 30% that defines the architecture of the solution.
Of course it's trendy to call LLMs "AI" at the moment, just as previous GOFAI attempts at AI (e.g. symbolic problem solvers like SOAR, expert systems like CYC) were called "AI" until their limitations became more apparent. You'll know we're one step closer to AI/AGI when LLMs are in the rear view mirror and back to just being called LLMs again!
Other options are available, for instance ploughing into a village because your second stage didn't light, or, well, this: https://youtu.be/mTmb3Cqb2qw?t=16
Most of the "you'll never need programmers again!" things have ended up more "cars-showered-with-chunks-of-flaming-HTPB" than "accidentally-land-on-moon", tbh. 4GLs had an anomaly, and now we don't talk about them anymore.
(It's a terrible adage, really. "Oops, the obviously impossible thing didn't happen, but an unrelated good thing did" just doesn't happen that often, and when it does there's rarely a causal relation between A and B.)
AI totally is a silver bullet. If you don't think so, you're just using it wrong and it's your fault. If you think that it takes you just as long or longer to constantly double-check everything it does, then you don't understand the P vs NP problem. </sarcasm>
Hardly, if you worked with the web in the mid 90’s, modern tooling is a much larger improvement than what LLMs bring to the table on their own. Of course they aren’t on their own, people are leveraging generations of improvements and then stacking yet another boost on top of them.
Programming today is literally hundreds of times more productive than in 1950. It doesn’t feel that way because of scope creep, but imagine someone trying to create a modern AAA game using only assembly and nothing else. C didn’t show up until the 70’s, and even Fortran was a late 50’s invention. Go far enough back and people would set toggle switches and insert commands that way no keyboards whatsoever.
Move forward to the 1960s and people coded on stacks of punch cards and would need to wait overnight for access to a compiler. So just imagine the productivity boost of a text editor and a compiler. I'm not talking about an IDE with syntax checks etc.; just a simple text editor was a huge step up.
Well, even with more primitive tools, people would create an abstraction of their own for the game; even in very old games you will find some rudimentary scripting languages and abstractions.
Yes, that's the point. You needed to do this (accidental) work in order to do what you actually wanted to achieve. Hence there was less time spent on the actual (~business) problem, and hence the whole thing was less productive.
Oh I disagree. Like the GP, I’ve been round the block too. And there’s entire areas of computing that we take for granted as being code free now but that used to require technical expertise.
Django/Rails-like platforms revolutionised programming for the web, people take web frameworks for granted now but it wasn't always like that.
And PHP (the programming language) just before that, that was a huge change in "democratising" programming and making it easier, we wouldn't have had the web of the last 20-25 years without PHP.
From what I have seen, LLMs are the worst (by far) in terms of gained productivity. I'd rate simple but type-correct autocomplete higher than what I get from the "AI" (code that makes little sense and/or doesn't compile).
Supermaven recently suggested that I comment a new file with “This file belongs to {competitor’s URL}.” So, it’s definitely not at the point you can just blindly follow it.
That said, it's a really nice tool. AI will probably be part of most developers' toolkits moving forward, the way LSP and basic IDE features are.
I wish my IDE would type-correct the LLM. When the function doesn't exist, look for one with a similar name (often the case is different or some other small thing), and also show me the parameter options, because the LLM never gets the order right and often skips one.
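A rough sketch of the kind of check I'd want, using Python's difflib to suggest the name the LLM probably meant (the module and names here are purely illustrative):

    import difflib
    import math  # stand-in for whatever module the LLM is calling into

    def suggest_name(wanted, module):
        """If `wanted` doesn't exist on `module`, return the closest real name."""
        names = dir(module)
        if wanted in names:
            return wanted
        close = difflib.get_close_matches(wanted, names, n=1, cutoff=0.6)
        return close[0] if close else None

    print(suggest_name("arctan", math))  # -> 'atan', the call the LLM probably meant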
Going from punched cards to interactive terminals surely must have been a big productivity boost. And going from text based CAD to what is possible on modern workstations has probably also helped a bit in that field.
In that view I'd say the productivity boost by LLMs is somewhat disappointing, especially with respect to how amazing they are.
I think the field is too new and the successful stories too private atm. However I think the best apples to apples example in this context is Amz's codebase update project that they've blogged about.
From memory, they took some old Java projects and had some LLM-driven "agents" update the codebase to recent Java. I don't know Java well enough to know how "hard" this task is, but asking around I've heard that "analog" tools for this exist, but aren't that good, bork often, are hardcoded and so on.
Amz reported ~70% of the code that came out passed code review; presumably the rest had to be tweaked by humans. I don't know if there are any "classical" tools that can do that ootb. So yeah, that's already impressive and "available today" so to speak.
Java is intent as code. It’s so verbose that you have to use an IDE to not go crazy with all the typings. And when using an IDE, you autocomplete more than you type because of all the information that exists in the code
Quantifying programmer productivity has been a problem since its inception. Lines of code is a terrible metric. So are Jira ticket points. I can tell you that using an LLM, I can make a Chrome extension to put a div that says "hello world" at the top of every webpage far quicker than if I had to read the specifications of extension manifests and do it manually, but how do you quantify that generically? How do you quantify that against the wasted time when it doesn't understand some nuance of what I'm asking it to do, or when it gets confused about something and goes in circles?
The problem is not what AI can do; rather, most people in the workforce don't know how to use the current generation of AI. Only as the children that grew up using ChatGPT etc. get into the workforce will we see the real benefits of AI.
Oh yeah, the "digital native" myth. I'm not convinced children using ChatGPT to do their homework will actually make them more productive workers. More likely it's going to have the opposite effect, as they're going to lack deeper understanding that you can build only through doing the homework yourself.
Really it's not about just using technology, but how you use it. Lots of adults expected kids with smartphones to be generally good with technology, but that's not what we're witnessing now. It turns out browsing TikTok and Snapchat doesn't teach you much about things like file system, text editing, spreadsheets, skills that you actually need as a typical office worker.
That's different from what I'm talking about; it's the problem of inertia: people already in jobs are used to doing them in a particular way. New, curiosity-driven people entering the workforce would optimize a lot of office work. A 10-12 year old that has learned how to use AI from the very start will be using an AI that has had 12-15 years of incremental improvements by the time he or she gets into the workforce.
A lot of people here on Hacker News disparage newer generations. But how many of you can run a tube-based or punch-card-based computer? So if you don't know, are you an idiot?
All the pieces are there; we just need to decide to do it. Today's AI is able to produce an increasingly tangled mess of code. But it's also able to reorganize the code. It's also capable of writing test code and assessing the quality of the code. It's also capable of making architectural decisions.
Today's AI code is more like a Frankenstein's composition. But with the right prompt OODA loop and quality-assessment rigor, it boils down to just having to sort and clean the junk pile faster than you produce it.
Once you have a coherent, unified codebase, things get fast quickly; capabilities grow exponentially with the number of lines of code. Think of things like Julia Language or Wolfram Language.
Once you have a well-written library or package, you are more than 95% there and you almost don't need AI to do the things you want to do.
There is a huge gap in performance and reliability in control systems between open-loop and closed-loop.
You've got to bite the bullet at one point and make the transition from open-loop to closed-loop. There is a compute cost associated to it, and there is also a tuning cost, so it's not all silver lining.
>Once you have a coherent unified codebase, things get fast quickly, capabilities grows exponentially with the number of lines of code. Think of things like Julia Language or Wolfram Language.
>Once you have a well written library or package, you are more than 95% there and you almost don't need AI to do the things you want to do.
That's an idealistic view. Packages are leaky abstractions that make assumptions for you. Even stuff like base language libraries - there are plenty of scenarios where people avoid them - they work for 9x% of cases but there are cases where they don't - and this is the most fundamental primitive in a language. Even languages are leaky abstractions with their own assumptions and implications.
And these are the abstractions we had decades of experience writing, across the entire industry, and for fairly fundamental stuff. Expecting that level of quality in higher level layers is just not realistic.
I mean just go look at ERP software (vomit warning) - and that industry is worth billions.