Quick example, I wanted to make a clickable visible piano keyboard. I told it I was using Vue and asked for the HTML/CSS to do this (after looking at several github and other examples that looked fairly complicated). It spat out code that worked out of the box in about 1m.
I gave it a package.json file that got messed up with many dependencies versions being off from each other, it immediately fixed them up.
I asked it to give me a specific way using BigQuery SQL to delete duplicate rows while avoiding a certain function, again, 1 minute, done.
I have given it broken code and said "this is the error" and it immediately highlights the error and shows a fix.
But given a large project, does any of that really average out to a multiple? Or is just a nice to have? This is what I keep encountering. It's all very focused on boilerplate or templating or small functions or utility. But at some point of code complexity, what is the % of time saved really?
Especially considering the amount of ai babysitting and verification required. AI code obviously cannot be trusted even if it "works."
I watched the video and there wasn't anything new compared to how I used Copilot and ChatGPT for over a year. I stopped because I realized eventually got in the way, and I felt it was preventing me from building the mental model and muscle memory that the early drudge work of a project requires.
I still map ai code completion to ctrl-; but I find myself hardly ever calling it up.
(For the record, I have 25+ years professional experience)
20+ years here. I've been exploring different ways of using them, and this is why I recommend that people start exploring now. IntelliJ with Copilot was an interesting start, but ended up being a bit of a toy. My early breakthrough was asking chat interfaces to make something small, then refine it over a conversation. It's very fast until it starts returning snippets that you need to fold into the previous outputs.
When Claude 3.5 came out with a long context length, you could start pasting a few files in, or have it break the project into a few files, and it would still produce consistent edits across them. Then I put some coins in the API sides of the chat models and started using Zed. Zed lets you select part of a file and specify a prompt, then it diffs the result over the selection and prompts to confirm the replace. This makes it much easier to validate the changes. There's also a chat panel where you can use /commands to specify which files should be included in the chat context. Some of my co-workers have been pushing Cursor as being even more productive. I like open source and so haven't used Cursor yet, but their descriptions of language-aware context are compelling.
The catch is that, whatever you use, it's going to get much better, for free. We haven't seen that since the 90's, so it's easy to brush it off, but models are getting better and there isn't a fkattening trend yet.
So I stand behind my original statement: this time is different. Do yourself a favor and get your hands dirty.
I'm 20+ years here and want to just call out it's a bit of a "moving goal posts with "but given a large project" comment.
I do think it is helpful in large projects, but much less so. I think the other comment gives a good example of how it can be useful, and it seems fairly obvious as context sizes are increasing exponentially in a short amount of time that it will be able to deal with large projects soon.
When using it in larger projects, I'm typically manipulating specific functions or single pages at a time and use a diff tool, so it comes across more as PR that I need to verify or tweak.
Sure, but advocates are talking about multiples of increased productivity, and how can that be defended if it doesn't scale to a project of some size and complexity? I don't care if I get a Nx increase on a small script or the beginning of a project. That's never been the pain point in development for me.
If someone said that, over a significant amount of time and effort, these tools saved them 5% or maybe even 10% then I would say that seems reasonable. But those aren't the kinds of numbers advocates are claiming. And even then, I'd argue that 5-10% comes with a cost in other areas.
And again, not to belabor the point, but where are the in-depth workflows published for senior engineers to get these productivity increases? Not short YouTube videos, but long form books and playlists and tutorials that we can use to replicate and verify the results?
Don't you think that's a little suspect that we haven't been flooded with them like we are with every other new technology?
No, I don't think it's suspect, because it seems like you're looking at a narrow-focus of productivity increase.
"advocates are talking about multiples of increased productivity", some are, some are not, and I don't think most people are, but sure, there's a lot of media hype.
It seems like the argument is akin to many generalized internet arguments "these [vague people] say [generality] about [other thing]".
There are places that I do think that it can make significant, multiples of difference, but it's in short spurts. Taking over code bases, learning a new language, non-tech co-founders can get started without a tech co-founder. I think it's the Jr Engineers that have a greater chance of being replaced, not the sr engineer.
Most recently I use Claude 3.5 projects with this workflow: https://www.youtube.com/watch?v=zNkw5K2W8AQ
Quick example, I wanted to make a clickable visible piano keyboard. I told it I was using Vue and asked for the HTML/CSS to do this (after looking at several github and other examples that looked fairly complicated). It spat out code that worked out of the box in about 1m.
I gave it a package.json file that got messed up with many dependencies versions being off from each other, it immediately fixed them up.
I asked it to give me a specific way using BigQuery SQL to delete duplicate rows while avoiding a certain function, again, 1 minute, done.
I have given it broken code and said "this is the error" and it immediately highlights the error and shows a fix.