
“As a designer…”

IMHO the bleeding edge of what’s working well with LLMs is within software engineering because we’re building for ourselves, first.

Claude Code is incredible. Where I work, there's a huge number of custom agents that integrate with our internal tooling. Many make me very productive and are worthwhile.

I find it hard to buy into the opinions of non-SWEs on the uselessness of AI, solely because I think the innovation is lagging in other areas. I don't doubt that they don't yet have compelling AI tooling.



I'm a SWE, DBA, SysAdmin, I work up and down the stack as needed. I'm not using LLMs at all. I really haven't tried them. I'm waiting for the dust to settle and clear "best practices" to emerge. I am sure that these tools are here to stay but I am also confident they are not in their final form today. I've seen too many hype trains in my career to still be jumping on them at the first stop.


It's time to jump on the train. I'm a cranky, old, embedded SWE and Claude 4.5 is changing how I work. Before that I laughed off LLMs. They were trash. Claude still has issues, but damn, I think if I don't integrate it into my workflow I'll be out of work or relegated to working in QA or devops (where I'd likely be forced to use it).

No, it's not going to write all your code for you. Yes, your skills are still needed to design, debug, and perform teamwork (selling your designs, building consensus, etc.). But it's time to get on the train.


The moment your code departs from typical patterns in the training set (or "agentic environment"), LLMs fall over at best (i.e. they can't even find the thing) or do some random nonsense at worst.

IMO LLMs are still at the point where they require significant handholding: showing them exactly what to do, and exactly where. Otherwise, it's constant review of the random application of different patterns, which may or may not satisfy your requirements, goals, and invariants.


A contrarian opinion: I also haven't jumped on the train yet; it's even forbidden in our part of the company due to various regulations around data secrecy and generally slow adoption.

Also, for most seasoned developers, actual dev activity is a minuscule part of the overall effort. If you are churning out code like some sweatshop every single day at, say, 45, it's by your own choice: either you don't want to progress in your career, or your career didn't push you up on its own.

What I want to say is that the minuscule part of the day when I actually get my hands on the code is the best part. Pure creativity, puzzle solving, learning new stuff (or relearning it when looking at old code). Why the heck would I want to lose or dilute this, let alone run towards that? It would make sense if my performance were rated only on code output, but it's not... that would be a pretty toxic place, to put it politely.

Seniority doesn't come from churning out code quicker. It's more along the lines of communication, leading others, empathy, toughness when needed, not avoiding uncomfortable situations or discussions, and so on. No room for LLMs there.


There have been times when something was very important and my ability to churn out quick proof of concept code (pre AI) made the difference. It has catapulted me. I thought talking was all-important, but it turns out there's already too much talk and not enough action in these working groups.

So now with AI, that's even quicker. And I can do it more easily during the half-relevant parts of meetings, which I have a lot more of nowadays. When I have real time to sit and code, I focus on the hardest and most interesting parts, which the AI can't do.


> ability to churn out quick proof of concept code (pre AI) made the difference. It has catapulted me. I thought talking was all-important

It is always the talking that transitions "here's a quick proof of concept" into "someone else will implement this fully and then maintain it". One cannot be catapulted if one cannot offload the implementation and maintenance. Get stuck with two quick proof-of-concept ideas and you're already at full capacity. One either talks their way into having a team supporting them, or they find themselves on a PIP with the regular backlog piling up.


Oh they hired some guys to productionize it, who did it manually back then but now delegate a lot of it to AI.


Your comment is basically paraphrasing my responses to older threads on HN telling me I needed to use vibe coding.

Most of my day isn't coding. But sometimes it is. On those days, AI helps me get back to doing the important stuff. Sure, I like solving problems and writing code, but where I add value to my company is in bringing solutions to users and getting them into production.

I'm a systems/embedded engineer. In my 30 years of being employed I've written very little code, relatively speaking. I am not a code monkey cranking out thousands of lines per day or even per week. AI is like having an on-demand intern who can do that if I need it, however. I basically gained an employee for free. AI also saves me time debugging because, look, I'm old and I really don't write all that much code. I mess up syntax sometimes. I can't remember some stupid C++ rule or best practice sometimes. Now I don't have to read a book or google it.

AI is letting me put my experience and intuition to work much more efficiently than ever before and it's pretty cool.


While I think that is true, and this thread is about the productivity of seniors switching to LLMs, I can say from my experience that our juniors absolutely crush it using LLMs. They have to do pretty demanding and advanced stuff from the start, and they use LLMs nonstop. Not sure how that translates into long term learning, but it definitely increases their output and makes them competent-enough developers to contribute right away.


> Not sure how that translates into long term learning

I don't think that's a relevant metric: the "learning" rate of humans versus LLMs. If you expect typical LLMs to grow from juniors into competent mids and maybe even seniors faster than a typical human does, then there is little point in learning to write code; better to learn "software engineering with an artificial code monkey". However, if that turns out not to be true, we have just broken the pipeline that produces actual mids and seniors, who can actually oversee the LLMs.


> Seniority doesn't come from churning out code quicker. It's more along the lines of communication, leading others, empathy, toughness when needed, not avoiding uncomfortable situations or discussions, and so on. No room for LLMs there.

They might be poor at it, but if you do everything you specified online and through a computer, then it's in an LLM's domain. If we hadn't pushed so hard for work from home, it might be a different story. LLMs are poor on soft skills, but is that inherent or just a problem that can be refined away? I don't know.


> What I want to say is that the minuscule part of the day when I actually get my hands on the code is the best part.

And if you are not "churning code like some sweatshop every single day", those hours are not "hey, let's bang out something cool!"; they're more like "here are 5 reasons we can't do the cool thing, young padawan".


This has not been my experience as of late. If anything, they help steer us back on track when a SWE decides to be clever and introduce a footgun.


Same, I have Gemini Pro 2.5 (now 3) exclusively implementing new designs that don't exist in the world and it's great at it! I do all the design work, it writes the code (and tests) and debugs the thing.

I'm happy, it's happy, I've never been more productive.

The longer I do this, the more likely it is to one-shot things across 5-10 files with testing passing on the first try.


> The moment your code departs from typical patterns

Using AI, I constantly realize that atypical patterns are much rarer than I thought.


Yeah but don't let it prevent you from making a novel change just because no one else seems to be doing it. That's where innovation sleeps.


I meant: even if I do somewhat "advanced" stuff that not many people do, much of it is pretty common.

Humbling.


I don't think anyone disagrees with that. But it's a good time to learn now, to jump on the train and follow the progress.

It will give the developer a leg up in the future when the mature tools are ready. Just like the people who surfed the 90s internet seem to do better with advanced technology than the youngsters who've only seen the latest sleek modern GUI tools and apps of today.


Quite frankly, the majority of code is improved by integrating some sort of pattern, and an LLM is great at bringing the pattern you may not have realized you were making to the forefront.

I think there's an obsession, especially among more veteran SWEs, with thinking they are creating something one-of-a-kind and special, when in reality we're just iterating over the same patterns.


This has been true since Claude Sonnet 3.5, so for over a year now. I was early on the LLM train, building RAG tools and prototypes at the company I was working at at the time, but pre-Claude-3.5 all the models were just a complete waste of time for coding, except that the inline autocomplete models saved you some typing.

Claude 3.5 was actually the point where it could generate simple stuff. Progress has kind of tapered off since, though. Claude is still the best, but Sonnet 4.5 is disappointing in that it doesn't fundamentally bring me more than 3.5 did; it's just a bit better at execution. I still can't delegate higher-level problems to it.

Top tier models are sometimes surprisingly good but they take forever.


This was true since ChatGPT-1, and I mean the lies.


Not really - 3.5 was the first model where I could actually use it to vibe through CRUD without it wasting more time than it saved; I actually used it to deliver an MVP on a side gig I was working on. GPT-4 was nowhere near as useful at the time. And Sonnet 3 was also considerably worse.

And from reading through the forums and talking to co-workers this was a common experience.


Up until using Claude 4.5, I had very poor experiences with C/C++. Sure, bash and Python worked OK, albeit with the occasional hallucination. ChatGPT-5 did OK with C/C++ and fairly well with Python (again having issues with omitting code during iterations, requiring me to yell at it a lot). Claude 4.5 just works, and it's crazy good.


It's just not true, it is not ready.

Especially Claude, where if you check the forums, everyone is complaining that it's gone stupid over the last few months.

Claude's code is all over the place, and if you can't see that and are putting its code into production, I pity your colleagues.

Try stopping. Honestly, just try. Just use Claude as a super search engine. Though right now ChatGPT is better.

You won't see any drop in productivity.


It's not about blindly accepting autogenerated code. It's about using them for tooling integration.

It's like terminal autocomplete on steroids. Everything around the code is blazing fast.


This is far too simplistic a viewpoint. First of all, it depends on what you're trying to do. Web dev? AI works pretty well. CPU design? Yeah, good luck with that.

Secondly, it depends on what you're using it for within web dev. One-shot an entire app? I did that recently for a Chrome extension, and while it got many things wrong that I had to learn about and fix, it was still waaaaaay faster than doing it myself. Especially for solving stupid JS ecosystem bugs.

Nobody sane is suggesting you just generate code and put it straight into production. It isn't ready for that. It is ready for saving you a ton of time if you use it wisely.


I'd say it was pretty nuanced. Use it, but don't vibe code. The crux of the issue is that unless you're still writing the code, it's too hard to notice when Claude or Codex makes a mountain out of a molehill, too easy to miss the subtle bugs, too easy to miss the easy abstractions that would have vastly simplified the code.

And I do web dev; the code is rubbish. It's actually got subtle problems, even though it fails less. It often munges together loads of old APIs or deprecated ways of doing things. God forbid you need to deal with something like React Router or MUI, as it will combine code from several different versions.

And yes, people are using these tools to directly put code in. I see devs DOING it. The code sucks.

Vibe coded PRs are a huge timesink that OTHER people end up fixing.

One guy let it run and it changed code in an entirely unrelated part of the system, and he didn't even notice. Worse, when scanning the PR it looked reasonable, until I went to fix a 350-line service Claude or Codex had puked out that could have been rewritten in 20 lines, and realized the code files were in an entirely different search system.

They're also both generally terrible at abstracting code, so you end up with tons of code that does sweet FA, over and over. And the constant over-engineering and exception-handling theatre makes it look like it's written a lot of code when it's basically turned what should be a 5-liner into an essay.

Ugh. This is like coding in the FactoryFactoryFactory days all over again.


I don't use AI for anything except translation and search, and I'd say 3 times out of 10 it gives me bad information, while translation only works OK if you use the most expensive models.


I am an old SWE, and Claude 4.5 has not changed a thing about how we work.

The teams that have embraced AI in their workflow have not increased their output compared with the ones that don't use it.


I'm like you. I'd say my productivity improved by 5-10%: Claude can make surprisingly good code edits. For these, my subjective feeling is that Claude does in 30 minutes what I'd have done in an hour. It's a net gain. Now, my job is about communicating, understanding problems, learning, etc., so my overall productivity is not changing dramatically, but for things related to code, it's a net 5-10%.


Which is where the AI investment disconnect is scary.

AI Companies have invested a crazy amount of money into a small productivity gain for their customers.

If AI was replacing developers it wouldn’t cost me $20-100/month to get a subscription.


If I had something that could replace developers, I would not sell it.

I would get all the government IT contracts and make billions in a few months.

Nobody does it because LLMs are a fucking scam, like crypto, and I am tired of pretending they're not.


Two things I solidly recommend it for are helper scripts and test code. Writing boilerplate tests is so much easier with it. I recently vibe coded an entire text-UI frontend to Perforce, similar to tig for Git, and it took about 2 hours. Yeah, I'm not ready to use it for our driver code yet, but I would use it to check my work.
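
For a flavour of what I mean by boilerplate tests, here's a minimal sketch of the kind of repetitive scaffolding it cranks out reliably. It's TypeScript/Vitest purely for illustration (nothing to do with our actual driver code), and the `RingBuffer` module and its API are made up:

    import { describe, it, expect, beforeEach } from 'vitest';
    // Hypothetical module under test; the name and API are illustrative only.
    import { RingBuffer } from './ring-buffer';

    describe('RingBuffer', () => {
      let buf: RingBuffer<number>;

      beforeEach(() => {
        // Fresh buffer with capacity 4 before every test.
        buf = new RingBuffer<number>(4);
      });

      it('starts empty', () => {
        expect(buf.size()).toBe(0);
      });

      it('pops elements in FIFO order', () => {
        buf.push(7);
        buf.push(8);
        expect(buf.pop()).toBe(7);
        expect(buf.pop()).toBe(8);
      });

      it('overwrites the oldest element when full', () => {
        [1, 2, 3, 4, 5].forEach((n) => buf.push(n));
        expect(buf.toArray()).toEqual([2, 3, 4, 5]);
      });
    });

None of that is hard; it's just tedious to type and easy to review, which is exactly the work I want to hand off.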


What you're doing there with Perforce sounds like procrastination to me.

https://xkcd.com/1205/

Good for you, though.


I'm a SWE and also an art director. I have tried these tools and, the same way I've also tried Vue and React, I think they're good enough for simple-minded applications. It's worth the penny to try them and look through the binoculars, if only to see how unoriginal and creatively limited what most people in your field are actually doing must be, if they find this something that saves them time.


What a condescending take.


Why would you wait for the dust to settle? Just curious. Productivity gains are real in the current form of LLMs. Guardrails and best practices can be learned and self-imposed.


> Productivity gains are real in the current form of LLMs

I haven't found that to be true

I'm of the opinion that anyone who is impressed by the code these things produce is a hack


I just started on a project; they fired the previous team, and I am positive they used AI. The app is full of bugs, and the client will never hire the old company again.

Whoever says it's time to move to LLMs is clueless.


Humans are very capable of creating bugs. This in itself is not a tell.


"Because one team doesn't know how to use LLMs, I conclude that LLMs are useless."


Can you show any product created by any of those imaginary teams using LLMs?



I am talking about real work. Who is going to pay me to build that?

I have not even seen a CRUD app with real users written using AI tools.


You proved his point bro. That site does not even load.


Shrug, loads fine for me.


That site is on Cloudflare, am I right?

Cloudflare gets blocked in some parts of Europe on the weekend... only the DNS, not really blocked.

Football is more important.


I think there's an issue with IPv6 resolution, but I do think it's a Cloudflare issue, as they're the ones who insert that IPv6 hostname I can't resolve. This doesn't happen with any of my other sites, though, it's odd.


If it works tomorrow you know why.

They block random Cloudflare IPs related to sports streaming sites.


Are you in Spain? I've seen it in Greece, but only on one of my devices.


I am not in Spain. But I am Spanish. I think it gets blocked in Spain and in Italy.


Whenever I hear about productivity gains, I mentally substitute it with "more time left in the day to play video games" to keep the conversation grounded. I would say I'd rather not.


If you have two modes of spending your time, one being work that you only do because you are paid for it, and the other being feeding into an addiction, the conversations you should be having are not about where to use AI.


Your "productivity gains" is just equal to the hours others eventually have to spend cleaning up and fixing what you generated.


I'm surprised these pockets of job security still exist.

Know this: someone is coming after this already.

One day someone from management will hear a cost-saving story at a dinner table; the words GPT, Cursor, Antigravity, reasoning, AGI will cause a buzzing in their ear. Waking up with tinnitus the next morning, they'll instantly schedule a 1:1 to discuss "the degree of AI use and automation".


> Know this: someone is coming after this already.

Yesterday, GitHub Copilot declared that my less-AI-weary friend's new Laravel project was following all industry best practices for database design, as it was storing entities as denormalized JSON blobs in a MySQL 8.x database with no FKs, indexes, or constraints, and all-NULL columns (and using root@mysql as the login, of course); meanwhile, all the Laravel controller actions' DB queries were RBAR loops that loaded every row into memory before doing JSON deserialisation in order to filter rows.

I can’t reconcile your attitude with my own personal lived experience of LLMs being utterly wrong 40% of the time; while 50% of the time being no better or faster than if I did things myself; another 5% of the time it gets stuck in a loop debating the existence of the seahorse emoji; and the last 5% of the time genuinely utterly scaring me with a profoundly accurate answer or solution that it produced instantly.

Also, LLMs have yet to demonstrate an ability to tackle other real-world DBA problems… like physically installing a new SSD into the SAN unit in the rack.


Lowballing contracts is nothing new. It has never ever worked out.

You can throw all the AI you want at it, but at the end of the day you get what you pay for.


No harm in running them in isolated instances and seeing what happens.

Feed an LLM stack traces, or ask it to ask you questions about a topic you're unfamiliar with. Give it a rough hypothesis and demand that it poke holes in it. These things it does well. I use Kagi's auto summariser to distil search results into a handful of paragraphs and then read through the citations it gives me.

Know that LLMs will suck up to you and confirm your ideas and make up bonkers things a third of the time.


> I really haven't tried them.

You are doing yourself a huge disservice.


Nothing is in its "final form" today.

I'm a long-time SWE, and in the last week I've made and shipped production changes across around 6 different repos/monorepos, ranging from Python to Golang to Kotlin to TS to Java. I'd consider myself an "expert" in maybe one or two of those codebases, with only passing knowledge of the others.

I'm using AI not to fire-and-forget changes, but to explain and document where I can find certain functionality, generate snippets and boilerplate, and produce test cases for the changes I need. I read and review everything, consider that every line of code I commit has my name against it, and treat it as such.

Without these tools, I'd estimate I'd be around 25% as effective when it comes to getting up to speed on unfamiliar code and services. For that alone, AI tooling is utterly invaluable.


The tools have reached the point where no special knowledge is required to get started. You can get going in 5 minutes. Try Claude Code with an API key (no subscription required). Run it in the terminal in a repo and ask how something works. Then ask it to make a straightforward but tedious change. Etc.


Just download Gemini (no API key) and use it.


I'm in the same position, but I use AI to get a second opinion. Try it using the proper models, like Gemini 3 Pro, which was just released, and include grounding. Don't use the free models; you'll be surprised at how valuable it can be.


I hope I am never this slow to adapt to new technologies.


Right now I see "use AI" as being in the same phase that "add radium" was in shortly after Curie's discovery. A vial of magic pixie dust to sprinkle on things, laden with hidden dangers very few yet understand. But I also keep in mind that radioactivity truly transformed some very unexpected fields.[ß]

AI and LLMs are tools. The best tools tend to be highly focused in their application. I expect AI to eventually find its way to various specific tool uses, but I have no crystal ball to predict what those tools might be or where they will surface. Although I have to say that I have seen, earlier this week, the first genuinely interesting use-case for AI-powered code generation.

A very senior engineer (think: ~40 years of hands-on experience) had joined a company and was frustrated by the lack of integration tests. Unit tests, yes. E2E test suite, yes. Nothing in between. So he wrote a piece of machinery to automatically test integration between a selected number of interacting components, and eventually was happy with the result. But since that covered only a small portion of the stack, he would then have had to replicate that body of work for a whole lot of other pieces - and thought, "I could make AI repeat this chore".

The end result is a set of very specific prompts, constraints, requirements, instructions, and sequences of logical steps that tell one advanced model what to do. One of the instructions is along the lines of "use this body of work I wrote $OVER_THERE as a reference". That the model iteratively builds a set of tests that self-validate the progress certainly helps. The curious insight in the workflow is that once the first model has finished, he points the generated body of work at another advanced model from a different vendor and has that one do an automated code review, again using his original work as reference material. He then feeds that review back to the first model to fix things.
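
Mechanically, the loop is simple. Here's a rough sketch of its shape (not his actual tooling; it assumes the official Anthropic and OpenAI TypeScript SDKs, and the model IDs, prompts, and function names are placeholders):

    import Anthropic from '@anthropic-ai/sdk';
    import OpenAI from 'openai';

    const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
    const openai = new OpenAI();       // reads OPENAI_API_KEY from the environment

    async function askClaude(prompt: string): Promise<string> {
      const msg = await anthropic.messages.create({
        model: 'claude-sonnet-4-5', // placeholder model ID
        max_tokens: 8192,
        messages: [{ role: 'user', content: prompt }],
      });
      const block = msg.content[0];
      return block.type === 'text' ? block.text : '';
    }

    async function reviewWithOtherVendor(generated: string, reference: string): Promise<string> {
      const completion = await openai.chat.completions.create({
        model: 'gpt-5', // placeholder model ID
        messages: [{
          role: 'user',
          content: `Review these generated integration tests against the hand-written reference.\n\nReference:\n${reference}\n\nGenerated:\n${generated}\n\nList concrete problems only.`,
        }],
      });
      return completion.choices[0].message.content ?? '';
    }

    // One round: generate against the reference work, cross-review with a second
    // vendor's model, then feed the review back to the first model for fixes.
    export async function generateAndCrossReview(spec: string, reference: string): Promise<string> {
      const draft = await askClaude(`${spec}\n\nUse this body of work as a reference:\n${reference}`);
      const review = await reviewWithOtherVendor(draft, reference);
      return askClaude(`Here is your previous output:\n\n${draft}\n\nA reviewer raised these issues:\n\n${review}\n\nFix them and return the corrected tests in full.`);
    }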

That means that he still has to do the final review of the results, and tweak/polish parts where the two-headed AI went off the rails. But overall the approach saves quite a lot of time and actually scales pretty much linearly with the size of the codebase and stack depth. To quote his presentation note, "this way AI works as a highly productive junior that can follow good instructions, not as a misguided oracle that comes up with inventive reinterpretations."

He made modern AI repeat his effort, but crucially he had had to do the work at least once to know precisely what constraints would apply. I suspect that eventually we'll see more of these increasingly sophisticated but very narrowly tailored tooling use cases pop up. The best tools are, after all, focused, even surgical.

ß: Who could have predicted in 1900 that radioactive compounds would change fields ranging from medicine to food storage?


AI - like asbestos but radioactive!


How could you not at least try?


I work as an SRE; I have tried LLMs, and they barely work. You're not missing out.

Or, more correctly, they don't work well for my problems or my usage. They can at best answer basic questions, stuff you could look up using a search engine if you knew what to look for. They can also generate code for inspiration, but you'll end up rewriting all of it before you're done. What they can't do is solve your problem start to end. They really do need an RTFM mode, where they will just insult you if your approach or design is plain wrong, or at least let you know that they will now stop helping because you're clearly off the rails.

We need the bubble to pop; it'll be a year or two, as the finance bros aren't done extracting value from the stonks. Once it does, we can focus on what's working and what isn't, and refine the good stuff.

Right now the LLMs are the product, and they can't be; it makes no sense. They need to be embedded within products, either as a built-in feature, e.g. Copilot in Visual Studio, or as plugins, like LSPs.

Clearly others are having more luck with LLMs than I am, and are doing amazing projects, but that sort of illustrates the point: they aren't ready, and we don't have a way to make them universally useful (and here I'm even restricting myself to thinking about coding).


You don't have to jump on the hype train to get something out of it. I started using Claude Code about 4 months back, and I find it really hard to imagine developing without it now. Sure, I'm more of a manager, but the tedious busywork, the most annoying part of programming, is entirely gone. I love it.


> tedious busywork

Can you give some examples, please?


> I'm not using LLMs at all

You’re deliberately disadvantaging yourself by a mile. Give it a go

… the first one’s free ;)


All I see it doing, as a SWE, is limiting the speed at which my co-workers learn and worsening the quality of their output. Finally many are noticing this and using it less...


I recently had a very interesting interaction at a few small startups I freelanced for.

At a 1-year-old company, the only tech person who's been there for more than 3-4 months (the CTO) only really understands a tiny fraction of the codebase and infrastructure, and can't review code anymore. The application's size has blown up tremendously despite it being quite simple. Turnover is crazy, and people rarely stay for more than a couple of months. The team works nights and weekends, and sales is CONSTANTLY complaining about small bugs that take weeks to solve.

The funny thing is that this is an AI company, but I see the CTO constantly asking developers "how much of that code is AI?". Paranoia has set in for him.


> Turnover is crazy, and people rarely stay for more than a couple of months. The team works nights and weekends

Oh, look, you've normalized deviance. All of these things are screaming red flags, the house is burning down around you.


This sounds just like a typical startup or small consultancy drunk on Ruby gems and codegen (scaffolding) back in the Rails heyday.

People who don’t yet have the maturity for the responsibility of their roles, thinking that merely adopting a new technology will make up for not taking care of the processes and the people.


Bingo. The founders have no maturity or sense of responsibility, and believe they "made it" because they got somewhere with AI. Now they're pushing back against AI because they can't understand the app anymore.


Your bosses probably think it's worth it if the outcome is getting rid of the whole host of y'all and replacing you with AWS Elastic-SWE instances. Which is why it's imperative that you maximize AI usage.


They'll be replaced with cheaper humans in Mexico using those Copilot seats; that's much more tangible and obvious, no need to wait for genius-level AI.


So instead of firing me and replacing me with AI, my boss will pay me to use the AI he would've used..?


No one's switching to AI cold turkey. Think of it as training your own, cheaper replacement. SWEs & their line managers develop & test AI workflows, while giving the bosses time to evaluate AI capabilities, then hopefully shrink the headcount as close to 0 as possible without shrinking profits. Right now, it's juniors who're getting squeezed.


I don't think bosses are smart enough to pull this off.


Increasing profits by reducing the cost of doing business isn't a complicated scheme. It's been done thousands of times, over many decades; first with cheaper contractors replacing full-time staff, then offshore labor, and now they are attempting to use AI.


It's not complicated because "bosses" accomplish this by saying "let's reduce the cost of doing business" to someone who actually does whatever is needed.

The value of a boss/manager when there's no employees to do the work is negative. AI won't change this.


> "bosses" accomplish this by saying "let's reduce the cost of doing business" to someone who actually does whatever is needed

I don't know where you work, but in my experience, headcount reductions are strictly a top-down exercise. At best, the bosses may ask line managers who the essential people on their teams are. At worst, they get handed a list of people to fire, and they themselves may get the boot right after.


Yeah, that's the whole point. "Some boss says reduce head count", and that's the extent of the "work" done by that "boss".

Even the execution of "reducing head count" is done by everyone else.


My bosses aren't pushing it at all. The normal cargo-cult temptations have pulled on some fellow SWEs, but it's being pretty effectively pushed back on by its own failings, paired with the SWEs who use it being outperformed by those who don't.

> edit for spelling


I think the question is whether those AI tools make you produce more value. Anecdotally, the AI tools have changed my workflow and allowed me to produce more tools, etc.

They have not necessarily changed the rate at which I produce valuable outputs (yet).


Can you say more about this? What do you mean when you say 'more tools' is not the same as 'valuable outputs'?


There are a thousand "nuisance" problems which matter to me and me alone. AI allows me to bang these out faster and put nice UIs on them. When I'm making an internal tool, there really is no reason not to put a high-quality UX on top. The high-quality UX, or the existence of a tool that only I use, does not mean my value went up - just that I can do work that my boss would otherwise tell me not to do.


personal increase in satisfaction (such as "work that my boss would otherwise tell me not to do") is valuable - even if only to you.

The fact is, value is produced when something can be produced at a fraction of the resources required previously, as long as the cost is borne by the person receiving the end result.


Under this definition, could any tool at all be considered to produce more value?


no - this is a lesson an engineer learns early on. The time spent making the tool may still dwarf the time savings you gain from the tool. I may make tools for problems that only ever occurred or will occur once. That single incident may have occurred before I made the tool.

This also makes it harder to prioritize work in an organization. If work is perceived as "cheap" then it's easy to demand teams prioritize features that will simply never be used. Or to polish single user experiences far beyond what is necessary.


One thing I learned from this is to disregard all attempts at prioritizing based on the output's expected value for the users/business.

We now prioritize based on time complexity, and omg, it changes everything: if we have 10 easy bugfixes and one giant feature to do (random bad-faith example), we do 5 bugfixes and half the feature within a month and get an enormous amount of satisfaction from the users, who would never have accepted doing it that way in the first place. If we had listened, we would have done 75% of the feature and zero bugfixes and had angry users/clients whining that we did nothing all month...

The time spent on dev stuff absolutely matters, and churning out quick stuff quickly provides more joy to the people who pay us. It's a delicate balance.

As for AI, for now it just wastes our time. It always craps out half-correct stuff, so we optimized our time by refusing to use it, and we beat the teams who do use it that way.


Does using the tools increase ROI?


When you use AI to find faults in existing processes, that is value creation (assuming they get fixed, of course).


If you want to steal code, you can take it from GitHub and strip the license. That is what the Markov chains (https://arxiv.org/abs/2410.02724) do.

It's a code laundering machine. Software engineering has a higher number of people who have never created anything by themselves and have no issues with copyright infringement. Other professions still tend to take a broader view. Even unproductive people in other professions may have compunctions about stealing other people's work.


I think that's also because Claude Code (and LLMs) is built by engineers who think of their target audience as engineers; they can only think of the world through their own lenses.

Kind of like how, for the longest time, Google used to be best at finding solutions to programming problems and programming documentation; a Google built by librarians, say, would have had a totally different slant.

Perhaps that's why designers don't see it yet: no designers have built Claude's 'world-view'.


I'm curious if you could share something about custom agents. I love Claude Code and I'm trying to get it into more places in my workflow, so ideas like that would probably be useful.


I've been using Google ADK to create custom agents (fantastic SDK).

With subagents and A2A generally, you should be able to hook any of them into your preferred agentic interface


I’m struggling to see how somebody who’s looking for inspiration in using agents in their coding workflow would glean any value from this comment.


They asked about custom agents; ADK is for building custom agents.

(Agent SDK, not Android)


If you read a little further into the article, the main point is _not_ that AI is useless, but that it's a regular technology rather than AGI god-building. A valuable one, but not infinite growth.


> but that it's a regular technology rather than AGI god-building. A valuable one, but not infinite growth.

AGI is a lot of things, a lot of ever-moving targets, but it's never (under any sane definition) "infinite growth". That's already ASI territory / the singularity and all that stuff. I see more and more people mixing up the two and arguing against ASI being a thing when they're talking about AGI. "Human-level competence" is AGI. Super-human, ever-improving, infinite growth - that's ASI.

If and when we reach AGI is left for everyone to decide. I sometimes like to think about it this way: how many decades would you have to go back before people from that time would say that what we have today is "AGI"?


Sam Altman has been drumming[1] the ASI drum for a while now. I don't think it's a stretch to say that this is the vision he is selling.

[1] - https://ia.samaltman.com/#:~:text=we%20will%20have-,superint...


Once you have AGI, you can presumably automate AI R&D, and it seems to me that the recursive self-improvement that begets ASI isn't that far away from that point.


We already have AGI - it's called humans - and frankly it's no magic bullet for AI progress.

Meta just laid 600 of them off.

All this talk of AGI, ASI, super-intelligence, and recursive self-improvement etc is just undefined masturbatory pipe dreams.

For now it's all about LLMs and agents, and you will not see anything fundamentally new until this approach has been accepted as having reached the point of diminishing returns.

The snake oil salesmen will soon tell you that they've cracked continual learning, but it'll just be memory, and still won't be the AI intern that learns on the job.

Maybe in 5 years we'll see "AlphaThought" that does a better job of reasoning.


Humans aren't really being put to work upgrading the underlying design of their own brains, though. And 5 years is a blink of an eye. My five-year-old will barely even be turning ten years old by then.


Assuming the recursive self-improvement doesn't run into physical hardware limits.

Like how we can theoretically build a spaceship that can accelerate to 99.9999% of c - just a constant 1g acceleration engine with "enough fuel".

Of course the problem is that "enough fuel" = more mass than is available in our solar system.

ASI might have a similar problem.


Where are the products? This site and everywhere else around the internet - X, LinkedIn, and so on - is full of crazy claims, and I have yet to see a product that people need and that actually works. What I'm experiencing is gigantic enshittification everywhere: Windows sucks; web apps are bloated, slow, and uninteresting. Infrastructure goes down even with "memory safe Rust", burning millions and millions of compute on scaffolding stupid stuff. Such a disappointment.


I think ChatGPT itself is an epic product, and Cursor has insane growth and usage. I also think they are both over-hyped and have too high a valuation.


Citing AI software as the only example of how AI benefits software development has a bit of the flavour of self-help books describing how to attain success and fulfillment by taking the example of writing self-help books.

I don’t disagree that these are useful tools, by the way. I just haven’t seen any discernible uptick in general software quality and utility either, nor any economic uptick that should presumably follow from being able to develop software more efficiently.


I made 1500 USD speculating on Nvidia earnings; that's an economic uptick for me!


I agree with everyone else: where is the Microsoft Office competitor created by 2 geeks in a garage with Claude Code? Where is the Exchange replacement created by a company of 20 people?

There are many really lucrative markets that need a fresh approach, and AI doesn't seem to have caused a huge explosion of new software created by upstarts.

Or am I missing something? Where are the consumer-facing software apps developed primarily with AI by smaller companies? I'm excluding big companies because in their case it's impossible to prove the productivity gains; they could be throwing more bodies at the problem and we'd never know.


> Office…Exchange

The challenge in competing with these products is not code. The challenge of competing in lucrative markets that need a fresh approach is also generally not code. So I'm not sure that is a good metric for evaluating LLMs for code generation.


I think the point remains, if someone armed with Claude Code could whip out a feature complete clone of Microsoft Office over the weekend (and by all accounts, even a novice programmer could do this, because of the magnificent greatness of Claude), then why don't they just go ahead and do it? Maybe do a bunch of them: release one under GPL, one under MIT, one under BSD, and a few more sold as proprietary software. Wow, I mean, this should be trivial.


It makes development faster, but not infinitely fast. Faithfully reproducing complex 42-year-old software in one weekend is a stretch no matter how you slice it. Also, AI is cheap, but not free.

I could see it being doable by forking LibreOffice or Calligra Suite as a starting point, although even with AI assistance I'd imagine that it might take anyone not intimately familiar with both LibreOffice (or Calligra) and MS Office longer than a weekend to determine the full scope of the delta between them, much less implement that delta.

But you'd still need someone with sufficient skill (not a novice), maybe several hundred or thousand dollars to burn, and nothing better to do for some amount of time that's probably longer than a weekend. And then that person would need some sort of motivation or incentive to go through with the project. It's plausible, but not a given that this will happen just because useful agentic coding tools exist.


Pick a smaller but impactful project and have 2-3 people working full-time on it for 1 year. Either this tech is truly revolutionary and these 2-3 people are getting at least 50% more done, or it's marginal and what are we even talking about?


There could be many such cases, or maybe only a few. I'm easily multiple times more productive as a result of integrating AI into my workflows; but whether that's broadly the case across the industry, or will become the case as we collectively adapt in coming years, is essentially unfalsifiable.


Cool. So we established that it's not code alone that's needed, it's something else. This means that the people who already had that something else can now bootstrap the coding part much faster than ever before, spend less time looking for capable people, and truly focus on that other part.

So where are they?

We're not asking to evaluate LLM's for code. We're asking to evaluate them as product generators or improvers.


OK, let's ignore competing with them. When will AI just spit out a "home-cooked" version of Office for me so I can toss the real thing in the trash where it belongs? One without the stuff I don't want? When will it be able to give me Word 95 running on my M4 chip just by asking? If I'm going to lose my career, I might as well get something that can give me any software I could possibly want just by asking.

I can go to Wendy's, or I can make my own version of Wendy's at home pretty easily with just a bit more time expended.

The cliff is still too high for software. I could go and write Office from scratch or customize the (shivers) FOSS software out there, but it's not worth the time or effort.


It's not that they failed to compete on other metrics, it's that they don't even have a product to fail to sell.


We had upstarts in the 80s, the 90s, the 2000s and the 2010s. Some game, some website, some social network, some mobile app that blew up. We had many. Not funded by billions.

So, where is that in the 2020s?

Yes, code is a detail (ideas too). It's a platform. It positions itself as the new thing. Does that platform allow upstarts? Or does it consolidate power?


Pick other examples, then.

We have superhuman coding (https://news.ycombinator.com/item?id=45977992), where are the superhuman coded major apps from small companies that would benefit most from these superhumans?

Heck, we have superhuman requirements gathering, superhuman marketing, superhuman almost all white collar work, so it should be even faster!


Fine, where's the slop then? I expected hundreds of scammy apps to show up imitating larger competitors to get a few bucks, but those aren't happening either. At least not any more than before AI.


It doesn’t matter what you think. Where’s all the data proving that AI is actually valuable? All we have are anecdotes and promises.


ChatGPT is... a chat with some "augmentation" features, a.k.a. outputting rich HTML responses; nothing new except the generative side. Cursor is a VS Code fork with a custom model and a very good autocomplete integration. Again, where are the products? Where the heck is a Windows without the bloat that works reliably, before it goes totally agentic - and therefore idiotic, since it doesn't work reliably?


> IMHO the bleeding edge of what’s working well with LLMs is within software engineering because we’re building for ourselves, first.

the jury is still out on that...


Yeah, I'll gladly AI-gen code, but I still write docs by hand. I have yet to see one good AI-generated doc; they're all garbage.


Incidentally, I just spent some time yesterday with Gemini and Grok writing a first draft of docs for a complex codebase. The end result is far more useful and complete than anything I could have possibly produced without AI in the same amount of time, and I didn't even have to learn Mermaid syntax to fill the docs with helpful visual aids.

Of course it's a collaborative process — you can't just tell it to document the code with no other information and expect it to one-shot exactly what you wanted — but I find that documentation is actually a major strength of LLMs.


That use case works; I meant writing designs.


The AI docs are good enough for AIs - to throw at agents without previous context.


Agree. I also wonder whether this helps account for why some people get great value from AI and some terrible value.


It can also be bad if you're writing code in a tech island, with an abysmal codebase, or with weak AI tooling


Did you read the essay? It never claimed that AI was useless, nor was the ultimate point of the article even about AI's utility—it was about the political and monetary power shifts it has enabled and their concomitant risks, along with the risks the technology might pose for society.

Ignoring or failing to address these aspects of the issue and focusing solely on its utility in a vacuum is precisely the blinkered perspective that will enable the consolidations of power the essay is worried about... The people pushing this stuff are overjoyed that so few people seem to be paying any attention to the more significant shifts they are enacting (as the article states: land purchases, political/capital power accumulation, reduction of workforces, operating costs, and labor power... the list goes on).


As trite as it is, it really is still a skill issue, because we haven't properly figured out the UI. Claude Code and others are a step in the right direction, but you still have to learn all of the secret motions and ceremony. Features like plan mode, compact, CLAUDE.md files, switching models, using images, including specific files, skills, and MCPs are all attempts to improve the interface, but nothing is completely figured out yet. You still need to do a lot of context engineering and know what resources, examples, docs, and scope to use, and how to orchestrate the aforementioned features to get good results. You also need to bring a lot of your own knowledge and tools, like being fastidious with version control and being able to write extremely well-defined specifications and tasks. In short, you need to be an expert in both software engineering and LLM-driven development, and even then it's easy to shoot yourself in the foot by making a small mistake.


That's because LLMs are optimally designed for tasks like coding, as well as other text-prediction tasks such as writing, editing, etc.

The mistake is to project the same level of productivity provided by LLMs in coding to all other areas of work.

The point of TFA is that LLMs are an excellent tool for particular aspects of work (coding being one of them), not a general intelligence tool that improves all aspects (as we're being sold).


This. Design tends to explore a latent space that isn't well documented. There is no Stack Overflow or GitHub for design. The closest we have are open-sourced design systems like Material Design, and portfolio sites like Behance. These are not legible reference implementations for most use cases.

If LLMs only disrupt software engineering and content slop, the economy is going to undergo rapid changes. Every car wash will have a forward deployed engineer maintaining their mobile app, website, backend, and LLM-augmented customer service. That happens even if LLMs plateau in six months.


I disagree. I think, as software developers, we mostly speak to other software developers, and we like to share AI fail stories around, so we are biased to think that AI works better for SWE than for other areas...

However, while I like using AI for software development, as someone who is also a middle manager, it has increased my output A TON, because AI works better for virtually anything that's not software development.

Examples: updating Jira issues in bulk, writing difficult responses and incident reports, understanding a tool or system I'm not familiar with, analysing 30 projects to understand which of them have a particular problem, reviewing tickets in bulk to see if they are missing anything that was mentioned in the solution design, and so on... All sorts of tasks that used to take hours now take minutes.

This is in line with what I'm hearing from other people: my CFO is complaining daily about running out of tokens. My relative in technical sales says it now takes him minutes to create tech specs from his customers' requirements, when it used to take hours.

Meanwhile, devs are rightfully "meh", because they truly need to review every single line generated by AI, and typing out the code is not their bottleneck anyway. It is harder for them to realise the gains.


Can you show something built with those tools?

The only reply I have got to this question was: it created an SAP script.


> IMHO the bleeding edge of what’s working well with LLMs is within software engineering because we’re building for ourselves, first.

How are we building _for_ ourselves when we literally automate away our jobs? This is probably one of the _worst_ things someone could do to me.


Software engineers have been automating our own work since we built the first assembler. So far it's just made us more productive and valuable, because the demand for software has been effectively unlimited.

Maybe that will continue with AI, or maybe our long-standing habit will finally turn against us.


> Software engineers have been automating our own work since we built the first assembler.

The declared goal of AI is to automate software engineering entirely. This is in no way comparable to building an assembler. So the question is mostly about whether or not this goal will be achieved.

Still, nobody is building these systems _for_ me. They're building them to replace me, because my living is too much for them to pay.


Trying to automate away software engineering entirely is nothing new. It goes all the way back to BASIC and COBOL, and later visual programming tools, Microsoft Access, etc. There have been innumerable attempts to somehow get by without needing those pedantic and difficult programmers and all their annoying questions and nitpicking.

But here's the thing: the hard part of programming was never really syntax; it was having the clarity of thought and conceptual precision to build a system that normal humans find useful despite the fact that they will never have the patience to understand it, let alone debug its failures. Modern AI tools are just the next step to abstracting away syntax as a gatekeeper function, but the need for precise systemic thinking is as glaringly necessary as ever.

I won't say AI will never get there—it already surpasses human programmers in much of the mechanical and rote knowledge of programming language arcana—but it still is orders of magnitude away from being able to produce a useful system when specified by someone who does not think like a programmer. Perhaps it will get there. But I think the barrier at that point will be the age-old human need to have a throat to choke when things go sideways. Those in power know how to control and manipulate humans through well-understood incentives, and this applies all the way to the highest levels of leadership. No matter how smart or competent AI is, you can't just drop it into those scenarios. Business leaders can't replace human accountability with an SLA from OpenAI; it just doesn't work. Never say never, I suppose, but I'd be willing to bet the wheels come off modern civilization long before the skillset of senior software engineers becomes obsolete.


> Modern AI tools are just the next step to abstracting away syntax as a gatekeeper function, but the need for precise systemic thinking is as glaringly necessary as ever.

Syntax is not a gatekeeper function. It's exactly the means of describing that precise systemic thinking. When you're creating a program, you're creating a DSL for multiple subsystems, which you then integrate.

The subsystems can be abstract, but we usually define good software by how closely fitted the subsystems are to the problem at hand, meaning adjustments only need slight code alterations.

So viewing syntax as a gatekeeper is like viewing sheet music as a gatekeeper for playing music, or numbers and arithmetic as a gatekeeper for accounting.


The difference is that human language is a much more information-dense, higher-level abstraction than code. I can say "an async function that accepts a byte array, throws an error if it's not a valid PNG image with a 1:1 aspect ratio and resolution >= 100x100, resizes it to 100x100, uploads it to the S3 bucket env.IMAGE_BUCKET with a UUID as the file name, and retries on failure with exponential backoff up to a maximum of 100 attempts", and you'll have a pretty good idea of what I'm describing despite it taking far fewer characters than the equivalent code.

I can't directly compile that into instructions which will make a CPU do the thing, but for the purposes of describing that component of a system, it's at about the right level of abstraction to reasonably encode the expected behavior. Aside from choosing specific libraries/APIs, there's not much remaining depth to get into without bikeshedding; the solution space is sufficiently narrow that any conforming implementation will be functionally interchangeable.

AI is just laying bare that the hard part of building a system has always been the logic, not the code per se. Hypothetically, one can imagine that the average developer in the future might one day think of programming language syntax in the same way that an average web developer today thinks of assembly. As silly as this may sound today, maybe certain types of introductory courses or bootcamps would even stop teaching code, and focus more on concepts, prompt engineering, and developing/deploying with agentic tooling.

I don't know how much learning syntax really gatekeeps the field in practice, but it is something extra that needs to be learned, when in theory that same time could be spent learning some other aspect of programming. More significant is the hurdle of actually producing the syntax; turning requirements into code might be cognitively simple given sufficiently baked requirements, but it is at minimum time-consuming manual labor which not everyone is in a position to easily afford.


> and you'll have a pretty good idea of what I'm describing

I won't, unless both you and I have a shared context that ties each of these concepts to a specific thing. You said "async function", and there are a lot of languages that don't have that concept. And what about the permissions of the S3 bucket? What's the initial wait time? What algorithm for the resizing? What if someone sends us a very big image (say, the maximum that the standard allows)?

These are still logic questions that have not been addressed.

The thing is that general programming languages are general. We do have constructs like procedures/functions and classes that allow for a more specialized notation, but using them well is a skill to acquire (like writing clear and informative text).

So in pseudo lisp, the code would be like

   (defun fn (bytes)
     (when-let* ((png (byte2png bytes))
                 (valid (and (valid-png-p png)
                             (square-res-p png)))
                 (small-png (resize-image png))
                 (bucket (get-env "IMAGE_BUCKET"))
                 (filename (uuid)))
       (do-retry :backoff 'exp
                 (s3-upload bucket small-png))))
And in pseudo prolog

  square(P) :- width(P, W), height(P, H), W is H.
  validpng(P, X) :-  a whole list of clauses that parses X and build up P, square(P).
  resizepng(P) :- bigger(100,100, P), scale(100, 100, P).
  smallpng(P, X) :- validpng(P, X), resizepng(P).
  s3upload(P) :- env("IMAGE_BUCKET", B), s3_put(P, B, exp_backoff(100)).
  fn(X) :- smallpng(P, X), s3upload(P).
So what you've left out is all the details. It's great if someone already has a library that does the thing and the functions have the same signatures, but more often than not, there isn't anything like that.

Code can be as high-level as you want and very close to natural language. Where people spend their time is implementing the lower levels and dealing with all the failure modes.


Details like the language/stack and S3 configuration would presumably be somewhere else in the spec, not in the description of that particular function.

The fact that you're able to confidently take what I wrote and stretch it into pseudocode with zero deviation from my intended meaning proves my point.


To draft a spec like this, it would take more time and the same or more knowledge than to just write the code. And you still won’t have reliable results, without doing another lengthy pass to correct the generated code.

I can write pseudocode like that because I know the relevant paradigm as well as how to design software. There’s no way you can have a novice draft pseudo-code like this, because they can’t abstract well or discern the intent behind abstractions.


I don't agree that it would take more time. Drafting detailed requirements like that to feed into coding agents is a big part of how I work nowadays, and the difference is night and day. I certainly didn't spend as much time typing that function description as I would have spent writing a functional version of it in any given language.

Collaborating with AI also speeds this up a lot. For example, it's much faster to have the AI write a code snippet involving a dependency/API and manually verify the code's correctness for inclusion in the spec than it is to read through the documentation and write the same code by hand.

The feat of implementing that function based on my description is well within the capabilities of AI. Grok did it in under 30 seconds, and I don't see any obvious mistakes at first glance: https://grok.com/share/c2hhcmQtMw_fa68bae1-3436-404b-bf9e-09....


I don't have access to the Grok sample you've shared (the service is not available in my region).

Reading the documentation is mostly for gotchas and for understanding the subsystem you're going to incorporate in your software. You cannot design something that will use GTK or sndio without understanding the core concepts of those technologies. And if you know the concepts, then I'd say it's easier and faster to write the code than to write such specs.

As for finding samples, it's easy on the web, especially with GitHub search. But these days I often look at the source code of the library itself, because I often have questions that the documentation doesn't answer. It's not about what the code I wrote may do (which is trivial to know) but what it cannot do at all.


Ah, weird, that's good to know. Well here's the code:

    import { env } from './env';
    import { v4 as uuidv4 } from 'uuid';
    import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
    import sharp from 'sharp';

    async function retry<T>(fn: () => Promise<T>, maxAttempts: number): Promise<T> {
      let attempt = 1;
      while (true) {
        try {
          return await fn();
        } catch (error) {
          if (attempt >= maxAttempts) {
            throw error;
          }
          const delayMs = Math.pow(2, attempt - 1) * 100;
          await new Promise((resolve) => setTimeout(resolve, delayMs));
          attempt++;
        }
      }
    }

    export async function processAndUploadImage(s3: S3Client, imageData: Uint8Array): Promise<string> {
      let metadata;
      try {
        metadata = await sharp(imageData).metadata();
      } catch {
        throw new Error('Invalid image');
      }

      if (metadata.format !== 'png') {
        throw new Error('Not a PNG image');
      }

      if (!metadata.width || !metadata.height || metadata.width !== metadata.height || metadata.width < 100) {
        throw new Error('Image must have a 1:1 aspect ratio and resolution >= 100x100');
      }

      const resizedBuffer = await sharp(imageData).resize(100, 100).toBuffer();

      const key = `${uuidv4()}.png`;

      const command = new PutObjectCommand({
        Bucket: env.IMAGE_BUCKET,
        Key: key,
        Body: resizedBuffer,
        ContentType: 'image/png',
      });

      await retry(async () => {
        await s3.send(command);
      }, 100);

      return key;
    }
The prompting was the same as above, with the stipulations that it use TypeScript, import `env` from `./env`, and take the S3 client as the first function argument.

You still need reference information of some sort in order to use any API for the first time. Knowing common Node.js AWS SDK functions offhand might not be unusual, but that's just one example. I often review source code of libraries before using them as well, which isn't in any way contradictory with involving AI in the development process.

From my perspective, using AI is just like having a bunch of interns on speed at my beck and call 24/7 who don't mind being micromanaged. Maybe I'd prefer the end result of building the thing 100% solo if I had an infinite amount of time to do so, but given that time is scarce, vastly expanding the resources available to me in exchange for relinquishing some control over low-priority details is a fair trade. I'd rather deliver a better product with some quirks under the hood than let my (fast, but still human) coding speed be the bottleneck on what gets built. The AI may not write every last detail exactly the way I would, but neither do other humans.


As I’m saying, for pure samples and pseudo-code demos, it can be fast enough. But why bring in the whole S3 library if you’re going to use one single endpoint? I’ve checked npmjs and sharp is still pre-1.0 (if they’re using semver). Also, the code is parsing the image data twice.

I’m not saying that I write flawless code, but I’m more for fewer features and better code. I’ve battled code where people would add big libraries just to avoid writing ten lines of code, and then couldn’t reason about why a snippet failed, because it was unreliable code layered on unreliable code. After a few months you’ve got zombie code in the project, and the same thing implemented multiple times in a slightly different way each time. These are pitfalls that occur when you don’t have a holistic view of the project.

I’ve never found coding speed to be an issue. The only time when my coding is slow is when I’m rewriting some legacy code and pausing every two lines to decipher the intent with no documentation.

But I do use advanced editing tools. Coding speed is very much not a bottleneck in Emacs, and I had a somewhat similar config for Vim. Things like quick access to docs, quick navigation (like running a lint program and then jumping directly to each error), quick commits, quick blaming, and time-traveling through the code history…


> But why bring in the whole s3 library if you’re going to use one single endpoint?

This is a bit of a reach. There's no reason to assume that the entire project would only be using one endpoint, or that AI would have any trouble coding against the REST API instead if instructed to. Using the official SDK is a safe default in the absence of a specific reason or instruction not to.
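
For what it's worth, skipping the SDK could look something like the rough sketch below. This assumes uploads go through a presigned URL generated elsewhere; hitting the raw REST API directly would additionally require SigV4 request signing, which is exactly the kind of plumbing the SDK normally handles.

    // Rough sketch, not production code: PUT the resized PNG to S3 over plain
    // HTTP using a presigned URL (generating that URL is assumed to happen
    // elsewhere, e.g. server-side with credentials).
    async function uploadViaPresignedUrl(presignedUrl: string, body: Uint8Array): Promise<void> {
      const res = await fetch(presignedUrl, {
        method: 'PUT',
        headers: { 'Content-Type': 'image/png' },
        body,
      });
      if (!res.ok) {
        throw new Error(`S3 upload failed: ${res.status} ${res.statusText}`);
      }
    }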

Either way, we're already past the point of demonstrating that AI is perfectly capable of writing correct pseudocode based on my description.

> Coding speed is very much not a bottleneck in Emacs.

Of course it is. No editor is going to make your mind and fingers fast enough to emit an arbitrarily large amount of useful code in 0 seconds, and any time you spend writing code is time you're not spending on other things. Working with AI can be a lot harder because the AI is doing the easy parts while you're multitasking on all the things it can't do, but in exchange you can be a lot more productive.

Of course you still need to have enough participation in the process to be able to maintain ownership of the task and be confident in what you're committing. If you don't have a holistic view of the project and just YOLO AI-generated code that you've never looked at into production, you're probably going to have a bad time, but I would say the same thing about intern-generated code.

> I’m more for less feature and better code.

Well that's part of the issue I'm raising. If you're at the point of pushing back on business requirements in the interest of code quality, that's just another way of saying that coding speed is a bottleneck. Using AI doesn't only help with rapidly pumping out more features; it's an extremely useful tool for fixing bugs at a faster pace.


Just to conclude the thread on my side.

IMO, useful code is code in production (or, if it’s for myself, something I can run reliably). Anything else is experimentation. If you’re working in a team, code shared with others is at the proposal/demo level.

Experimentation is nice for learning purposes, kinda like scratch notes and manuscripts in the writing process. But then comes the editing phase, when you’re stamping out bugs with tools like static analysis, automated testing, and manual QA. The whole goal is to get the feature into the hands of the users. Then there’s the errata phase for errors that have slipped through.

But the thing is, code is just a static representation of a very dynamic medium: the process. And a process has a lot of layers; the code is usually a small part of the whole. For the whole thing to be consistent, the parts need to be consistent with each other, and that’s where contracts come into play. The thing with AI-generated code is that it doesn’t respect contracts, because of its non-deterministic nature and because the code (which is the most faithful representation of the contracts) can be contradictory, which leads to bugs.
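
To make "contract" concrete, here is a made-up TypeScript example (not code from this thread): the interface below documents an invariant, and an implementation can satisfy the type checker while silently breaking it.

    // Illustrative only: the contract is the type signature plus the
    // documented invariant in the comment.
    interface Resizer {
      /** Must return a PNG that is exactly `size` x `size` pixels. */
      resize(png: Uint8Array, size: number): Promise<Uint8Array>;
    }

    // This type-checks, yet violates the documented invariant: the input is
    // returned untouched. The compiler cannot see the broken contract; only
    // the surrounding system (or production) will.
    const sloppyResizer: Resizer = {
      async resize(png, _size) {
        return png;
      },
    };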

It’s very easy to write optimistic code. But as the contracts (or constraints) in the system grow in number, they can be tricky to balance. The recourse is always to go up a level in abstraction: make the subsystems black boxes and consider only their interactions. This assumes that the subsystems are consistent in themselves.

Code is not the lowest level of abstraction, but it’s often correct to assume that the language itself is consistent. Then come the libraries, where quality varies. Then the framework, where it’s often all good until it’s not. Then there’s your code, and that’s very much a mystery.

All of this to say that writing code is like writing words in a manuscript to produce a book. It’s useful, but only if it’s part of the final product or helps in creating it, and especially if it doesn’t increase the technical debt exponentially.

I don’t work with AI tools because by the time I’m OK with the result, more time has been spent than if I’d done the thing without them. And the process is not even enjoyable.


Of course; no one said anything about experimentation. Production code is what we're talking about.

If what you're saying is that your current experience involves a lot of process and friction to get small changes approved, that seems like a reasonable use case for hand-coding. I still prefer to make changes by hand myself when they're small and specific enough that explaining the change in English would be more work than directly making the change.

Even then, if there's any incentive to help the organization move more quickly, and there's no policy against AI usage, I'd give it a shot during the pre-coding stages. It costs almost nothing to open up Cursor's "Ask" mode and bounce your ideas off of Gemini or have it investigate the root cause of a bug.

What I typically do is have Gemini perform a broad initial investigation and describe its findings and suggestions with a list of relevant files, then throw all that into a Grok chat for a deeper investigation. (Grok is really strong at analysis in general, but its superpower seems to be a willingness to churn on sufficiently complex problems for as long as 5+ minutes in order to find the right answer.) I'll often have a bunch of Cursor agents and Grok chats going in parallel — bouncing between different bug investigations, enhancement plans, and one or two code reviews and QA tests of actual changes. Most of the time that AI saves isn't the act of emitting characters in and of itself.


Who declared it? Who cares what anyone declares? What do you think will actually happen? If software can be fully automated, then sure, SWEs will need to find a new job. But why wouldn't it instead just increase productivity, with developer jobs still existing, only different?


> The declared goal of AI is to automated software engineering entirely.

It's hardly the first thing that has had that as its "declared goal" (i.e., the fantasy sold to investors to get capital and to the media to get attention).


This is kind of a myopic view of what it means to be a programmer.

If you're just in it to collect a salary, then yeah, maybe you do benefit from delivering the minimum possible productivity that won't get you fired.

But if you like making computers do things, and you get joy from making computers do more and new things, then LLMs that can write programs are a fantastic gift.


> But if you like making computers do things, and you get joy from making computers do more and new things, then LLMs that can write programs are a fantastic gift.

Maybe currently if you enjoy social engineering an LLM more than writing stuff yourself. Feels a bit like saying "if you like running, you'll love cars!"

In the future, when the whole process is automated, you won't be needed to make the computer do stuff, so it won't matter whether you would like it. You'll have another job, likely one that pays less and is harder on your body.


Some people like running, and some people like traveling. Running is a fine hobby, but I'm still glad that planes exist.

Maybe some future version of agentic tooling will decimate software engineering as a career path, but that's just another way of saying that everyone and their grandmother would suddenly have the ability to launch a tech startup. Having gone through fundraising in the past, I'd personally prefer to live in a world where anyone with a good idea could get access to the equivalent of a full dev team without raising a dime.


You're still focusing on "programming as a job" being fundamental to programming, and I'm saying it's not.


But you're not making the computer do things, you're making an idea for a new thing a computer can do and then outsourcing the part of the "making it do things" that is actually fun and fulfilling. I don't get it -- the joy for me comes from learning and problem solving, not coming up with ideas and then communicating those ideas to a tool that can do the rest of the job for me.



