> it is sad to me just how much people are trying to automate away programming and delegating it to a black box
I take it you're not using a compiler to generate machine code, then?
Scratch that, I guess you're not using a modern microprocessor to generate microcode from a higher-level instruction set either?
Wait, real programm^Wartists use a magnetised needle and a steady hand.
Programming has always been about finding the next black box that is both powerful and flexible. You might be happy with the level of abstraction you have settled on, but it's just as arbitrary as any other level.
Even the Apollo spacecraft programmers at MIT had a black box: they offloaded the weaving of core rope memory to other people. Programming is not necessarily about manually doing the repetitive stuff. In some sense, I'd argue that's antithetical to programming -- even if it makes you feel artistic!
The thing is, all these stacks are built by people and verified against specifications. When they failed to perform the way they should, we fixed them.
Plus, all the parts are deterministic in this stack. Their behavior is fixed for a given input, and all the parts are interpretable, readable, verifiable and observable.
LLMs are none of that. They are stochastic probability machines, which are nondeterministic. We can't guarantee their output's correctness, and we can't fix them to guarantee correct output. They are built on tons of (unethically sourced) data that comes with no correctness or quality guarantees.
Some people will love LLMs, and/or see programming as a task/burden they have to complete. Some of us love programming for the sake of it, and earn money by doing it that way, too.
So putting LLMs in the same bucket as a deterministic, task-specific programming tool is both wrong and a disservice to both.
I'm also strongly against LLMs, not because of the tech, but because of how they are trained, how their shortcomings are hidden, and how they're put forward as the "oh the savior of the woeful masses, and the silver bullet of all thy problems", when they are neither.
LLMs are just glorified tech demos that show what stochastic parrots can appear to accomplish when you feed the whole world to them.
I would argue that's the LLM spec --> generate probabilistic output with a degree of confidence nearing p(1). IMHO end users are not supposed to take the output of these machines as is, but rather iterate on top of it and finish their task in less time.
We can (for programming, at least): run the output through a theorem prover and ensure that the proof is constructive; the Curry-Howard correspondence then guarantees that you can turn the output into a correct program. It doesn't guarantee that the formal properties of the program correspond to the informal problem statement. But even people occasionally make such errors (a provably correct program that doesn't do what we wanted it to do).
> and we can't fix them to guarantee correct output
Same thing with the other system capable of programming, that is, people.
You just can't make a system that guarantees correct transformation from an informal problem statement into a formally correct implementation. "Informal" implies that there's wiggle room for interpretation.
No, it doesn't mean that current LLMs are ready to replace programmers; it also doesn't mean that the ML models of the 2030s won't be able to.
Indeterminism is not always bad. A probabilistic Turing machine, for example, is believed to solve some problems more efficiently than a deterministic one (the BPP complexity class contains P, and whether the containment is strict is an open question).
>> We can (for programming, at least): run the output through a theorem prover and ensure that the proof is constructive; the Curry-Howard correspondence then guarantees that you can turn the output into a correct program. It doesn't guarantee that the formal properties of the program correspond to the informal problem statement. But even people occasionally make such errors (a provably correct program that doesn't do what we wanted it to do).
That sounds very ambitious. Automated theorem provers are real sticklers for complete specifications in a formal language and can't parse natural language at all, but when you generate code with an LLM all you have in terms of a specification is a natural language prompt (that's your "informal problem statement"). In that case what exactly is the prover going to prove? Not the natural language prompt it can't parse!
The best you can do if you start with a natural language specification, like an LLM prompt, is to verify that the generated program compiles, i.e. that it is correct syntactically. As to semantic correctness, there, you're on your own.
Edit: I'm not really sure whether you're talking about syntactic or semantic correctness after all. Which one do you mean?
>> You just can't make a system that guarantees correct transformation from an informal problem statement into a formally correct implementation. "Informal" implies that there's wiggle room for interpretation.
Note that in program synthesis we usually make a distinction between complete and incomplete specifications ("problem statements") not formal and informal. An incomplete specification may still be given in a formal language. And, for the record, yes, you can make a system that guarantees that an output program is formally consistent with an incomplete specification. There exist systems like that already. You can find a bit about this online if you search for "inductive program synthesis" but the subject is spread over a wide literature spanning many fields so it's not easy to get a clear idea about it. But, in general, it works and there are approaches that give you strong theoretical guarantees of semantic correctness.
Ah, I said "theorem prover", I should have said "proof verifier". What I meant is something like DeepMind's AlphaProof with an additional step of generating a formal specification from a natural language description of the problem. In this way we get a semantically correct program wrt the formal specification. But with current generation of LLMs we probably won't get anything for non-trivial problems (the LLM won't be able to generate a valid proof).
> Note that in program synthesis we usually make a distinction between complete and incomplete specifications
Program synthesis begins after you can coherently express an idea of what you want to do. And getting to this point might involve a ton of reasoning that will not go into a program synthesis pipeline. That's what I mean when I say an "informal problem statement": some brain dumps of half-baked ideas that don't even constitute an incomplete specification because they are self-contradictory (but you haven't noticed it yet).
LLMs can help here by trying to generate some specification based on a brain dump.
>> What I meant is something like DeepMind's AlphaProof with an additional step of generating a formal specification from a natural language description of the problem.
That's even more ambitious. From DeepMind's post on AlphaProof:
First, the problems were manually translated into formal mathematical language for our systems to understand.
DeepMind had to resort to this manual translation because LLMs are not reliable enough, and natural language is not precise enough, to produce a complete specification of a formal statement, like a program or a mathematical problem (as in AlphaProof), at least not easily.
I think you point that out in the rest of your comment but you say "the LLM won't be able to generate a valid proof" where I think you meant to say "a valid specification". Did I misunderstand?
>> Program synthesis begins after you can coherently express an idea of what you want to do.
That's not exactly right. There are two kinds of program synthesis. Deductive program synthesis is when you have a complete specification in a formal language and you basically translate it to another language, just like with a compiler. That's when you "coherently express an idea of what to do". Inductive program synthesis is when you have an incomplete specification, consisting of examples of program behaviour, usually in the form of example pairs of the inputs and outputs of the target program, but sometimes program traces (like debug logs), abstract syntax trees, program schemas (a kind of rich program template) etc.
Input-output examples are the simplest case. Today, if you can express your problem in terms of input-output examples there are approaches that can synthesize a program that is consistent with the examples. You don't even need to know how to write that program yourself.
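To make the input-output case concrete, here is a minimal sketch of the brute-force flavour of inductive synthesis (a toy DSL made up for illustration, not any particular system): enumerate candidate pipelines over a few primitives and return the first one consistent with every example.

    from itertools import product

    # Toy DSL: a "program" is a pipeline of primitive string functions.
    PRIMITIVES = {
        "reverse": lambda s: s[::-1],
        "upper":   lambda s: s.upper(),
        "first3":  lambda s: s[:3],
        "strip":   lambda s: s.strip(),
    }

    def run(program, x):
        for name in program:
            x = PRIMITIVES[name](x)
        return x

    def synthesize(examples, max_len=3):
        # Return the first pipeline consistent with all (input, output) pairs.
        for length in range(1, max_len + 1):
            for program in product(PRIMITIVES, repeat=length):
                if all(run(program, i) == o for i, o in examples):
                    return program
        return None

    # The incomplete specification: just examples, no statement of intent.
    print(synthesize([("hello", "OLL"), ("abc", "CBA")]))
    # -> ('reverse', 'upper', 'first3'): consistent with the examples and nothing more

The guarantee is only consistency with the examples; real systems add search strategies and inductive biases, but the contract has the same shape.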
> where I think you meant to say "a valid specification". Did I misunderstand?
What do you mean when you say "a valid specification"? There are known algorithms to check the validity of a proof. How do you check that a specification is valid? People inspect it and agree that "yes, it seems to be expressing what was intended to be expressed in the natural language", or "no, this turn of phrase needs to be understood in a different way", or some such. Today there's no other system that can handle this kind of task besides humans (who are fallible) and LLMs (which are much more fallible).
That is, deciding that a specification is valid cannot be done without human involvement. I left that part out and focused on what we can mechanistically check (that is, the validity of a proof).
So, no, I didn't mean "a valid specification". And, yes, I don't think that today's LLMs would be good at producing specifications that would be deemed valid by a consensus of experts.
> Today, if you can express your problem in terms of input-output examples there are approaches that can synthesize a program that is consistent with the examples
In a limited domain with agreed-upon rules of generalization? Sure. In general? No way. The problem of generalizing from a limited number of examples with no additional restrictions is ill-defined.
And the problem "generalize as an expert would do" is in the domain of AI.
> "oh the savior of the woeful masses, and the silver bullet of all thy problems"
Who said that? Everyone I've talked to warns about their shortcomings (including their creators) and even the platform where I use them has a warning plastered right under the input box saying "ChatGPT can make mistakes. Check important info."
>> I take it you're not using a compiler to generate machine code, then?
The dismissive glibness of your comment makes me wonder if it's worth trying to point out the obvious error in the analogy you're making. Compilers translate, LLMs generate. They are two completely different things.
When you write a program in a high-level language and pass it to a compiler, the compiler translates your program to machine code, yes. But when you prompt an LLM to generate code, what are you translating? You can pretend that you are "translating natural language to code", but LLMs are not translators, they're generators, and what you're really doing is providing a prefix for the generated string. You can generate strings from an LLM with an empty prefix; but try asking a compiler to compile an empty program.
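To illustrate the "prefix" point with a toy (a character-bigram table, nothing remotely like a real LLM, and the corpus is made up): the generator is happy to continue any prefix, including the empty one, which is not something a compiler can meaningfully do.

    import random
    from collections import defaultdict

    # Toy "language model": a character-bigram table built from a tiny corpus.
    corpus = "print hello world print hello compiler "
    table = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        table[a].append(b)

    def generate(prefix="", length=30, seed=0):
        rng = random.Random(seed)
        out = list(prefix) if prefix else [rng.choice(corpus)]
        for _ in range(length):
            out.append(rng.choice(table[out[-1]]))   # sample a continuation
        return "".join(out)

    print(generate("print he"))   # the prompt is just a prefix to continue
    print(generate(""))           # an empty prefix still generates text;
                                  # a compiler given an empty program just errors out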
>> Even the Apollo spacecraft programmers at MIT had a black box: they offloaded the weaving of core rope memory to other people.
There is no "black box" here. Programmers created the program and handed it over to others to code it up. That's like hiring someone to type your code for you at a keyboard, following your instructions to do so. You have to stretch things very far to see this as anything like compilation.
Also, really, compilers are not black boxes. Just because most people treat them as a scary unknowable thing doesn't mean that's what they are. LLMs are "black boxes" because no matter how much we peer at their weights, arrays of numerical values, there's nothing we can ... er ... glean from them. They're incomprehensible to humans. Not so the code of a compiler. Even raw binary is comprehensible, with some experience.
I recently used an LLM to convert a lot of Python to Rust. It got it 99% right; it took me a short while to fix the compile-time errors and carefully check that the tests weren't broken (as I trusted that the code worked when the tests passed).
Is that "compiling" or "translating"? Lots of people use language to C "compilers".
I get what you're trying to say, but I don't entirely agree. Raising levels of abstraction is generally a good thing. But up until now, those have mostly been deterministic. We can be mostly confident that the compiler will generate correct machine code based on correct source code. We can be mostly confident that the magnetised needle does the right thing.
I don't think this is true for LLMs. Their output is not deterministic (up for discussion). Their weights and the sources thereof are mostly unknown to us. We cannot really be confident that an LLM will produce correct output based on correct input.
I agree with you but I want to try to define the language better.
It's not that LLMs aren't deterministic, because neither are many compilers.
It's also not that LLMs produce incorrect output, because compilers do that too, sometimes.
But when a compiler produces the wrong output, it's because either (1) there's a logic error in my code, or (2) there's a logic error in the compiler†, and I can drill down and figure out what's going on (or enlist someone to help me) to fix the problem.
Let's say I tell an LLM to write an algorithm, and it produces broken code. Why didn't my prompt work? How do I fix it? Can anyone ever actually know? And what did I learn from the experience?
---
† Or I guess there could be a hardware bug. Whatever. I'm going to blame the compiler because it needs to produce bytes that work on my silicon regardless of whether the silicon makes sense.
This is in general only true for either trivial toy compilers or ones which have gone to lengths to have reproducible builds. GCC for instance uses a randomised branch prediction model in some circumstances.
Ok, but my understanding is that they are mostly deterministic. And that there are initiatives like Reproducible Builds (https://reproducible-builds.org) that try to move even more in that direction.
But what does "mostly" mean? You can compile the same code twice and literally get two different binaries. The bits don't match.
Sure, those collections of bits tend to do exactly the same thing when executed, but that is in some sense a subjective evaluation.
---
Szundi said in a sibling comment that I was "completely [missing] the point on purpose" by bringing up compiler determinism. I think that's fair, but it's also why I opened my post by saying "I agree [with the parent], but I want to try to define the language better." Most compilers in use today are literally not deterministic, but they are deterministic in a different sense, which is useful as a comparison point to LLMs. Well, which sense? What is the fundamental quality that makes a compiler more predictable?
I'd like to try to find the correct words, because I don't think we have them yet.
I'm not a compiler expert, not by far. But my understanding is that if you compile the same code on the same machine for the same target, you'll get the same bits. Only minor things like timestamps that are sometimes introduced might differ. In this sense, maybe they are not deterministic. But I think it's fair to classify them as "deterministic" compared to LLMs.
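That's roughly how the Reproducible Builds people check it, too: build twice, hash the artifacts, compare the bits. A sketch (the file paths are placeholders):

    import hashlib
    from pathlib import Path

    def digest(path):
        # SHA-256 of a build artifact: identical bits give identical digests.
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()

    # Hypothetical outputs of two runs of the same compiler on the same source.
    a = digest("build-run1/app.bin")
    b = digest("build-run2/app.bin")
    print("bit-for-bit reproducible" if a == b
          else "binaries differ (timestamps, randomised codegen, ...)")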
I’d say it’s not only determinism, but also the social contract that’s missing.
When I’m calling ‘getFirstChar’ from a library, me and the author have a good understanding of what the function does based on a shared context of common solutions in the domain we’re working in.
When you ask ChatGPT to write a function that does the same, your social contract is between you and untold billions of documents that you hope the algorithm weights correctly according to your prompt (we should probably avoid programming by hope).
You could probably get around this by training on your codebase as the corpus, but until we answer all the questions about what that entails it remains, well, questionable.
I use Cursor at work, which is basically VSCode + LLM for code generation. It's a guess and check, basically. Plenty of people look up StackOverflow answers to their problem, then verify that the answer does what they want. (Some people don't verify but those people are probably not good programmers I guess.) Well, sometimes I get the LLM to complete something, then verify that the completed code is what I would have written (and correct it if not). This saves time/typing for me in the long run even if I have to correct it at times. And I don't see anything wrong with this. I'm not programming by hope, I'm just saving time.
This increases the time you spend proofing someone else's work (tedious) versus the time you spend developing a solution in code (fun). Also, if the LLM output is correct 95% of the time, one tends to get sloppier with the checking, as it will feel unnecessary most of the time.
> This increases the time you spend proofing someone else's work (tedious) versus the time you spend developing a solution in code (fun).
I find that I don't use it as much for generating code as I do for automating tedious operations. For example, moving a bunch of repeating-yourself into a function, then converting the repeating blocks into function calls. The LLM's really good at doing that quickly without requiring me to perform dozens of copy-paste operations, or a bunch of multi-cursor-fu.
Also, I don't use it to generate large blocks of code or complicated logic.
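An example of the kind of mechanical edit I mean, with made-up names (nothing here is from a real codebase):

    from dataclasses import dataclass

    @dataclass
    class Row:
        amount: float
        valid: bool

    rows_2023 = [Row(10.0, True), Row(5.0, False)]
    rows_2024 = [Row(7.5, True), Row(2.5, True)]

    # Before: the same expression copy-pasted once per dataset.
    totals_before = {
        "2023": sum(r.amount for r in rows_2023 if r.valid),
        "2024": sum(r.amount for r in rows_2024 if r.valid),
    }

    # After: the repetition pulled into one helper and plain calls.
    def valid_total(rows):
        return sum(r.amount for r in rows if r.valid)

    totals_after = {"2023": valid_total(rows_2023), "2024": valid_total(rows_2024)}

    assert totals_before == totals_after   # behaviour unchanged, only the structure

It's the sort of transformation that's easy to check by eye, which is why handing it off feels safe.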
Just what I was thinking about lately: what if LLMs are not 95% precise, but 99.95%? After like 50-100 checks you find nothing, and you just dump the whole project in to be implemented - and there come the bugs.
However ... your colleagues just do the same.
We'll see how this unfolds. As for now the industry seems to be a bit stuck at this level. Big models are too expensive to train for marginal gains; smaller ones are getting better, but that doesn't help here. Until someone comes up with a new idea for how LLMs should work, we won't see the 99.95% anyway.
One idea is obvious: a multi-model approach. It's partially done today for safety checks; the same can be done for correctness. One model produces a result, a different model only checks its correctness. Optionally the first produces several results, and the second model checks correctness and selects the best. This is more expensive, but should give better final output. Not sure, this may have already been done.
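A rough sketch of the shape of that loop; generate and check below are stand-ins for whatever models or test suites you plug in, not any real API:

    import random

    def best_of_n(prompt, generate, check, n=5):
        # Sample n candidates, keep those the checker accepts (score not None),
        # and return the highest-scoring one, or None if all were rejected.
        scored = [(check(c), c) for c in (generate(prompt) for _ in range(n))]
        accepted = [(s, c) for s, c in scored if s is not None]
        return max(accepted, key=lambda sc: sc[0])[1] if accepted else None

    # Toy usage: the "generator" proposes numbers, the "checker" accepts even ones.
    pick = best_of_n("unused prompt",
                     generate=lambda _: random.randint(0, 100),
                     check=lambda x: x if x % 2 == 0 else None)
    print(pick)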
> We can be mostly confident that the compiler will generate correct machine code based on correct source code.
I recently got an email about GCC 14.2; they fixed some bugs in it. Can we trust it now? These could be the last bugs. But before that it was probably a bad idea to trust it. No, even a compiler's output requires extensive testing. Usually it's done all at once, on the final result of coding and compilation.
> Their output is not deterministic
yes.
> Their weights and the sources thereof are mostly unknown to us
Some of them are known. Does it make you feel better? There are too many weights, so you are not able to track its 'thinking' anyway. There are some tools which sort of show something. It still doesn't help much.
> We cannot really be confident that an LLM will produce correct output based on correct input
No, we can't. But it's so useful when it works. I'm using it regularly for small utilities and fun pictures. Even though it can give outright wrong answers for relatively simple math questions. With explanations and full confidence.
For the average programmer, the infinite layers of abstraction, libraries, and middleware aren't deterministic either. The fact that LLMs are, honest to god, probabilistic estimators doesn't change anything about what they produce or how they see their own stuff.
> We cannot really be confident that an LLM will produce correct output based on correct input.
There are 2 things at play here: one is the LLM with a human in the loop, in which it's just a tool for programmers to do the same thing they have been doing, and the other is the LLM as a black-box automaton. For the former, it's not a problem that the tool is nondeterministic; we double-check the results and add our manual labour anyway. The fact that a tool can fail sometimes is an unsurprising fact of engineering.
I think the criticism in this chain of comments applies more to the latter, but even that has value for non-tech people, just like no-code approaches do, however shitty they look to us software engineers.
I don't know. Programming with an LLM turns every line of code into legacy code that you have to maintain and debug and don't fully grok because you didn't write it yourself.
If it's in your PR then you wrote it; no one should be approving code they do not understand, whether that's from AI or from googling. Nothing changes there.
Most frequent output does not imply correctness, LLMs often are confidently wrong.
They can't even perform basic arithmetic (which is not surprising, since they operate at the syntactic level, oblivious to any semantic rules), yet people seem to think offloading more complex tasks with strict correctness requirements to them is a good idea. Boggles the mind tbh.
They aren't saying they're the same, I'm not sure how you got that interpretation. It's very clear they're highlighting the hypocrisy that arises from claiming to be against automating away aspects of programming while relying on tools that do exactly that for you - only being Ok with it as long as they aren't called "AI".
The crux of why this is a bad analogy is that everyone talking about "automating" things with LLMs is misusing the word "automation". A machine can automate a repetitive manual task. A computer can automate the operation of machinery. A machine instruction set is an abstraction on top of circuitry that can automate the labor of extrapolating the logic physically executed by that circuitry into human-comprehensible routines. In the same way, a programming language implementation (e.g. a compiler) can somewhat automate programming (in the sense that it uses higher levels of abstraction to describe the same thing, saving labor while keeping determinism).
What do these things have in common? We can reliably make them approach deterministic behavior. In the case of compilers, completely and reliably and transparently so. Just because you haven't bothered to read what a compiler is doing doesn't mean someone can't verify what it's doing. Physical machines are less reliable, but we have reliable ways to test them, reliable error margins, reliable failure modes, reliable variance. When you are on a stack of abstractions like a programming language on top of a compiler on top of transistors on top of a machine, an error at the top of that stack can have a lot of implications.
A tool that probabilistically generates code is not automation. We have no guarantees about how and when it will get things wrong, how much this will happen, and what kinds of things it will get wrong. We have no way to audit their results that will generalize to every problem upstream of them. We have no way to reliably measure improvement in consistency, let alone improve that margin of error reliably. The entire idea that this is an automation at all is nonsense.
How do you figure? People were making directly analogous arguments about compilers back in the day. (Not trying to argue that they are 'the same', but there is definitely a spectrum of code generation methods, with widely varying genres of guarantees, suiting a widely varying range of use cases.)
I get the point that they are in different magnitudes of unknown but the analogy is still pretty good when it comes to the median programmer, who has no idea what goes on within either one. And if you argue that compilers are ultimately deterministic, that same argument technically holds for an LLM as well.
The biggest difference to me is that we have humans who claim they can explain why compilers work the way they do. But I might as well trust someone who says the same about LLMs, because honestly I have no way to verify whether they speak the truth. So I am already offloading a lot of the burden of proof about the systems I work on to others. And why does this "other" need to be a human?
This is like saying “I don’t understand how airplanes fly, so I’ll happily board an airplane designed by an LLM. The reality is determined by how much I know about it.”
No, the other way around. I am saying it is not a smart take to say "a safe airplane cannot be built if LLMs were used in the process in any way, because reasons". The safety of the airplane (or more generally the outcome of any venture) can be measured in other ways than leaning on some rule that you cannot use an LLM for help at any stage because they are not always correct.
Thank you for saying this. It's always baffled me that people will decry ChangeX as unnatural and wrong when it happens in their lifetime, but happily build their lives upon NearlyIdenticalChangeY so long as it came before them.
I don't think that this is a fair comparison because at some point the nature of the craft actually does change.
To give an analogy, a carpenter might be happy with hand tools, happy with machine tools, happy with plywood, and happy with MDF. For routine jobs they may be happy to buy pre-fabbed cabinets.
But for them to employ an apprentice (AI in this example) and outsource work to them - suddenly they are no longer really acting as a carpenter, but a kind of project manager.
edit: I agree that LLMs in their current state don't really fundamentally change the game - the point I am trying to make is that it's completely understandable that everyone has their own "stop" point. Otherwise, we'd all live in IKEA mansions.
Running state of the art LLMs for programming is nowhere near project management. At least in my experience, all LLMs are really good at is dumping plausible tokens quickly. They can't think, design, or handle tradeoffs intelligently.
They help me with the keyboard work, not any of the actual programming.
An apprentice is another person performing the same kind of work as the carpenter. That's fundamentally different from using an LLM, which is not a person and does not function like a person.
Whether you think LLMs are spectacularly worthwhile or odious and destructive, it's crucial not to classify them as being a person instead of a software tool.
This is a very poor analogy. It's not a matter of abstractions, it's a matter of getting someone or something else to do the work, while you mostly watch and fix any errors you're able to catch.
This is a qualitatively different kind of abstraction. All other abstractions still require the programmer to express the solution in a formal language, while LLMs are allowing the user to express the solution in natural language. It's no longer programming, but much more like talking to a programmer as a manager.
You can learn how compilers work and understand how they do what they do. Nobody understands what’s in those billions of parameters, and no one ever will.
why are there so many parameters in the first place? and was it humans who generated so many? seems like a very big job for a human to do, or even a team of humans to do.
disclaimer: I know next to nothing about llms. and I'm not that interested to learn about them. just asking casually.
> why are there so many parameters in the first place?
Because parsing and writing human language in a natural way is extremely complex.
> and was it humans who generated so many?
No, it is generated using an algorithm that tries to predict the next word in human-written text using the words that come before it. It ingests basically all the text on the internet to do this; without that much text, the LLM performs horribly.
They're not manually generated or anything, it's just a setting. Too few and the model doesn't have enough flexibility to capture complex patterns. Too many and the model can just memorize the data you train it on rather than capturing the patterns driving it.
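If it helps with intuition, the objective really is just "predict what comes next". A toy word-level version (a lookup table, nowhere near a transformer) looks like this:

    from collections import Counter, defaultdict

    text = "the cat sat on the mat the cat ate the fish".split()

    # "Training": count which word tends to follow which word.
    model = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        model[prev][nxt] += 1

    # "Inference": predict the most likely next word given the previous one.
    def predict(prev):
        return model[prev].most_common(1)[0][0]

    print(predict("the"))   # 'cat' -- the most frequent continuation, not a "true" answer

An LLM replaces the lookup table with billions of tuned parameters over token sequences, but the training signal is the same next-token guessing game.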
The question is about the abstraction being understandable and predictable. All the examples you have follow that, LLMs throw that out of the window.
>> Scratch that, I guess you're not using a modern microprocessor to generate microcode from a higher-level instruction set either?
Hell, I design gate-level logic -> map it to instructions -> use them in C for the very LLMs, and I can fully understand[0] every aspect of it (if it doesn't behave as expected, that is a bug), but I cannot fathom or predict how the LLMs behave when I use them, even though I know their architecture and implementation.
[0] Admittedly I treat the tools I use during the process, like CAD tools and the compiler, as black boxes; however, I know that if I want to or the need arises, I can debug/understand them.
Which would be relevant if either side respected it.
In practice, compilers frequently have bugs and programmers even more frequently make use of "what the compiler actually does" rather than adhering to the language specification -- to the point where the de facto spec for many languages is "what the canonical implementation does".
The frequency with which compiler implementations functionally diverge from language specifications is dwarfed, by many orders of magnitude, by the frequency with which LLMs generate provably nonsensical code in response to a prompt.
To wit, a compiler diverging from the specification is so relatively rare that people will get angry about it and demand that it be fixed, while an LLM spewing creative nonsense is so accepted and par for the course that complaining about that fact is met with a shrug and "well, what did you expect?"
LLMs don’t have a canonical output that could serve as a specification. And if they had, we wouldn’t consider that a satisfactory specification at all.
I guess we are currently in the special situation that we as human programmers can understand the output of a coding LLM. That's because programming languages are designed to be human-readable. And we had an incentive to learn those languages.
I imagine that machine-learning-powered coding will evolve into an even blacker box than it is today: it will transform requirements to CPU instructions (or GPU instructions, netlists, ...). Why bother to follow those indirections that are just convenience layers for those weak carbon units (urgh)?
Simultaneously, automation will likely lead to fewer skilled programmers in the future, because there will be fewer incentives to become one.
Together those effects could lead to a situation where we are condemned to just watch.
Reminds me of working, long ago, with a guy who did everything in C when we were rewriting things in Perl. Yes, his stuff was faster. Yes, it was also buggier, harder to debug, and took 3x as long to write for similar levels of functionality (it wasn't speed-dependent code by any stretch).
Actually, the opacity of the abstraction layer is the core of the issue. First we note that opacity is a measure of both the inner workings of the 'box' and, orthogonally (in the context of LLMs), of how deterministic the outcome is.
Programming, it is asserted by some of us, is exactly the act of instructing a deterministic 'black' box.
Peer-coding with an LLM is the act of cajoling a mechanism to hopefully consistently produce the input to a sensible "blackbox". It is not programming, it is getting help. Now if the help was 100% reliable, we could discuss programming the helper.
The other day I had a vision of the future AI whisperer in the corporate setting. They wear capes of varied colors and possibly sport a wand. "It's an art you see".
By that light, the artist that is painting using store-bought pigments instead of hand made, or brushing paints instead of reallocating molecules, or even better, atoms, is also using a black box.
I think OP has a point, and it is about guiding the design, the overall structure and dynamics of the code. Nobody expects to write in assembler or to avoid libraries, but to make a conscious decision on the design.
I have met few programmers, but many coders. For them coding is a job, and generally they don't care about overall architecture or algorithm efficiency, and code elegance is limited to syntax-coloring themes in their editor. I respect their existence, but generally they are building on top of somebody else's effort.
The transformations you're referring to are fully deterministic and guaranteed to be correct.
LLMs provide statistically probable answers with no guarantee of correctness. It takes more time to review LLM code than it takes to write it correctly from scratch.
We can argue the same thing for artists. Who wrote the algorithms for your favorite Photo/Image editor? Who created the image formats and standards, infrastructures for you to be able to push binary files to millions of people?
Your argument is a false dichotomy and thus a logical fallacy. Not all steps forward in abstraction or code generation are necessarily good steps; they have to be considered on their own merits. If LLMs are indeed superior, then you should be able to articulate their merits without condescending to fallacious attacks.
HN commenter: Samuel, why don't you use an LLM to write this play you are working on?
Beckett: What?
HN commenter: Well it's just like when you decided to work in French instead of English. Your art was no less because of it. Now you can use an LLM instead of French. It will be so much quicker.
While the base of your argument is true, it’s also a bit dishonest. LLMs are significantly different than any of these other abstractions because they can’t be reasoned about or meaningfully examined/debugged. They’re also the first of these advances which anyone has claimed would eliminate the need for programmers at all. I don’t believe the C compiler was meant to do my whole job for me.
Cobol and other early high-level languages were designed with the intention of allowing businesspeople to write their own programs so programmers wouldn't be needed. Some people really believed that!
I'd really like to have everything written in Rust, not C. Rust does a lot of verification, verification that is very hard to understand. I'd like to be able to specify a function with a bunch of invariants about the inputs and outputs and have a computer come up with some memory-safe code that satisfies all those invariants and is very optimized, and also have a list of alternative algorithms (maybe you discard this invariant and you can make it O(nLog(n)) instead of O(n^2), maybe you can make it linear in memory and constant in time or vice versa...)
Maybe you can't examine what the LLM is doing, but as things get more advanced we can generate code to do things, and also have it generate executable formal proofs that the code works as advertised.
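A weak version of that checking is already available today: write the invariants down executably and let a property-based tester hammer whatever code (generated or not) claims to satisfy them. A sketch using the Hypothesis library, where generated_sort is just a stand-in for code that came out of an LLM or synthesizer:

    from hypothesis import given, strategies as st

    def generated_sort(xs):          # placeholder for machine-generated code
        return sorted(xs)

    @given(st.lists(st.integers()))
    def test_sort_invariants(xs):
        out = generated_sort(xs)
        assert out == sorted(out)            # invariant 1: output is ordered
        assert sorted(xs) == sorted(out)     # invariant 2: same elements as input

    test_sort_invariants()   # Hypothesis runs the property on many random inputs

That's falsification rather than proof, but the division of labour is the one described above: something else writes the code, the invariants are what you actually commit to.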
I agree with the second part of your argument, regarding the assertion that LLMs may eventually replace programmers.
However, I don't understand your claim that an LLM acting as a programming assistant "...can’t be reasoned about or meaningfully examined/debugged."
I type something, and Copilot or whatever generates code which I can then examine directly, and choose to accept or reject. That seems much easier to reason about than what's happening inside a compiler, for example.
If using an LLM meant carefully crafting a complex, precise, formal prompt that specified only one possible output, I might be interested. But then I wonder if the prompt would be very much shorter.
Thinking about it, this depends on which differences we consider aspects of the output program, and which ones we consider trivial differences that don't count. If you say "build an RPG about dragons with a party of magic using heroes" and the LLM spits one out, you reached a level of abstraction where many choices relating to taste and feeling and atmosphere (and gameplay too) are waved aside as trivial details. You might extend the prompt to add a few more, but the whole point of creating a program this way is not to care about most of the details of the resulting experience. Those can be allowed to be generic and bland, right? Unless you care about leaving your personal touch on, say, all of them.
But this localization makes it computationally possible, and has limits.
The qualification and frame problems, combined with the very limited computational power of transformers, are another lens.
LLMs being formalized doesn't solve the problem. Fine tuning and RAG can help with domain specificity, but hallucinations are a fundamental feature of LLMs, not a bug.
Either a use case accepts the LLM failure mode (competent, confident, and inevitably wrong) or another model must be found.
Gödel showed us the limits of formalization; unless we find he was wrong, that won't change.
Thanks for your insightful comment. I'll read the links later.
I had just assumed that RNNs were Turing-complete; I didn't think of the limitation imposed by bounded precision, since I assumed that any bounded precision could be compensated for by a growing memory module.
> As discussed in the paper and pointed out by the reviewer, the growing memory module is non-differentiable, and so it cannot be trained directly by SGD. We acknowledge this observation.
Two-stack FSAs/RNNs are interesting, but as of now not usable in practice.
I don't buy that you're actually examining compiled programs. Very few people do. Theoretically you could, but the whole point of the compiler is to find optimizations that you wouldn't think of yourself.
The point of an optimizing compiler is to find optimizations which, crucially, are semantics-preserving. This is the contract that we have with compilers, is the reason that we trust them to transform our code, and is the reason why people get up in arms every time some C compiler starts leveraging undefined behavior in new and exciting ways.
We have no such contract with LLMs. The comparison to compilers is highly mistaken, and feels like how the cryptocurrency folks used to compare cryptocurrency to gestures vaguely "the internet" in an attempt to appropriate legitimacy.
A big feature of compilers is to find optimizations you wouldn't think of. I tried to make the point that compiled output is typically not read by humans.
> I don't buy that you're actually examining compiled programs. Very few people do
I take it you don't write C, C++, or any language at that level? It is very common to examine compiled programs to ensure the compiler made critical optimizations. I have done that many times; there are plenty of tools to help you do that.
I think you’re assuming your reference is the correct one. I can’t reason about the assembly language that the compiler spits out, the microcode in the CPU kernel or any of the electronics on the motherboard. That anyone can or not doesn’t change things in my opinion. It’s an arbitrary distinction to say _this_ abstraction is uniquely different in this very specific way.
LLMs are deterministic if you force a seed or disable sampling. They however do not guarantee that small input changes will cause small output changes.
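In toy form (made-up scores, no real model): greedy decoding is a plain argmax, and even sampling repeats itself once the seed is pinned, yet neither property says anything about small input changes producing small output changes.

    import numpy as np

    logits = {"cat": 2.0, "dog": 1.9, "rm -rf /tmp/x": -3.0}   # toy next-token scores

    def greedy(scores):
        return max(scores, key=scores.get)        # no randomness at all

    def sample(scores, seed):
        rng = np.random.default_rng(seed)         # pinned seed => same draw every run
        p = np.exp(list(scores.values()))
        return rng.choice(list(scores), p=p / p.sum())

    print(greedy(logits), sample(logits, seed=42))
    # Nudge "dog" up to 2.1 and the greedy answer flips: deterministic, but not stable.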
So as the OP said, all the parts are deterministic in this stack. Their behavior is fixed for a given input, and all the parts are interpretable, readable, verifiable and observable.
This is entirely different from LLMs, which are opaque even to their designers and have unpredictable flaws and hallucinations. They are probability machines based on whatever data they have been exposed to, which means they are not a reliable way to generate programs.
Maybe one day we'll fix this, but the current generation is not very useful for programming because of this.
Compilers are complex programs fraught with bugs. Modern microprocessors are hideously complex devices fraught with bugs. But at least we understand them in principle and practice.
LLMs are nonsense generators, you need a second device that can recognize correct programs to use them effectively. Only humans can do that all-important second part.
> Programming has always been about finding the next black box that is both powerful and flexible.
That's the opposite of programming. Programming is the art and science of developing reliable algorithms. You can treat programs as black boxes only after you're sure that they work correctly. Otherwise you're just engaged in a kind of cargo cult.
That's the scary thing about the LLM fad: so many people seem so willing to abdicate their responsibility to actually think.
I 100% agree with you. Whenever I see artists buy their brushes I cringe. Real artists don't draw anything until they've grown a tree and raised horses to obtain the raw materials (wood and horse hair) to make their first brush.
Using a bought brush to paint and generating a painting via a prompt are basically identical.
If LLMs were actually good for programming, I would consider it, but they just aren't. Especially when we are talking about "assistants" and stuff like that. I feel like I live in an alternate reality when it comes to the AI hype. I have to wonder if people are just that bad at programming or if they have a financial incentive here.
There are a handful of cases where LLMs are useful, mainly because Google is horrifically bad at bringing up useful search results, it can help in that regard... or when you can't find the right words to describe a problem.
What I would like to see out of an AI tool is something that gobbles up the documentation for another programming tool or language and spits it back out when it is relevant, or some context-aware question-and-answer like "where in the code base does XYZ originate", or whatever. The difference is having a tool that assists me vs. having a tool spit out a bunch of garbage code. It's the difference between using a tool and being used by a tool.
> I have to wonder if people are just that bad at programming or if they have a financial incentive here.
I have similar feelings to you, but I want to be careful about making assumptions. That being said, I see so many people making hyperbolic claims about the productivity gains of LLMs, and a huge amount (though not all) of the time they are doing low-value work that betrays their inexperience and/or lack of ability.
I have yet to see a good example of where an LLM invented a novel solution to an important problem in programming. Until that happens -- and I'm not saying it won't -- I remain extremely skeptical about the grandiose claims. This is particularly true of the companies selling LLM products who make vague claims about productivity benefits. Who is more productive, the person who solves the most leet code problems in a month or the person who implements a new compiler in the same time frame? The former will almost surely have the most lines of code, but they have done nothing of direct value. I point this out because of how often productivity is measured in lines of code and/or time to complete a problem with a known solution.
So for me, when people brag about how much more productive they are with LLMs, I wonder: ok, well, what are you building? I feel like LLMs are as likely to make people build fragile bridges to nowhere at scale as anything truly revolutionary.
I am not expecting novel solutions from LLMs. I purely use them as a tool in my tool belt.
Some examples:
Deciphering spaghetti code: LLMs are generally pretty good at picking apart code blocks and explaining the functional parts. A while ago I was dealing with code that had lots of methods on single lines with tons of conditions. I put it in ChatGPT, asked it to go over it, and it gave me a point-by-point explanation of all the logic in there. Again, I don't expect it to be perfect here; it doesn't need to be. The way my mind works, once I have the explanation I can much more easily go to the single-line mess and follow it along. If ChatGPT messed up I will see that, but I will also be much further along with the deciphering than I would have been doing it manually.
Getting a quick start on a technology, specifically if it is something I know I will only need once: it helps me avoid tedious Google searches. Instead I get a pretty decent rundown of whatever it is I need to know, as well as some basics.
In short, I don't think they are miraculous technologies transforming my work. But they are pretty good at removing some of the more tedious tasks, letting me focus on other things. So they do make me more productive in that respect.
If you are an expert in your field and are working on the same project day in day out, I don't think LLM's will offer you much.
If you are a one man shop, where you have to work with: Javascript, Typescript, Haxe, PHP, bash, WordPress, BuddyPress, npm, Selenium, Playwright, Jest, Kha, Three.js, HTML, CSS, Bootstrap, Tailwind, Nextjs, SQL, ... . And that is all next to marketing, devops, managing freelancers, ... . Well, then an LLM is a super fast and super cheap junior of everything that can quickly create something you need in seconds.
Edit: Let me make it more concrete with an example of the previous days. I registered a new domain and wanted to already have a quick landing page on there that sends people to a certain url. It would already use Nextjs & tailwind so that I can test out the setup on the server.
So I wanted to generate this quick landing page. I have a few options:
1. Do it myself, which will take some time digging into the css to style things.
2. Hire a freelancer, which is more expensive, and would take up more time.
3. Let ChatGPT generate the initial version in 3 seconds; I can suggest some changes and get a full reply in 3 seconds.
Same for a lot of other things that I do. Will ChatGPT help me out in a big, complex application that I'm writing? Probably not. But it sure has its uses for a lot of small things.
> an LLM invented a novel solution to an important problem in programming.
This will likely not happen and is a horrible benchmark for productivity. The advantage of using LLMs as a partner for coding is that I myself will have more time to generate “novel solutions” since a lot of the low level stuff that I still need to write can be done in less than half the time.
I never trust it to write code I can’t write for myself and things I can’t make tests or verify.
It’s not about generating more lines of code but having more time to think of the more mentally demanding stuff. It’s like having a junior developer that can work really fast but I will still need to check what it does.
As a very personal benchmark, at work I know how much time it takes me to build repeating things from scratch that can’t be automated. By my estimate, having the LLM saves me around an hour or two of coding a day. That doesn’t replace my time but those 20-40 hours a month of time saved is worth the 20 bucks on average I pay for it.
TL;DR: it won't make anything novel, but it gives me more time to make things that are.
There's a large continuum between great and crap, and it sounds like you've placed a rather high bar to even consider using it.
I don't like BASH scripting. I wanted to automate a certain task and dump it in a justfile for convenient reference.
Learning BASH scripting would be a poor use of my time - I didn't value the knowledge I would gain.
Using Google to piece together everything I needed would have been very painful. Painful enough that I simply didn't bother in the past.
Asking an LLM solved the problem for me. It took about 6 iterations, because I had somewhat underspecified the task, and the scripts it returned, while correct, had side effects I didn't like.
But even though it took several iterations it was infinitely more satisfying than the other options. Every time it failed I would explain to it what went wrong and it would amend the script.
It's like having an employee do the work for me, but much much cheaper.
That's the power of LLMs. They enable me to do things that just weren't worth the time in the past.
Would I use it for my main programming work? No. But does it increase my productivity? Definitely.
That sounds awful to me. I spent maybe 1 or 2 days reading the Wooledge Bash guide and Dylan Araps' Bash Bible, and now I have that skill forever. Sure, I spent more time practicing, but I can craft exactly what I want without even thinking about it. I value that knowledge. Use ShellCheck and the Bash LSP and that's it.
But the way you talk about it makes me feel weird. It honestly sounds a little insane.
If we're going with anecdotes, I have learned Bash scripting and zsh scripting separately at different points in my life. Both times I forgot it very quickly. My guess is that you value that knowledge and that helps you retain it. Practice of course helps.
For me, my primary shell both at work and at home is xonsh, which is Python-based, and 90+% of all shell scripting I do is in Python in that shell. Having that knowledge of Bash is not even worth 2 days for me. If I were an embedded developer or a system administrator where I often have to SSH into accounts I don't control I could value Bash more but that's not the case for me and a fairly significant percentage of software engineers. Why spend a few days learning it when I don't need it? Even in the example above, I did it for my convenience, not because I needed it.
That's not what I find insane though, nor is it what I said. The fact that this person is willing to sink a ton of time into iteratively coercing an LLM to spit out something that they themselves admitted had negative side effects and didn't work that well, and to describe it like hounding a personal employee in the name of "productivity", is like 7 layers of insane to me. All to avoid spending, probably, less time learning. Like, I'm baffled at what motivates people who are like this. It's gotta be money, right?
I'm a former/old-school professional programmer (now a retired CIO) and I still love automating my life with an assortment of tech (e.g. Python, bash, C#, C++, JScript, etc.). My knowledge of these tools isn't in-depth... so being able to use AI to assist with generating the base code is a godsend.
My experience using ChatGPT has been quite positive overall... sometimes it comes up with elegant solutions, sometimes it's dog shit - but generally it's a useful starting point.
I know jack squat about programming. I could at one point do "hello world" in Python, if I recall correctly. Thanks to ChatGPT I now have scripts to make my life easier in a bunch of ways and a growing high-level understanding of how they work. I can't program, and I won't claim to. But I can be useful. Thanks to LLMs. (And before the catastrophists arrive: I know enough to be wary of the risks of running scripts I don't understand, which is why I make a point of understanding how they function before I proceed.)
That's great for sure, but I still think you should take the time to learn the basics. Also, go to the r/bash subreddit and search "chatGPT" and see just how many people end up there with:
"HELP! ChatGPT destroyed my system" most commonly people want a command that will move pictures or something and the LLM spits out something feasible, they turn it into a shortcut on their mac and click it, and it is literally just 'find -type f -exec mv {} ..' with relative paths and it moves all of their critical files into random places. It is quite literally something I've seen happen at least 5 times.
There is a lot of benefit in just learning a little bit and then having the flexibility to write anything you want.
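One of those basics is worth spelling out: make any file-moving script show its plan before it acts. A sketch of that habit (the photo-sorting task and paths are hypothetical):

    import shutil
    from pathlib import Path

    def move_pictures(src, dest, dry_run=True):
        # Move *.jpg files from src to dest; with dry_run=True, only print the plan.
        src_dir, dest_dir = Path(src).expanduser(), Path(dest).expanduser()
        for pic in src_dir.rglob("*.jpg"):
            target = dest_dir / pic.name
            print(f"{'[dry-run] ' if dry_run else ''}{pic} -> {target}")
            if not dry_run:
                dest_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(pic), str(target))

    move_pictures("~/Downloads", "~/Pictures/sorted")   # inspect the plan, then run for real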
I agree I should learn to do it from scratch. And I’ve been given an added motivation to do so by seeing how useful just a few lines of code can be.
And you’re not wrong. I’ve learned the benefit of versioning when working with LLMs because if you’re not careful asking it to do one change can easily break something else. It’s far from a panacea
I'm starting to think that the people that moan the most about LLMs being terrible, might just be terrible at writing good queries.
Like everything else: garbage in, garbage out.
EDIT: I was not aiming this comment directly at you. But I've had a couple of devs try to convince me that tools like ChatGPT or Claude are garbage, and then use extremely short queries as proof.
"Write me a website with [list of specs]", and then when it either fails or spits out half-baked results, they go "See? It's garbage!"
On the other hand I've seen non-coders create usable tools, by breaking up the problem and inputting good queries for each of those sub-tasks.
Could be. I try to be verbose, but it also gets annoying to do. I usually just use some really handy GitHub search tricks and learn from other people's implementations.
>If LLMs were actually good for programming, I would consider it, but they just aren't. Especially when we are talking about "assistants" and stuff like that. I feel like I live in an alternate reality when it comes to the AI hype. I have to wonder if people are just that bad at programming or if they have a financial incentive here.
solid points there.
it is surely some of both reasons. for the bad programmers, it will be the former. for those invested in llms, it will be the latter, that is financial incentives - to the tune of billions or millions or close to millions, depending upon whether you are an investor in or founder of a top llm company, or are working in such a company, or in a non-top company. it's the next gold rush, obviously, after crypto and many others before. picks and shovels, anyone?
and, more so for those for whom there are financial incentives, they will strenuously deny your statements, with all kinds of hand waving, expressions of outrage, ridicule, diversionary statements, etc.
that's the way the world goes. not with a bang but a whimper. ;)
I'm not sure whether I'm a "good programmer" or a "bad programmer", but sometimes I just want a problem to go away in the quickest way possible.
I'm not always trying to create a timeless, perfect, jewel and there is a limit to how much I want to follow every highway and byway needed to do stuff across several dozen languages, libraries, platforms and frameworks.
>I'm not sure whether I'm a "good programmer" or a "bad programmer", but sometimes I just want a problem to go away in the quickest way possible.
True. Most programmers would think the same, at times.
>I'm not always trying to create a timeless, perfect, jewel
No one is, most of the time. Only, some people try to create somewhat good things some of the time, even given constraints.
>and there is a limit to how much I want to follow every highway and byway needed to do stuff across several dozen languages, libraries, platforms and frameworks.
Who has the time to do it, unless one is independently wealthy, doesn't need to work, and is programming just for fun (although many of us do it for fun, part-time at least)?
Yes, my sentiments exactly, and I am sure it's that of many other programmers, too.
The abstraction upon abstraction upon abstraction (Howdy, Java, but not only it) and the combinatorial explosion of technologies X their version(iti)s, is hell - like DLL hell on Windows, except much worse.
>Some days I'm just tired.
So yeah, I hear you, dude, and feel your pain.
But the topic and argument was about whether llms reduce that pain enough to be worthwhile. I guess the answer is: different strokes for different folks.
I asked both ChatGPT 4o and Claude 3.5 Sonnet how many r’s there are in the word strawberry, and both answered “There are two r’s in the word strawberry”. When I asked “are you sure?”, ChatGPT listed the letters one by one and then said yes, there are indeed two. Claude apologized for the mistake and said the correct answer is one.
If the LLM cannot even solve such a simple question, something a young child can do, and confidently gives you incorrect answers, then I’m not sure how someone could possibly trust it for complex tasks like programming.
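(For contrast, the deterministic answer is a couple of lines; the counting itself is trivial to compute, which is what makes the confident wrong answer so jarring.)

    word = "strawberry"
    print(len(word))         # 10 letters in total
    print(word.count("r"))   # 3 r's -- the answer both models got wrong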
I’ve used them both for programming and have had mixed results. The code is mediocre at BEST and downright wrong and buggy at worst. You must review and understand everything it writes. Sometimes it’s worth iteratively getting it to generate stuff and you fix it or tell it what to fix, but often I’m far quicker just doing it myself.
That’s not to say that it isn’t useful. It’s great as a tool to augment learning from documentation. It’s great at making pros and cons lists. It’s great as a rubber duck. It can be helpful to set you on a path by giving some code snippets or examples. But the code it generates should NEVER be used verbatim without review and editing, at best it’s a throwaway proof of concept.
I find them useful, but the thought that people use them as an alternative to knowing how to program or thinking about the problem themselves, that scares me.
Sorry, but this 'benchmark question' really isn't all that useful. Asking an LLM questions that can only be answered at the letter level is like asking somebody who is red-green colorblind questions that can only be answered at the red-green level. LLMs are trained by first splitting text into tokens that comprise multiple letters, they never 'see' individual letters.
The 'confidently answering with a wrong solution' aspect is of course still a valuable insight, and yes, you need to double-check any answer you've received from an LLM. But if you've never tried GitHub Copilot, I can recommend doing so. I'd be surprised if it doesn't manage to surprise you. For me it was actually really useful to get those parts of code out of the way that are essentially just an 'exercise in typing', once you've written a comment explaining the idea. (It's also very useful to have a shortcut to quickly turn off its completions, because otherwise you end up spending more time reading through its suggestions than actually coding, in situations where you know it won't come up with the right answer.)
When asked to prove it, it spelled out the letters one by one, and still failed (ChatGPT asserted the answer is still 2, Claude “corrected” itself to 1). Only when forcing it to place a count beside each letter did it get it correct.
It’s not really about the specific question, that just highlights that it does not have the ability to comprehend and reason. It’s a prediction machine.
If it cannot decompose such a simple problem, then how can it possibly get complex programming problems that cannot be simply pattern matched to a solution correct? My experience with ChatGPT, Claude, and copilot writing code demonstrates this. It often generates code that on the surface level looks correct, but when tested it either fails outright or subtly fails.
Even things like CSS it gets wrong, producing output that on the surface seems to do what you asked but in fact doesn’t actually style it correctly at all.
Its lack of ability to understand, decompose, and reason is the problem. The fact that it’s so confident even when wrong is the problem. The fact that it cannot detect when it doesn’t know is the problem.
It generates text that has high probability of “looking” correct, not text that has a high probability of being correct. With simple questions like the one I posed, it’s obvious to us when it gets it wrong. With complex programming tasks, the solution is complex enough that it often takes significant effort to determine if it’s correct or wrong. There’s more room for it to “look” correct without “being” correct.
> But if you've never tried GitHub Copilot
I’ve used it for almost a year before I cancelled my subscription because it wasn’t adding much value. I found copilot chat a bit more useful, but ChatGPT was good enough for that. I still use ChatGPT when programming: as a tool to help with documentation (what’s the react function to do X, type questions), to rubber duck, to ask for pros and cons lists on ideas or approaches, and to get starting points. But never to write the code for me, at least not without the expectation of significant rewriting, unless it’s super trivial (but then I likely would have written it faster myself anyway).
Thanks for taking the time to answer so thoroughly :)
In that case I stand corrected, I'd just assumed you hadn't used Copilot because, to me, it was so much more effective at aiding with programming than ChatGPT. But I suspect that very much depends on the use-case. I liked it a lot for e.g. writing numpy code, where I'd have had to look up the documentation on every function otherwise, or for writing database migrations by hand, where the patterns are very clear, and in those situations it felt like a huge time-saver. But for other applications it didn't help at all, or admittedly even introduced subtle bugs that were fun to find and fix.
After my free year of Copilot ran out I also didn't re-subscribe, because at this point I have too many AI-related subscriptions as it stands, but I'd definitely (carefully) use it if I had access to it via an org or an open-source project.
To be completely fair, there are some things I did have success with getting code generated for. For example, I made a little python script to pull fields out of TOML files and convert them to CSV (so that I could import the data into a spreadsheet). It did mostly ok on this (in that I didn’t have to edit the final code that much and it was in fact faster than writing it all myself).
But the cases where I find its code was good enough are 1) fairly easy tasks (ie I don’t need AI to do it, but it still saved some time), and 2) not that common for the type of development I’ve been mostly doing. The problem is that I’ve often wasted significant time to figure out whether or not it’s one of these tasks, so in the long run it just doesn’t feel that useful to me as a “write code for me” tool. But as I said, I do find AI a useful aid, just not to write my code for me.
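For a sense of scale, that kind of throwaway script fits in a dozen lines. Here is a hedged sketch of the general shape, with made-up field names ("name", "version", "license") rather than whatever the commenter's actual data looked like:

```python
import csv
import glob
import tomllib  # standard library in Python 3.11+

# Hypothetical fields to extract; the real script's fields were not given.
FIELDS = ["name", "version", "license"]

with open("out.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for path in glob.glob("*.toml"):
        with open(path, "rb") as f:   # tomllib requires a binary file object
            data = tomllib.load(f)
        # Missing keys become empty cells rather than raising.
        writer.writerow({k: data.get(k, "") for k in FIELDS})
```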
I tried ollama with llama 3 and 3.1, code llama, phi3, zephyr, chatgpt 4, all of what tgpt (cli tool) offers, copilot and a couple of others I don't remember. I primarily use Golang, as well as C, Zig and bash, and I'm learning Rust.
copilot annoyed the shit out of me and I barely get any useful code from LLMs. I think the most help I get from LLMs is asking things like "what does this operator '~uint64' mean" or about other non-common language constructs. I'll primarily just pull up open-source code that is of verifiably high quality and learn from that.
I also find programming extremely enjoyable, my means of expression, and an art form. I have hundreds of side projects in my archive, maybe five of which have ever been used by another human. It's all for the sake of coding. Many of them are sizeable and many are not but they are almost all done as a creative outlet, for the joy of doing it or to satisfy a curiosity.
But I don't know man, I love coding with LLMs. It just opens up more things, I think on some projects I actually spend MORE time on traditional coding than I did in the past, because I used an LLM to write scripts to automate some tedious data processing required for the project. And there's also projects where the LLM gets me from 0 to 60 and then I rather quickly write the code I actually care about writing, and may or may not end up replacing all the LLM written code.
I'm sure it heavily depends on exactly what types of project interest you. The fact that LLMs and diffusion have both become fixations of mine also means I have a lot more data processing involved in lots of my projects, and LLMs are quite good at custom data processing scripts.
I suppose my suggestion to the author would be that perhaps their projects aren't amenable to LLMs in the way they want and that's fine, but don't lose hope that there are kindred spirits out there just because so many people love LLM coding; some of us are both and that may be more about what types of projects we do.
I think that's a big gulf people don't appreciate. I don't enjoy programming. When I program a microcontroller for my hobby projects it's a means to an end. I would love a tool that takes in a natural language description of what I want and outputs code, and LLMs are good at doing that for basic tasks.
This is a post that just reads as if the author is still in the “honeymoon” stage of their career where programming is seen as this extremely liberating and highly creative endeavour that no other mortal can comprehend.
I get the feeling and I was there too, but, writing code has always been a means to an end which is to deliver business value. Painting it as this almost abstract creative process is just… not true. While there are many ways to attack a given problem, the truth is once you factor in efficiency, correctness, requirements and the patterns your team uses then the search space of acceptable implementations reduces a lot.
Learn a couple of design patterns, read a couple of blogs and chat with your team and that’s all you need.
Letting an LLM write down the correct and specific ideas you tell it, based on what I wrote earlier, frees up your time to do code reviews, attend important meetings, align on high-level aspects and help your team members, all of which multiply the value you deliver beyond what you could deliver through code alone.
Let LLMs automate the code writing so I can finally program in peace, I say!
I get that at some point you have to put food on the table, but why conflate the enterprisey, economical, object-oriented mess of things with your hobby?
You can, in theory, still program elegant little side projects with no pretense of business value or any customer besides, maybe, yourself.
I find that my work-coding and hobby-coding are different enough that they don't even feel like the same activity
When I write a side project I'm not doing it because I enjoy coding, I'm doing it because I enjoy problem solving, and I've thought of a potential problem that I could solve with code, and I want to prototype that idea. I'm not that fussed about elegance; ideally, I want to prototype that idea as quickly as possible so I can see if it's valid or not, so I'd much rather use any tool I can to get me there as quickly as possible.
This sounds like a nightmare to me. The last thing I want out of my work day is to attend more "important meetings" and "multiply" my value. This is the kind of thinking that makes us less human, just widgets that are interchangeable. No thanks.
I still very much felt like I was creatively crafting this [0] project even though the entire approach used the Claude project feature. I had to hand-write some sections but for the most part I was just instructing, reading, refining, and then copying and pasting. I was the one who instructed the use of a bash parser and operating on the AST for translation between text and GUI. I was the one who instructed the use of a plugin architecture to enforce decoupling. I was the one who suggested every feature and the look of the GUI. The goal was to create an experimental UI for creating and analyzing bash pipelines. The goal was not to do a lot of typing!
These high level abstractions are where I find the most joy from programming. Perhaps for some there is still some modicum of enjoyment from writing a for loop but for most people twenty years into a career there's nothing but the feeling of grinding out the minutia.
There's still a lot of room for better abstractions when it comes to interfacing with computing devices. I'd love to write my own operating system, CLI interface, terminal, and scripting language, etc from scratch and to my own personal preferences. I don't imagine I could ever have the time to handcraft such a vast undertaking. I do imagine that within a few decades I will be able to guide a computing assistant through the entire process and with great joy!
English, and other languages, are vague and imprecise. I've never understood why folks think they can write code "more efficiently" with a prompt rather than code? Are people willing to give up control? Let the LLM decide what is best? The same is true for generative art -- you get something, but you only have marginal control over what. I think this will always be something that is useful for the simplest things, simplest apps, simplest art, etc.. A race to the bottom for the bottom of the complexity stack. As problems become more complicated, it would take a great deal more prompt language to specify the behavior than code.
Nobody in their right mind gives up control or lets the LLM decide what's best. They take the output of the LLM as a starting point, changing it where necessary to accomplish their goals.
This is no different than starting with boilerplate code, e.g. from a tutorial or manual or other project (or, heaven forbid, an Internet search), and changing it for your specific needs.
As for imprecision, a developer is perfectly capable of writing imprecise, ambiguous, or just plain buggy code by themselves. If you think your code is better than that produced by an LLM (and I'm not saying it isn't), then by all means use it. But the fact is that an LLM, in some cases, can produce code better and/or faster than it would take some people to write themselves who don't want to spend several minutes or hours stumbling through finding the exact syntax or algorithm, let alone Googling for a good starting point.
As for the non-determinism that many people are decrying, this is no different from Googling and finding several examples, each of which is written for a particular use case, none of which ever seem to match yours exactly. Caveat emptor if you simply cut and paste without reading, just as with code produced by an LLM.
this is one of the most perspicacious comments I have ever read about llms, with the disclaimer that i have not read a large number of comments on the subject. I have read some however.
however, referring to the first paragraph of the comment above, the problem is that many people are not in their right minds :)
As I said, the LLM will be a good replacement for the simplest cases. The behaviors you describe are far more common among junior programmers working on simple apps.
> The behaviors you describe are far more common among junior programmers working on simple apps.
Or among expert programmers who need to code in something out of their domain for a small portion of their job.
Suppose you suddenly are required to write a VBA macro for Excel for your job. It's a one off task - not something you'll do repeatedly. Do you prefer learning VBA for Excel and crafting a solution or asking the LLM and verifying its solution by looking at the docs?
Hint: If you use the macro recorder in Excel and inspect the code you are closer to the LLM end of the spectrum.
For most things I approach it from both sides: I give an English prompt but also provide some interface, function signatures, or even just a return type that constrains the LLMs output.
For example, I frequently have to implement Serde's traits for some serialization format or to marshal types, most recently to translate types from Rust to Qt QML's Javascript. By giving it some context (Serde traits and Qt docs) I managed to do it with Claude in about an hour, which is roughly how long it would have taken me just to get up to date with the documentation if I had tried it myself.
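The Serde/QML specifics above are Rust-side, but the general trick of constraining the model with an interface is language-agnostic. Here is a rough Python-flavoured sketch of what you might paste into the prompt alongside the English description; the names and types are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Target type the generated code must produce."""
    sensor_id: str
    celsius: float


def parse_measurement(line: str) -> Measurement:
    """Parse a line like 'sensor-7,21.5' into a Measurement.

    Raises ValueError on malformed input.
    """
    raise NotImplementedError  # body intentionally left for the LLM to fill in
```

Because the return type and the error contract are pinned down in advance, there is much less room for the model's output to wander.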
LLM output can't really be trusted so I need to "proof read" it and convince myself that it is correct. In the language I use every day and have a high degree of fluency, it's faster for me to simply write what's in my head than to proof read unknown code. So how can LLMs make me more productive in actual programming?
I use an LLM to generate ideas, to rubber duck, to get a lead on unknowns, and to generate boilerplate occasionally. So I do everything except replace the coding part because that's what requires the most precision, and LLMs are bad at precision. And yet, people claim massive productivity gains in specifically coding. What am I missing?
It helps a lot if you use a strongly typed language with a strict compiler and get the LLM to write plenty of tests.
Then you need to understand that the tests are logically correct. The LLM is also good at documenting the functions, so you can review that it matches your intentions and the code as well.
Also the LLM will pick functions from your own source code library to compose new programs for you.
So the reuse of your own well tested code should increase confidence.
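As a rough illustration of that division of labour (Python type hints checked with something like mypy standing in for a strict compiler, and an invented function and test rather than anyone's real code), the human's review effort goes into the expected values in the test rather than the keystrokes:

```python
def normalize_scores(scores: list[float]) -> list[float]:
    """Scale scores so they sum to 1.0; empty or all-zero input comes back as zeros."""
    total = sum(scores)
    if total == 0:
        return [0.0 for _ in scores]
    return [s / total for s in scores]


def test_normalize_scores() -> None:
    # Reviewing this means checking that the expectations are logically right:
    # does [0.25, 0.75] really follow from [1.0, 3.0]?
    assert normalize_scores([1.0, 3.0]) == [0.25, 0.75]
    assert normalize_scores([]) == []
    assert normalize_scores([0.0, 0.0]) == [0.0, 0.0]
```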
> So how can LLMs make me more productive in actual programming?
Suppose you suddenly are required to write a VBA macro for Excel for your job. It's a one off task - not something you'll do repeatedly. Do you prefer learning VBA for Excel and crafting a solution or asking the LLM and verifying its solution by looking at the docs?
Hint: If you use the macro recorder in Excel and inspect the code you are closer to the LLM end of the spectrum.
For me LLMs are like programming power tools. Use them wrong and you can hurt yourself. Use them right and you can accomplish far more in the same amount of time.
People that refuse to program with AI or intellisense or any other assistance are like carpenters who refuse to build furniture with power saws and power drills. Which is perfectly fine, but IMO that choice doesn't really affect the artistry of the final product
> For me LLMs are like programming power tools. Use them wrong and you can hurt yourself. Use them right and you can accomplish far more in the same amount of time.
Fun analogy because if you're especially negligent you can injure yourself so badly you'll make programming forever more difficult than it needs to be or end your career altogether - like with a tablesaw cutting off fingers.
I use an LLM precisely BECAUSE I want to focus on the art. Like da Vinci would use apprentices.
LLMs can do mindless drudgery just as well as I can, but in seconds instead of hours. There's nothing about remembering syntax, boilerplate code, forgetting a semicolon, googling the most common way of doing something, or combining some documentation to fill in the gaps that's even remotely "art" to me.
I never ask an LLM for what I'm artfully creating. I ask it for what I know it'll get instantly right, so I can move on to my next thought.
I have a lot of different thoughts as to why using an LLM feels "off". One I've been thinking about as of late is that it feels flawed to measure productivity by code velocity, i.e. lines of code written per hour.
Like, ideally, it shouldn't really take that much code to implement a thing. I like to think of programming as writing a bunch of levers, starting with simple levers for simple jobs, incrementally ratcheting up to larger levers lifting the smaller levers. Before too long, it'll feel as though you've written a lever capable of lifting the world...or at least one that makes an otherwise wickedly difficult project reasonably manageable.
If you say that LLMs make you more productive because it allowed you to finish a project that would otherwise take forever to write, then I'm skeptical that an LLM is the best solution. I mean, it's a solution at least, but I can't help but wonder if there's a better solution.
If the problem is that you lack the understanding to take on such a project, then perhaps what we really need are better tools for understanding. I myself have found that LLMs are great for gaining a quick understanding of languages that otherwise have sparse information for beginners, but I have to wonder if perhaps there's a better way.
If, on the other hand, the problem is that writing that much code would take forever, then I have to wonder if the real solution is that we need a better way to turn programming languages into patterns (levers) and turn said patterns into larger patterns (larger levers)
A partial solution works, but only partially well, and occasionally has consequences one has to reckon with.
I'm of the opposite opinion: I've started enjoying programming much more after embracing LLMs.
* They are great for overcoming procrastination. As soon as I don't feel like doing something or a task feels tedious I can just delegate it to an LLM. If it doesn't solve it outright it at least makes me overcome the initial feeling of dread for the task.
* They give me better solutions than I initially had in mind. LLMs have no problem adding laborious safeguards against edge-cases that I either didn't think of or that I assessed wouldn't be worth it if I did it manually. E.g. something that is unlikely and would normally go to the backlog instead. I've found that my post-LLM code is much more robust from the get go.
* They let me try out different approaches easily. They have no problem rewriting the whole solution using another paradigm, again and again. They are tireless.
* They let me focus on the creative parts that I enjoy. This surprised me since I've always thought of myself as someone who loves programming but it turns out that it is only a small subset of programming I love. The rest I'm happy to delegate away.
> This surprised me since I've always thought of myself as someone who loves programming but it turns out that it is only a small subset of programming I love.
I am the same, and that's why many of my personal projects end up stranded. Once I've solved the tricky bit, the rest often isn't that motivating as it's usually variations on a common theme.
I held off LLMs for a long time, but recently been playing with them. They can certainly confidently generate junk, but in most cases it's good enough. And like you say can be used as a driver to keep going. In that regard they can be useful.
This is exactly how I use LLMs - I can automate the really boring parts. "Can you write me a Swift codable struct for the following JSON" will save my fingers and precious mental energy for the important and interesting parts.
It's like having a junior dev that doesn't complain and gets the work done immediately.
AI code suggestions as I type are however a different beast. It's easy to introduce subtle bugs when the suggestion "kinda looks right" but in fact the LLM had zero understanding of the context because it can't read my mind.
Same, these are all great points that I find as well. LLMs have made me a way more productive programmer, but a lot of that is because I already was an alright programmer and know how to take advantage of the strengths and weaknesses of the LLM. I think your last bullet point is most poignant, using Claude 3.5 I've been able to do tons of GUI and web programming, things I absolutely despise and refuse to do if I'm writing code by hand.
I sort of understand some of the vitriol that I see on HN but it is incredibly overblown. I don't really get a lot of the criticisms. LLMs aren't deterministic? Neither are humans. LLMs write bugs they can't fix? So do humans. LLMs are only good at being junior programmer copy paste machines? So are lots of humans.
My current project is training an LLM to do superoptimization and it's working exceedingly well so far. If you asked anyone on hacker news if that's a good idea, they'd probably say no.
I do sometimes get the impression that there will be a generational gap in ability to code between millennials and zoomers.
We had an overdemand for devs during late ZIRP and early COVID, leading to bootcamps and self-taught routes pulling a lot of untrained people into the industry. Many of them have left the industry.
Add to that the whole data science bubble and it’s bursting where we had tons of degrees and job openings for sort-of-devs. Lot of those jobs are gone now too.
Don’t forget the pull of “product management” and its demise outside big tech.
Now we have hiring freezes and juniors leaning on LLMs instead of actually spending an hour trying to solve problems.
I feel the same. I understand why others think ai is just another tool like intellisense, but for me intellisense and any other automatic refactor is a fixed algorithm that I understand and that I know exactly what it is doing, and I know that it is correct.
With AI I need to review the output, not because there may be some issues I didn't notice, but because there may be issues the tool itself didn't notice, so it's less of an "apply this specific change" and more of an "apply some change".
With intelliSense, I disregard about 98% of the suggestions. Do people do that with generated code? Doubtful. With an LLM, even that last 2% requires more effort because it creates weird and irrational bugs that I have to review.
Exactly, now replace code with AI generated art, photos, drawings, videos, music. Your employers couldn't care less if it's convincing enough to ship. Even better now that it only takes seconds to minutes.
We are at the cusp of creative destruction and we are only getting started. Ironically, blue collar jobs seem safe as there hasn't been a humanoid revolution and what I see in the white collar field is what blue collar workers experienced before the automation and offshoring of jobs
I think AI art is actually an interesting example. It’s mostly run of the mill schlock to replace clip art / stock images.
Adequate but not enjoyable. Lorem ipsum of visual art. Probably kills some basic graphic design jobs at the margin working on low budget projects.
Meanwhile big brands using big agencies will just incorporate it into their design process. McD Japan is a recent example. You still need a human with an eye and taste to be the editor.
But no one is reading or viewing AI art for pleasure. It’s all “that’s neat” (continues scrolling).
The chances of landing an IT job that I can take pride in, that is 100% remote, and that pays well are very low. To begin with, I dislike all the faang-like companies.
Besides, being good at programming makes it easier to deal with BS jobs that pay well, so it’s not that I suffer 40h/week.
To have all these thoughts, I think you'd have to have never really used an LLM to help you code, or to be almost comically closed-minded when you do. What they feel like when you actually use them is a combination of a better SO and a very prescient auto-completer. It does not at all feel like delegating programming work to a robot. No loss of artistry comes into play, and it's damn useful.
In an ideal world, our abstractions would be so perfect that there would be no mundane boiler-platey parts of a program; you'd use the abstractions to construct software from a high level and leave details be. But our abstractions are very far from perfect: there's all kinds of boring code you just have to write because, well, your program has to work. And generally that code is, if you look, most of your code. This is because making good abstractions is really hard and constructing fresh ones is often more work than just typing out the different cases. If you think this is mistaken, I'd gently suggest you take a fresh look at your own code.
Anyway, that's where LLMs come in. They help write the boring code. They're pretty good at it in some cases, and very bad at it in others. When they're good at it, it's because what the code should do is sort of overspecified; it's clear from context what, say, this function has to do to be correct, and the LLM is able to see and understand that context, and thus generate the right code to implement it. This code is boring because it is in some vague sense unnecessary; if it couldn't be otherwise, why do you have to write it at all? Well you do, and the LLM has taken care of it for you.
You can call this work the LLM is displacing "art", but I wouldn't. It's more the detritus of art performed in a specific way, the manual process required to physically make the art given the tools available.
You could object that the LLMs will get better in the sense that not only that they will make fewer mistakes, but they will be able to take on increased scope, pushing closer to what I'd consider the "real" decisions of a program. If this happens -- and I hope it does -- then we should reevaluate our lofty opinions of ourselves as artists, or at least artists whose artistry is genuinely valuable.
Author needs to get into Bret Victor. Has no idea how much more fun he could be having.
Programming is a step on the way to access to the state space of information. When we get to that stage, programming will seem like a maze of syntax, that has its own idiosyncrasies that force you into corners or regions in the state space, just like any DAW plugin or 3D tool, or any tool at all that exists.
This might be the most "zoomed in" take on programming I've ever read (where a zoomed out take understands that software usually just enables a business to do business). I almost thought it was satire.
I feel like you have to drop this kind of thinking to get anywhere past intermediate, not to mention you become a nightmare to anyone who has even a touch of pragmatism about them.
I used to work as a programmer, but have since pivoted into analysis - so I just view programming as another tool in my toolbox to solve problems. My main goal is to deliver insights and answer questions.
Sad to say, LLMs have made me a lazy coder for the past two years or so. But I do deliver/finish work much faster, so my incentives to keep using LLMs as coding co-pilots are overshadowing my incentives to write code the "old way".
And for what it is worth, sometimes these posts read like modern luddite confessions - the rants just sound too personal.
I don’t use an LLM largely because my current codebase has a massive amount of bespoke internal APIs. So LLMs are just useless and wrong for almost any task I use.
But this has led me to wonder if there will be gradual pressure to build on top of LLMs, which, in turn, will really only be useful with the tried and true. Like, we’re going to be heading towards an era where innovation means “we can ask the LLM about it”. Given the high capital costs required to train, I wouldn’t be shocked to see LLMs ignoring new unique approaches and biasing to whatever the big corps want you to do. For “accuracy”.
I just sense we're about to hit an era of software causing massive problems and costs, because LLMs are rapidly accelerating the pace of accidental complexity, and nobody really knows how to make money off them yet.
LLMs will become as evil as calculators: they obviate the mathematical ability and create reliance in most, but are a magnificent aide to experts--those who could survive without a calculator albeit at the cost of speed.
As we approach this point, programming (program creation) will become a largely non-technical skill but the few who have that skill will be needed to solve the problems which LLMs create and which its operators don't have the know-how to fix.
I use LLMs as an alternative to asking a question on stackoverflow. And I'm not confident LLMs can currently save time on large projects: take a couple of hours to write the code or spend a few seconds to generate the code and then a couple of hours understanding and debugging it.
I agree with your sentiment. It's like asking a 3 year old to compose a symphony because they learned to play twinkle twinkle little star on a casio keyboard reasonably well.
I don't see it as a "judgemental piece to anyone that i am alluring to" (I think you mean alluding) but I do think it's an honest assessment of those who are attempting to rely heavily on AI.
It is a judgemental piece because the writer is putting a value judgement on people who use LLMs to code and those who don't. It's right after the piece you quoted.
> please don’t take this as a judgemental piece to anyone that i am alluring to. it’s fine to not find programming enjoyable. it’s fine to just want things to work. i am just disappointed at how the ones who care appear to be an ever dying breed.
I'm not saying the author shouldn't write that, write whatever you want. But you should own what you write.
I feel that a large divide in a programmer's perception of the utility of LLMs as an efficiency boost for coding is based on a large divide in specific skills. A very capable and experienced programmer, as I have often found in industry, may not be that fast at comprehending code that they did not write. On the other hand, I feel I am able to rapidly proof read code -- be it from my colleague in code review, stack overflow, or the output of an LLM. As a result, Copilot has let me write many Bash and Python scripts at breakneck speed. On the other hand, I have allowed LLMs to decide on the architecture of small programs before, and it has often created a garbage, unintuitive code structure. It is a tool that you must use to its strengths and your own personal strengths. If proof reading the code is slower than writing it for you personally, then it sounds wise to avoid the tool.
Not sure where LLMs will be in 5 years, can only talk about today and make some guesses about the future.
I use LLMs cause it takes care of the syntax and leaves me more time to think about the logic and how everything is connected together. LLMs still lack in the latter category.
For one-off tasks, it's truly a time saver. In the future, I see people trying to do more complex stuff with chains of LLM calls (already happening but it's still a WIP). Break a complex task into sub-tasks, then solve them sequentially to achieve something non-trivial in the end. I'm a bit skeptical though - not so much because of my skepticism of LLMs in general but more because humans aren't good at maintaining code bases they didn't write (often even with code they wrote). You cannot trust LLMs to also maintain the codebase. There's too much context about the life of the code, all the dependencies, and even business logic, that is missing.
I think of myself as a code artist too but definitely not leaving the productivity boost of an LLM behind. The thing just helps me write stuff that I was going to write anyways!
I find the post confusing. The author notes two aspects of programming and then seems to conflate LLMs as doing both when they only do the second. It’s sort of like saying directing a movie isn’t art or creative because the director is not every actor in the movie?
I love to program! I started in 1993 when I was 12 and I'm 42 now. I've never lost my love for it. Been a workaholic most of my life because I just love to code. However... after using Claude 3.5 with Cursor I realize that it's over. I'm writing whole apps in a day, sophisticated ones, that would take me a couple of weeks before, and way more energy spent.
I wrote a whole Slack clone (not feature complete of course) in just 1 day. I love to code, and I do have to at times but this feels like the end of what was the joy of my youth and adulthood, my favorite pastime is gone. I will still develop applications and enjoy my creativity but with probably 1/50th the effort it took before.
I tend to agree with this post. I am the only person on my team and one of the few at my company not using LLMs to assist my programming. So far I don’t think I notice any difference (better or worse) … we’ll see how this experiment plays out I guess.
Rather than writing the code for me, I would appreciate the LLM functioning as a pair programmer, making comments on the decisions I make as I write the code. Another case that would be extremely useful, but is basically impossible now, is to have the LLM write tests for my new code as I write the code. Of course understanding the test scaffold and what's important to test is way beyond the capability of current LLMs. I only write integration tests, and never unit tests. An LLM could probably code unit tests, but I feel that not particularly useful. However, I could see this becoming more useful than actually writing code in the long run.
Lots of people I know use llms for different reasons and more importantly in different ways.
For me it feels like it enables me to express myself even better. I use one basically as a supercharged intellisense or clever autocomplete. Finishing lines as I am typing them. Exactly the way I was going to write them.
It saves time on the more boilerplate glue code and allows me to maintain a better flow and momentum in more expressive areas.
I don't see it taking away anything but only enabling me to do more, faster, and better. I'm not telling it to write full apps because it can't.
I feel like I wrote this post. Like the author, as a formally trained painter and printmaker turned developer, I find development akin to developing a painting: working general to specific and letting solutions reveal themselves as I work through the problem; learning patterns and mirroring them; creating a logic other developers can follow as they work with the code in the future; and providing an API for consumers to use that not only provides for the requirements but creates patterns that reveal the internal structure to the consumer.

LLMs creating code to be consumed by LLMs basically means none of that craft matters. It also means humans won't be able to work with this code. It doesn't matter what developers do. This is the future because the LLM is cheaper than paying someone a living wage and the businesses will go with that. I used to bemoan consultant-written "good enough" code. The future holds "WTF is all this code even doing" LLM code, likely with hallucinatory variable naming.

If you think about textiles or furniture there is a huge difference between what was made by artisans and what is spat out of an industrial process. The problem programming faces is that although a person who buys a hand made rug can tell the quality, a person who downloads an app has no idea outside of battery use and laggy UI if the coding is poor. I can always go back to portraiture, I guess...
I suppose I get where the sentiment is coming from and anybody can use whichever tools make them happy but I feel like the comparison doesn’t make too much sense to me. As programmers we leverage so many levels of abstraction that help us write better code. It feels similar to saying if you use some package or library you’re letting some library author do the painting for you. Or if you use a high level language instead of assembly you’re letting a compiler do the painting for you.
There is a difference between having code and writing code. For example, sometimes I want to have code that I don't want to write. I often use libraries from package managers for this case. Using existing libraries frees me to write the actual parts of the program that I value (and/or enjoy) while skipping some of the boring parts I do not value (and/or enjoy).
It seems to me that there is a middle ground between writing programs myself and downloading code from a package manager. LLMs fill up some space within that gap.
I think refusing to use an LLM because of "reasons" is the same as refusing to use packaged libraries for "reasons". That is, reasons for both definitely exist but I consider that kind of stubborn intransigence to be a sign of mental disorder.
I recommend simply adjusting the criteria you use to decide when and when not to use packaged libraries to help determine when and when not to use LLMs.
Another important point: don’t foster a dependency on tools you don’t own. If you run an LLM locally, fine, but entwining your career with a SAAS product like OpenAI may prove to be a critical (and expensive) mistake down the line.
You’re not automating the “art” with llms. You’re automating stretching the canvas and squeezing out the paint. LLMs by design can’t create anything fundamentally new. Which is the “art” part of the whole exercise.
I agree with this article. Coding is the easiest and best part of the job. Way more time is spent testing than coding. Why take away the fun part, making your job only about the verification of it?
To be honest, I want an LLM to do my work, which means coding for me. But I won’t quit coding as my hobby. I love to build and tackle challenges. I just hate the pressure of deadlines, boring coding tasks, spaghetti code from my team members, code style differences between me and others, and so on. Coding has turned into corporate stuff. I do it for a living, but I really don’t like doing it with all these issues in the mix.
I like to code in the middle of the night, solving very complex problems and building cool applications.
Programming is just a means to an end, please stop this cringe romantic rhetoric. If you really love programming you don’t care about the medium; the fact that a program is represented as text is just a transient phase in history due to the tech we have available at this specific moment in time. Programming (today) is expressed as text, LLMs auto-complete boring text, so that we “the artists” can write more of it faster, end of story.
I love the craft of coding, and I still use ChatGPT every day to good effect. It writes the boilerplate code for me, and leaves the fun part to me. Don't underestimate the power of getting to a good, working solution faster.
Programming was mostly a hobby in the days of 8-bit PCs. It was a profession for some decades. Maybe it will be a hobby again in 5 years. Like gardening, sailing, fishing - professions at one time, now hobbies.
On the other hand, the arrival of futuristic capabilities like computers speaking human languages is what drew me to technology in the first place. Luckily, you can choose to look forward and backward. You don't have to pick only one.
I have noticed something recently in developer blogs. I only know it's not AI generated if it contains spelling mistakes. It's 99% not AI generated spam if it is completely uncapitalized. If the author fixed these "errors", my certainty that it's not AI generated garbage drops to 70%. For this reason, I may adopt an uncapitalized style in my own blogs, though I'm sure it will annoy many people.
Lol, yeah I was being cheeky here but when I see so many spelling / grammar related mistakes, I wonder can they not at least put the text through Microsoft Word or something?
Do we have grammar/spelling checkers for markdown editors or whatever most are using to write their blogs?
There is no one answer to the question of whether to use LLMs for code. LLMs help with robotic transformations, SDK knowledge, glue code, and even with trivial algorithms, but not so well with novel algorithms. Novel algorithms are where I ignore what the LLM says, and do my own thing.
If we're going to have LLMs writing code that we depend on, then we for damn well sure better have both typechecking and unit tests also in place. Hell, make it do TDD, since humans don't seem very keen on that despite its empirical advantages.
"LLM is not for me" sounds right. But, if you want to use LLM to avoid building the tedious parts of a project like the user interface, APIs, etc., and decide to code your algorithms the old-fashioned way, that's fine too.
What is up with these blogs that are apparently not made with the purpose of being read? Having sentences not start with uppercase letters really makes it one big mesh of letters. Certainly when they are run-on sentences only separated by commas.
Not helped by the choice of using a monospaced font. I get that it is often an aesthetic choice, but given that a blog post is written with the idea to be read, one I don't think is a particularly good one. Although the last time I made a remark about that on HN it became clear to me that a lot of people don't see the issue. Even if there are decades worth (at this point) of research that makes it clear that a sans serif font (or even a serif font on modern displays) works better for readability. ¯\_(ツ)_/¯
In this case though, with the combination of the monospaced font, everything being in lowercase, and the run-on sentences, I really am scratching my head.
Are you trying to get a message out there? Or are you mostly going for aesthetics?
It seems that they are doing it everywhere on their website, also in most of their recently started projects.
Ironically, while looking into it, I found that one of their projects seems to use generative AI to make it work.
It might be because of what you said, though the same way I asked an LLM to insert proper uppercasing I can also ask it to remove it. So it would be more for the "vibes" than anything.
Below you'll find the fixed version an LLM provided. I simply asked it to add uppercase letters where appropriate and split up sentences where needed. You are welcome:
---
As LLMs get better and better at writing code, more and more people, at least on twt, have started to incorporate LLMs into their workflow. Most people seem to agree that LLMs have been a game changer for coding. They praise them for how much they have improved their productivity. They also mention how much easier it is to write code. Some claim that programmers who refuse to use them are "not using them correctly" and will eventually get left behind.
In my opinion, the effectiveness of LLMs in coding at their current state is vastly overblown. Even if LLMs were as good as their avid users claim, I still wouldn't see myself using them in any meaningful capacity.
## The art of programming
Programming can be broken down into two parts. The first is solving problems algorithmically, breaking problems into steps that computers can follow within some constraints, thus forming a solution to the original problem. The second is expressing the solution in a way that the computer can understand.
Both parts provide the programmers with an infinite canvas on which they can express their creativity. There are practically limitless ways to approach and solve a problem, and practically infinite ways to express a solution to the problem. Hence, programming is a form of self-expression - it is an art form. What is produced through programming is a kind of art - an art few appreciate.
## I am a programming artist
In that sense, I see myself as an artist, one that expresses his creative self through programming. I enjoy creating programming art, because only through it do I find my true self, one who has a burning passion to create and build things.
## LLM is not for me
Using an LLM to write code is like asking an artist to paint for you. If you only want the end result, by all means! If you are like me and enjoy the process of painting, then why would you bother automating the fun part away? One may say, "But I am only using the LLM to write code. I am still doing the problem solving myself!". To me, programming isn't complete if I don't get to express the solution in code myself. It isn't my art if I don't create it myself.
## A sad reality
It is sad to me just how much people are trying to automate away programming and delegating it to a black box that can't even count letters in a word sometimes. They are even going as far as trying to emulate a software engineer on top of the black box. Does no one find programming fun anymore? Does no one care enough about programming to go further beyond getting things working "well enough"? Is this just another case of availability bias?
Please don't take this as a judgmental piece to anyone that I am alluding to. It's fine to not find programming enjoyable. It's fine to just want things to work. I am just disappointed at how the ones who care appear to be an ever dying breed.
Seeing luddism in programming is hilarious to me. Keep writing your machine code old man, we'll pick up the slack for you while you fade into obscurity.
The changes to the industry are definitely coming. But AI will affect more than just programming. And handmade products will cost more than autogenerated ones.
Programming started to suck earlier, when companies began to orient toward fast income and preferred poor-quality templated projects over well-made, efficient ones.
The world is changing. Many people have nothing to do, and with AI the number of them will grow. It's scary.
This is true at any instant in time, but it fails miserably over time as support and feature enhancement becomes important. A poor design from an LLM that cannot be updated, fixed, maintained, and improved is an issue that affects customers.
All that generated code would still need to be maintained, no? What happens when the customer finds a bug and demands it be fixed? Would the AI be able to find the buggy code and fix it?
Programming remains a deeply misunderstood, and often disrespected, medium. Any time someone tries to claim that we "won't need programmers anymore" in the future to accomplish X, Y, or Z, I simply behold somebody who doesn't understand the medium.
Of course I understand that the "sweet spots" of logic, design, and platform architecture shifts over time. I can work at a higher level of abstraction now than I did when I started out writing PHP in 1999. I'm also willing to believe low-code/no-code solutions or turnkey solutions will get better over time. There's a reason I don't try to find restaurant clients to build a glorified menu website for anymore.
But the hoopla around LLMs has only served to reinforce the disconnect between the people who express artistry in code and the people who see programming as nothing more than a "feature-implementation-pipeline" that's ripe for rote automation. I'd expect as much from folks in marketing, or CEOs, but to see so many CTOs make public claims they must privately know is simply bullshit…well, that's shaken my faith in this industry, I'll tell you that much.
I'm feeling optimistic though. The tides are turning against generative AI and when the hype cycle is finally over, we'll have a handful of narrowly-purposeful, ethically-engineered models which provide some specific yet useful productivity gains…and beyond that, it'll just be the slow-but-steady progress of the discipline of hand-writing code.
The only labor-saving advantage to these sorts of machines comes from them using other people's labor: my own work is not going to help me produce more work. And other people might know better than me about something, but not being able to trace the source of any output makes it impossible to analyze the chain of reasoning that led to it. It doesn't have use as a creative tool, as a research assistant, or as a mechanical part. It's almost a toy, except it steals everything you give to it.
The six-year-old-level English took me more effort to read than it should have. The point of language is to communicate. Write in a way that people can read.
But to the point of the post, yeah, I agree. The people using LLMs are just using it to compensate for a horrible toolkit. It doesn't help them think or break down problems that nobody has seen before. If it makes them quicker it's because they were using caveman-level tools before. I haven't typed every character in the code I produce for at least a decade at this point. Most of my time and effort is in thinking. But for braindead code monkeys, sure, anything will make you quicker.
I have wondered whether one day people might attempt to prove their "realness" by writing in an intentionally stupid way that an LLM could never write.
The argument against this is usually “oh yeah are you going to write your own compiler? Or better yet build your own CPU and motherboard?” but some people do because it’s fun and educational. Ignorance isn’t a valuable skill set.
I don't do programming as art. I just want to see the result.
That said, in my experience, trying to get ChatGPT to spit out sensible code is sometimes like arguing with a conspiracy theorist. I ask a question, receive an answer, ask a follow-up question and scrutinise one of the details, and the interviewee just falls apart and grasps for other straws.
I think the arguments in this article/blog/etc. aren't as strong as they ought to be to make the point the author tries to make.
My formal education was in music composition and film scoring specifically. I never once, even then considered myself "an artist" or thought that I was pursuing art. I did consider myself a craftsman: I was learning a style of music composition where I was expected to realize someone else's vision for their purposes. One could argue that the filmmaker was an artist and the film a work of art, but as the composer my job was to produce a work product that conformed to a specification as directed by someone else. I cared about creativity, quality, and having my product fit for purpose. I do not see the craftsman as inferior to the artist, but the goals are different and in a medium such as film you must have both artists and craftsmen (and sure sometimes a single person is both, but that's not a requirement).
When I am a programmer I see myself in exactly the same way. Professionally, I'm paid to satisfy someone else's need for automation/data wrangling. The solution I deliver is a work of craftsmanship, not of art. I will be creative in exercising my craft: often I'm just given a statement of the problem that needs solving and the solution design and development are left to me. I may find unique capabilities or processes which are better than those anyone envisioned... but at the end of the day I am creating an expression of my craft, not trying to achieve a deeper artistic goal. Trying to achieve an artistic goal with programming, at least in most professional programming, would likely lead to some sort of malpractice: the craft of the product diminished in time, cost, or features to gain an unasked for message.
Another issue I take with this article is that it confuses the tools with the result. Today, artists that paint use a large variety of brushes, spatulas, and other tools and paints which have all the benefits of being formulated with the knowledge of modern chemistry for stability, texture, and durability over time. Would the artist of old be justified in suggesting that using the new tools makes you not an artist since you don't have to control for the limitations of the past? As a composer, I used sample players, sequencers, notation programs, and DAWs, even when writing for a conventional orchestra. True, Beethoven didn't have these tools, but does this mean a modern composer that might use the same tools as me doesn't produce art... even if the final result is played by a traditional orchestra? I find LLMs to be in this category. When producing art (or craft as I discussed), the final product is ultimately what is to be judged... unless you find programming to be more about performance art, where the journey is more the product than the final result.
Anyway, as a programmer I absolutely use LLMs. Sometimes they help, sometimes not. Even at my most creative, I don't find developing a good regex or a shell script something I care to be creative in approaching. How I apply the LLM, my judgement of the quality of what it produces, or whether I think the kind of "lowest common denominator" approach an LLM will produce is appropriate or not in a given situation still puts me and my creativity plainly at the center of producing a craft-full, and perhaps artful, product.
To be fair, we're almost 2 years from the release of ChatGPT and the "demise of the programmer".
We've already done the speedrun from people styling themselves "prompt engineers" to people openly mocking people styling themselves "prompt engineers".
And now it seems we're going to be "6 months to a year away from replacing programmers" for as long as it generates more funding.
This feels exactly like the manifestos that artists on X wrote defending the idea that AI-generated art isn't art, but look where we are now: companies are able to generate not only quality art assets but photos too, and amateur consumers are creating songs, videos, and pictures that they couldn't have previously.
Unlike art, where you can sometimes tell if the end product is AI generated, with software you cannot. The end users do not care if you used AI to generate your React front-end or back-end, and your employers don't care who wrote the code as long as it's bug free and works like in the scope document.
Ironically, I don't see software developers writing manifestos or complaining that AI generated software isn't software or that they are stealing from them.
Seems to me like art is art if the person who made it feels that it's art, and that may or may not be a component of the process that went into an apparently artistic product. Artistic products seem more likely to be "art" if there's no specific value to them, and they exist to exist or to add some abstract element of decoration to the world. A former theatre friend of mine once got mad when I described art as having no intrinsic value, but that doesn't mean it has no value, and I don't think it fundamentally changes the non-art noun of what's produced; a painting/print can be just a decoration, and/or it can be art, but the fact that it's art doesn't change whether it's a painting or a decoration or the type of object. That may not be the case, of course, if the resulting object is an artistic illusion, such as a table that's not a functional table because it's made of cake, but again I don't think it needs to get that deep for the point to be true.
Likewise, software that was produced artistically may or may not be better, or more valuable, or distinguishable in any way from other software products, but if the author feels like it's art it may be art. That may be because there's no tangible reason for it to exist other than a creative endeavor, in which case maybe it's not actually software.
You presented 2 broad generalizations about "software developers" and "artists" along with your statement about this specific "manifesto", so that's what I was referring to.
Artistically produced software though could certainly be a CRUD app just as much as it could be a photo that you pulled off the shelf at Target, I'm arguing that the type of product it is doesn't necessarily have a bearing on whether it can be art or have elements of art in it.
Many people use the term in different ways to describe a kind of abstract ambiguously valuable process of creation, or the product of it. I think it's possible to interpret software as art, but it's not a necessary quality for the software to exist.
For purely artistic works of software, my opinion is that they'd pretty much serve no explicit purpose at all other than as a creative exploration of some sort, completely open to interpretation with no success or failure case. Like the webfont Candy which was made with CSS shapes, creative coding, or various kinds of digital illustrations. Exploring what you can do with ai generated imagery is surely one of them, but not necessarily so if it's just the solution to a problem.
Most things people would describe as "having an art to it" or "art" have nuanced colloquial interpretations, but it's usually just an aspect to the creation process that embodies some of these qualities. Someone could say "there's an art to sucking as bad as you do at X" and although it's meant to be figuratively derisive rather than literally artistic, it could also refer to the abstract nebulous means by which someone fails to be good. Likewise, it could mean the abstract nebulous process by which someone makes CRUD apps good/bad.