This raises the question: what information did Amazon Q ingest to be able to write C64 BASIC, and from where? OCR'd books and magazines off Google Books? Online tutorials? Knowing that would tell us whether this workflow could be adapted to other relatively obscure platforms whose limited documentation is certainly not available online in easily parsable HTML, e.g. PDP-11 assembly, Turbo Pascal, classic Macintosh/Macintosh Toolbox, etc.
Who knows, it might be a shot in the arm for retrocomputing enthusiasts.
This one's probably pretty well covered, actually. All of Apple's Inside Macintosh documentation is available in PDF format, and there's plenty of old programming books and magazines which have been scanned and OCRed.
Inside Macintosh has very, very few code examples (the only one I remember is the example program at the start of the phonebook edition).
Without large code examples, can an LLM write even halfway decent code?
> there's plenty of old programming books and magazines which have been scanned and OCRed.
In my memory, the heyday of printing programs in magazines was before the Mac. Macintosh programs were too large to print in a book or magazine, and other methods to distribute code were available (FTP servers, CDs).
I would think a good LLM for assisting with work on the Mac Toolbox needs data from FTP sites and CDs.
I find even local LLMs like Llama can do a fair bit of retrocomputing, from BASIC to 6502 and Z80 assembly language. I doubt very much that they were specifically trained to do this; I think it just comes for free because the broad training these LLMs get from random text taken from the Internet contains this sort of stuff.
Makes sense to me. Part/most of the appeal of coding 6502 or Z80 for a retro platform is just how deterministic and predictable they are, down to the clock cycle. AI is the opposite.
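To make the "predictable down to the clock cycle" point concrete, here is a minimal Python sketch that counts the cycles of a classic 6502 delay loop (LDX #n / loop: DEX / BNE loop). The per-instruction cycle counts are the documented ones; the function name `delay_loop_cycles` is made up for illustration, and the sketch assumes the branch does not cross a page boundary (a taken branch that does costs one extra cycle).

```python
# Documented 6502 cycle counts for the three instructions in the loop.
CYCLES = {
    "LDX_IMM": 2,        # LDX #n
    "DEX": 2,            # DEX
    "BNE_TAKEN": 3,      # BNE, branch taken, same page
    "BNE_NOT_TAKEN": 2,  # BNE, branch falls through
}


def delay_loop_cycles(n: int) -> int:
    """Total cycles for: LDX #n / loop: DEX / BNE loop.

    Hypothetical helper. Valid for 1 <= n <= 255; n = 0 is excluded
    because X would wrap and the loop would run 256 times.
    """
    if not 1 <= n <= 255:
        raise ValueError("n must be in 1..255")
    taken = n - 1  # the final BNE falls through
    return (
        CYCLES["LDX_IMM"]
        + n * CYCLES["DEX"]
        + taken * CYCLES["BNE_TAKEN"]
        + CYCLES["BNE_NOT_TAKEN"]
    )


print(delay_loop_cycles(5))    # 26
print(delay_loop_cycles(255))  # 1276
```

The result is always 5n + 1 cycles, which is exactly the kind of fixed, checkable answer that generated code tends not to come with.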
I've noticed that LLMs have a hard time remembering all the constraints of 8-bit programming. Like sometimes it assumes that 6502 registers can have a value above 255, or in C it assumes that ints have 32/64 bits.
Also, if you have an array living in the zero page and you use Zeropage,X or Zeropage,Y instructions to try to access beyond address $FF, it will wrap back to $00, because a zero page instruction can't address anything outside the zero page.
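The two constraints described above (8-bit registers and zero page indexed wraparound) can be sketched in a few lines of Python. This is only an illustration of the address arithmetic, not an emulator, and the helper names are hypothetical.

```python
# Sketch of 6502 arithmetic limits. Helper names are made up for illustration.

def add8(value: int, delta: int) -> int:
    """Add to an 8-bit register; the result wraps modulo 256."""
    return (value + delta) & 0xFF


def zp_indexed_address(base: int, index: int) -> int:
    """Effective address for a Zeropage,X or Zeropage,Y operand.

    The sum is truncated to 8 bits, so it never leaves the zero page.
    """
    return (base + index) & 0xFF


def abs_indexed_address(base: int, index: int) -> int:
    """Effective address for an Absolute,X operand (16-bit, may cross pages)."""
    return (base + index) & 0xFFFF


print(hex(add8(0xFF, 1)))                       # 0x0
print(hex(zp_indexed_address(0xF0, 0x14)))      # 0x4
print(hex(abs_indexed_address(0x00F0, 0x14)))   # 0x104
```

So an array at $F0 indexed by $14 reads $04 with the zero page form, and $0104 only with the absolute form.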
$101? $104? This looks familiar. If you train an LLM on series of lottery ticket numbers, say in the range 1-50, and then ask for a set of numbers based on the training data, LLMs will sometimes give you numbers like 110 or 101.
Author of the post here. Just reading the comments, so apologies for getting some of the terminology wrong. The intention was never to mislead folk; I just wanted to share my enthusiasm for emulation and the fact that you could get working code.
Interesting article, but the title is a bit off. "Assembler" actually refers to the tool that converts assembly language into machine code, not the language itself. So "writing 6502 assembler" would technically mean writing the assembler software, not writing assembly code for the 6502 processor.
It's a small distinction, but surprising to see this mix-up, as assembly language enthusiasts tend to be sticklers for these details!
As someone who's been writing assembly language for decades, going back to Univacs, I can say that most people used "assembler" as shorthand for "assembly language" or even "assembler language". It's usually quite unambiguous.
Here are examples of such usage, from 1967 and more recent:
Seems like they always start with "the assembler language," which I take as "the language of the assembler," and then they sometimes get sloppy after that. I've never heard someone say "assembler language" (or maybe I just tuned it out.)
I don't know anyone who doesn't say "I wrote X assembler" with complete understanding by all involved, and I definitely don't know anyone so pedantic they said "acksually, it's 'I wrote X assembly code'". I guess none of the dozens of assembly code makers or whatever I've known over the last 40 years was enough of a stickler. Or cared one way or another.
I also understood the title to mean writing an assembler rather than writing assembly language code, and I've never heard anyone refer to writing assembly as writing assembler (or heard anyone who writes assembly referred to as an "assembly code maker", nor anyone who writes in any language referred to as an "<language> code maker").
I could imagine such phrasing being used by non-native English speakers, of which I have no doubt that there are a significant number.
My (unresearched) guess is that this is simply different dialects of speakers emerging with respect to informal references over the decades.
Using “assembler” instead of “assembly” was common enough back in the day that there was no confusion. There were 100x more people writing “assembler” than writing actual “assemblers” so you know, the odds were good.
It seems to be pretty common even among native English speakers to use "writing assembler" and "writing assembly" interchangeably. If one were writing the tool that assembles to machine code, you'd say "writing an assembler".
Normally I'd agree (as a native English speaker with about twenty years of writing assembly under my belt), but the title tripped me up, too. I figured they forgot an "an" or an "s" at the end of "assembler" and I was surprised to find that no 6502 assembler was produced. It could be because I've written three different assemblers over the last two years, though, so it could be I was just projecting my own interests.
I actually prefer "writing assembly", and also think "writing assembler" is a bit confusing. But it feels like I'm several decades too late to complain about it ;)
> I've never heard anyone refer to writing assembly as writing assembler
I used 'assembler' back in high school, when I was learning about the 80x86. I remember because I was 'corrected' by a fellow student who had never touched assembler, assembly language, machine code mnemonics, or whatever you want to call it.
I have no idea where I got the terminology, but I was reading a lot of books and Usenet posts on the subject at the time. I'm a native English speaker, for what it's worth.
> I used 'assembler' back in high school, when I was learning about the 80x86. I remember because I was 'corrected' by a fellow student who had never touched assembler, assembly language, machine code mnemonics, or whatever you want to call it.
Wow, you got the Hacker News experience 30 years in advance!
I agree; I read the headline as meaning a 6502 assembler was written, as opposed to 6502 assembly being written.
Writing compilers gets difficult, quickly. But writing assemblers is/was common enough for simple architectures and it's quite fun and relatively easy to prove/test for correctness. At least compared to most compilers.
One of my first gigs out of school involved writing a Z80 assembler because I "needed" nonstandard (Sharp/DMG/8080) instructions to be handled during a codebase port. It was enjoyable! I recommend everyone write an 8-bit assembler at least once!
Still, TFA is very interesting and I appreciate OP's share. :-)
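To give a sense of why an 8-bit assembler is a manageable project, here is a rough Python sketch of a toy two-pass assembler for a hand-picked handful of 6502 instructions. The opcodes are the real ones, but the accepted syntax, the function names, and the default origin of $C000 are assumptions made for this example and don't follow any particular assembler's dialect.

```python
# Toy two-pass assembler for a tiny subset of 6502 instructions.
# Illustrative only: syntax and helper names are assumptions.

OPCODES = {
    ("LDA", "imm"): 0xA9,
    ("LDX", "imm"): 0xA2,
    ("STA", "abs"): 0x8D,
    ("JMP", "abs"): 0x4C,
    ("BNE", "rel"): 0xD0,
    ("DEX", "imp"): 0xCA,
    ("NOP", "imp"): 0xEA,
    ("RTS", "imp"): 0x60,
}
SIZES = {"imp": 1, "imm": 2, "rel": 2, "abs": 3}


def parse(line):
    """Split a source line into (label, mnemonic, mode, operand_text)."""
    line = line.split(";")[0].strip()  # drop comments
    label = None
    if ":" in line:
        label, line = (part.strip() for part in line.split(":", 1))
    if not line:
        return label, None, None, None
    parts = line.split()
    mnemonic = parts[0].upper()
    operand = parts[1] if len(parts) > 1 else ""
    if not operand:
        mode = "imp"
    elif operand.startswith("#"):
        mode, operand = "imm", operand[1:]
    elif mnemonic == "BNE":
        mode = "rel"
    else:
        mode = "abs"
    return label, mnemonic, mode, operand


def value_of(text, symbols):
    """Resolve a $hex literal, a decimal literal, or a label."""
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.isdigit():
        return int(text)
    return symbols[text]


def assemble(source, origin=0xC000):
    parsed = [parse(line) for line in source.splitlines()]
    # Pass 1: assign an address to every label.
    symbols, pc = {}, origin
    for label, mnemonic, mode, _ in parsed:
        if label:
            symbols[label] = pc
        if mnemonic:
            pc += SIZES[mode]
    # Pass 2: emit opcode and operand bytes.
    out, pc = bytearray(), origin
    for _, mnemonic, mode, operand in parsed:
        if not mnemonic:
            continue
        out.append(OPCODES[(mnemonic, mode)])
        if mode == "imm":
            out.append(value_of(operand, symbols) & 0xFF)
        elif mode == "abs":
            out += value_of(operand, symbols).to_bytes(2, "little")
        elif mode == "rel":
            # Branch offsets are relative to the address after the branch.
            offset = value_of(operand, symbols) - (pc + 2)
            if not -128 <= offset <= 127:
                raise ValueError("branch out of range")
            out.append(offset & 0xFF)
        pc += SIZES[mode]
    return bytes(out)


PROGRAM = """
        LDX #$05
loop:   DEX
        BNE loop
        RTS
"""
print(assemble(PROGRAM).hex(" "))  # a2 05 ca d0 fd 60
```

The first pass only measures instruction sizes to place labels, and the second emits bytes. That split is the main design decision, and it is also what makes the output easy to test byte for byte.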
I wrote my own 6809 assembler [1] largely just because, and I ended up adding a 6809 emulator to it, so I can run tests during the assembly phase. I'm only aware of one other assembler (a 6502 one) that can do this. It's a fun project.
There was the time I wrote a “data-assembler” for the AVR8 because I was packing up rather complicated data structures to represent parts of graphics that appear on a persistence of vision display and how the parts are assembled into images. If you look at any assembler you see there are a lot of facilities for constructing data as well as for constructing code and I took one approach to that problem, outputting C code for an array that the system can read out of the (comparatively large) ROM.
In my experience, this is a common enough usage variation that I'm not sure how helpful it is to treat it as an error. In particular, "assembler language" seems to have been IBM's preferred phrasing at one point.
I had the same reaction. I first thought, well it's not that difficult to write an assembler for a simple 8-bit instruction set, but then upon reading the article it looks like it's instead writing programs in assembly language. Totally different.
Non-native English speakers might also be a contributor to this. For instance, in Swedish the words are the same, so "assembler" can refer to either the language or the tool.
As usual in natural language, it's pretty clear from context which one is meant. I'm not sure about other (larger) European languages; from a quick search, it seems to be a possible usage in German, but not the primary one.
I made a living writing assembler as early as 1986. We used assembler and assembly language interchangeably, even though it’s true that the actual executable software that parses and generates code is called an assembler.
Reminds me a lot of BASIC I wrote back in the day, particularly the code for the bouncing sprite.
Seriously though, it makes me think of how hit-or-miss Microsoft Copilot is at writing code (we have a special license to use it at work).
For certain things, such as writing short bash, CMD.EXE and PowerShell scripts, it does great. It writes great list comprehensions in Python. It can convert code defining a set of typed dicts to a set of dataclasses. It can write a SQL query using an obscure (to me) feature and then rewrite it in jOOQ.
But write a CTE in jOOQ? It doesn't understand how to break the circularity.
Configure Vite? It will insist on the same wrong answers ceaselessly. On the other hand, if you look at Stack Overflow the answer seems to be "you can't get here from there" or "there is this plugin that might help if it worked but it doesn't."
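For anyone who hasn't seen the typed-dict-to-dataclass conversion mentioned above, this is roughly the kind of mechanical rewrite meant. The `SpriteDict` and `Sprite` names and their fields are invented for the example, not taken from any real codebase.

```python
from dataclasses import dataclass
from typing import TypedDict


# Before: a typed dict (hypothetical example type).
class SpriteDict(TypedDict):
    x: int
    y: int
    name: str


# After: the equivalent dataclass with the same fields and types.
@dataclass
class Sprite:
    x: int
    y: int
    name: str


raw: SpriteDict = {"x": 10, "y": 20, "name": "ball"}
sprite = Sprite(**raw)  # keys map one-to-one onto dataclass fields
print(sprite)  # Sprite(x=10, y=20, name='ball')
```

It's a field-for-field translation with no real design decisions in it, which may be why this sort of task tends to go well.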