Nice tutorial. One small tip for the website maker: in the header, the home link is <a href="index.html">Legion CPU</a> instead of <a href="/index.html">Legion CPU</a>, so it gives a 404 because the relative path is resolved against the current page's URL instead of the site root.
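For anyone curious why the leading slash matters, here is a small sketch using Python's standard urllib.parse.urljoin, which follows the same resolution rules a browser uses. The page URL is a made-up example, not the tutorial's real address.

```python
from urllib.parse import urljoin

# Hypothetical URL of a tutorial page that lives in a subdirectory.
page = "https://example.com/tutorials/6502/"

# A bare "index.html" is resolved relative to the current directory.
relative_target = urljoin(page, "index.html")

# A leading slash resolves against the site root instead.
root_target = urljoin(page, "/index.html")

print(relative_target)  # https://example.com/tutorials/6502/index.html
print(root_target)      # https://example.com/index.html
```

If no index.html exists in the subdirectory, the first form is the one that produces the 404.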
"We use the digits 0 to 9 to represent 10 unique values, which not uncoincidentally matches the number of fingers on our hands."
I disagree with this statement: humans use the fingers on one hand to represent 0-5 (6 distinct values) and the fingers on two hands to represent 0-10 (11 distinct values).
My problem with these "part 1 making a game" tutorials is that they rarely make it past part 3... This article only covers 1BPP byte graphics, and assumes that all 65(c)02 machines use them, which is absolutely not the case. Graphics for 8-bit machines are usually handled by a separate processor with its own format(s) anyway: VIC for C64, ANTIC for Atari 8-bit, PPU for the NES, etc.
You could check out the Nerdy Nights tutorial for NES development which is fully fleshed out and has a tried and true record of getting people started:
https://nerdy-nights.nes.science/
Apologies to the author if this seems harsh and disheartening, but I've seen too many tutorials both never finish and give out incorrect information.
I agree, but you need to give this person the benefit of the doubt. They haven't yet "not finished" since this just came out yesterday. This "unfinished" issue is also a problem with regards to tutorials for writing emulators. It's amazing, given how popular writing NES emulators is, that there is not yet a great step-by-step written tutorial for writing one (there IS great documentation and some great video tutorials (see One Lone Coder for example)). This is why one of the big projects in my next book will be a detailed tutorial showing how a simple NES emulator is built.
Perhaps I'm projecting my own frustrations trying to learn retro dev, which is unfortunate and maybe immature on my part, but I've been burned too many times to shut up.
> Simple NES emulator
The timings get really tough between the PPU / CPU with all of the scanline tricks you can do, even with an NROM setup. Many 6502 instructions don't behave the way you'd expect around page boundaries, and there are some hardware errors. The PPU registers get all kinds of weird with their behavior too, such as $2003/$2004 (sprite addr / data) basically being broken on real hardware and Sprite DMA only falling on even cycles, etc. Each mapper is its own unique snowflake as well; the MMC3 and VRC6 scanline interrupts work completely differently in hardware. There are many more examples of strangeness.
I'd love to see someone tackle it and wish you luck.
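To make two of those quirks concrete, here is a minimal Python sketch of how an emulator might account for them. The function names are made up for illustration. The cycle counts are the commonly documented ones (an absolute,X read takes 4 cycles plus 1 on a page crossing; OAM DMA takes 513 cycles plus 1 when it starts on an odd CPU cycle), so treat them as assumptions to check against the NESdev documentation.

```python
def page_crossed(base: int, index: int) -> bool:
    """True when base + index lands in a different 256-byte page than base."""
    effective = (base + index) & 0xFFFF  # 16-bit address space wraps around
    return (base & 0xFF00) != (effective & 0xFF00)


def absolute_x_read_cycles(base: int, x: int) -> int:
    """Cycle cost of a read such as LDA abs,X: 4, plus 1 on a page crossing."""
    return 4 + (1 if page_crossed(base, x) else 0)


def oam_dma_cycles(cpu_cycle: int) -> int:
    """Cycle cost of sprite (OAM) DMA: 513, plus 1 if started on an odd cycle."""
    return 513 + (cpu_cycle & 1)


if __name__ == "__main__":
    print(absolute_x_read_cycles(0x10F0, 0x20))  # crosses into page $11
    print(oam_dma_cycles(101))                   # odd starting cycle
```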
> The timings get really tough between the PPU / CPU with all of the scanline tricks you can do, even with an NROM setup.
Right, I agree. The PPU's intricacies are what make it difficult to write a good tutorial. When I wrote my own emulator as a personal learning project in C, I wrote everything myself except the PPU's background rendering, which I ported from a popular cycle-accurate Go project. In writing this book, what I've done is ask, "How can I write the absolute minimum PPU so that the most basic NROM games will play (like Donkey Kong & Tennis)?" My rewritten-from-scratch PPU is as simple as possible, only doing any rendering once per frame (the simplified PPU only does anything 60 times per second). I have it working well on very simple commercial and public domain games in C. Now my challenge is to get it ported and performant enough in pure Python, the language of the book.
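For readers wondering what "render once per frame" might look like structurally, here is a rough sketch. It is not the book's code: StubCPU, StubPPU, and run_frame are hypothetical names, and the per-frame cycle budget assumes NTSC timing (341 PPU dots per scanline, 262 scanlines, 3 PPU dots per CPU cycle).

```python
PPU_DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262
PPU_DOTS_PER_CPU_CYCLE = 3

# Roughly 29780 CPU cycles elapse per NTSC frame (rounded down here).
CPU_CYCLES_PER_FRAME = (
    PPU_DOTS_PER_SCANLINE * SCANLINES_PER_FRAME // PPU_DOTS_PER_CPU_CYCLE
)


class StubCPU:
    """Stand-in for a real 6502 core; every instruction costs 2 cycles."""

    def step(self) -> int:
        return 2


class StubPPU:
    """Stand-in for a simplified PPU that draws a whole frame at once."""

    def __init__(self) -> None:
        self.frames = 0

    def render_frame(self) -> None:
        self.frames += 1  # a real PPU would draw background and sprites here


def run_frame(cpu: StubCPU, ppu: StubPPU) -> int:
    """Run the CPU for one frame's worth of cycles, then render once."""
    cycles = 0
    while cycles < CPU_CYCLES_PER_FRAME:
        cycles += cpu.step()
    ppu.render_frame()
    return cycles


if __name__ == "__main__":
    cpu, ppu = StubCPU(), StubPPU()
    print(run_frame(cpu, ppu), ppu.frames)
```

The sketch leaves out vblank/NMI signalling and anything a game does mid-frame, which is exactly the trade-off of this approach: scanline tricks will not work, but simple NROM titles may not need them.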
Is it possible to sync Python up enough to get it to write frames accurately without a significant amount of lag? I'd certainly be interested in how to get it to do that.
Off topic, but I wanted to tell you that I'm enjoying going through 'Classic Computer Science Problems In Python.' I think you're a very good writer and I'm looking forward to more of your work.
Do you know if any tutorial exists for SNES emulators? Haven’t been able to find one, and I’ve heard it’s a large step up in difficulty from a NES emulator.
I don't know SNES. I went from writing a really basic NES emulator to working on a really basic IBM PC emulator. If you want to know more about SNES, I would look at old posts on the subreddit emudev:
https://www.reddit.com/r/EmuDev/
Apple ][s have pretty much dumb framebuffer graphics and plenty of games were written for those. Fairly sure the ZX Spectrum didn't have much in the way of dedicated graphics hardware either.
Apple II graphics were fucking weird. It was a framebuffer, true -- but the framebuffer layout was not even remotely linear, as all of the system's video output modes were designed to piggyback on the "free" memory read operations performed by DRAM refresh. Colors were even stranger. This all added up to a system which was difficult to write high-performance graphics code for.
If you want to understand just how weird Apple II graphics were, a great introduction is Stephen Edwards's "Inside the Apple II" video: https://youtu.be/r1VlrJboDMw. It's just half an hour long but manages to explain the clever hacks that Woz employed to squeeze video and floppy-disk logic from inexpensive 1970s-era hardware. (It also explains the Apple II's switching power supply, which was novel at the time.)
Heh yeah, I spent a silly amount of time writing my own bitblts for the thing as a kid. I think the //e version of one of the books mentioned in the video (Understanding the Apple ][) is where I first learned of vertical retrace/blank.
It's not linear but it's modulo-ish and lookup-table-able. The colour is definitely mildly weird too but again, I think part of it is that it looks so unfamiliar in hindsight. I don't think these things had much of an impact on high-performance code compared to 'it's just a blob of memory and every single thing has to be done by the anemic CPU'.
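As a sketch of the "modulo-ish and lookup-table-able" point, the start address of each hi-res row can be computed from three pieces of the row number and then cached in a 192-entry table. This assumes hi-res page 1 at $2000; the names are illustrative.

```python
HGR_PAGE1 = 0x2000  # assumed base address of hi-res page 1


def hires_row_address(y: int, base: int = HGR_PAGE1) -> int:
    """Address of the first byte of hi-res row y (0..191)."""
    return (
        base
        + (y % 8) * 0x400         # line within an 8-line character cell
        + ((y // 8) % 8) * 0x80   # cell row within a 64-line screen third
        + (y // 64) * 0x28        # which third of the screen
    )


# On a real machine this table would typically be precomputed once,
# often as separate low-byte and high-byte tables for the 6502.
ROW_TABLE = [hires_row_address(y) for y in range(192)]

if __name__ == "__main__":
    for y in (0, 1, 8, 64, 191):
        print(y, hex(ROW_TABLE[y]))
```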
When you're writing in 6502 assembler, the weird layout actually turns out to be easier to program against than the Commodore approach where they arranged the lines consecutively. It turns out that doing a multiply by 40 operation is a bit tricky in that context. I am pretty sure that I could still write 6502 assembly to do 40-column graphics access on the Apple ][ if I had to. The 80-column mode stuff I'm less confident about since I don't remember if the memory banking was left-half/right-half or odd/even.
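To illustrate the two addressing schemes side by side, here is a sketch in Python (not 6502 assembly). It assumes a 40-column text screen based at $0400 in both cases, and the function names are made up. On a CPU with no multiply instruction, row * 40 is usually built from shifts as row * 32 + row * 8.

```python
SCREEN_BASE = 0x0400  # assumed text screen base for both layouts


def linear_row_address(row: int, base: int = SCREEN_BASE) -> int:
    """Consecutive 40-byte rows: base + row * 40, using shifts and an add."""
    return base + (row << 5) + (row << 3)  # row*32 + row*8 == row*40


def apple2_text_row_address(row: int, base: int = SCREEN_BASE) -> int:
    """Apple II 40-column text layout: rows interleaved in 128-byte groups."""
    return base + (row % 8) * 0x80 + (row // 8) * 0x28


if __name__ == "__main__":
    for row in (0, 1, 8, 23):
        print(row, hex(linear_row_address(row)), hex(apple2_text_row_address(row)))
```

Which of the two is easier to code against in assembly is the commenter's judgment; the sketch only shows what each calculation involves.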
Having recently coded 2 demos on the Apple II [*], I can assure you that the "modulo-ish, lookup-table-able" layout is a nightmare to work with. There are plenty of unwanted color artifacts, and the modulo and table lookups cost a hell of a lot of CPU cycles (the Apple has no dedicated graphics chip)...
Very cool. But it's also a good answer to the GP and some of the downthread colour questions - if push comes to shove, you can just ignore the colour. You didn't link it so I'll take the liberty: github repo is at
Again, I don't think that was really the limiting factor to performance - it can be addressed by pre-computation and other trickery. It's worth remembering it took until the beginning of the 90s for something like a Mario-style side-scrolling game to be possible on a PC, with a lot of clever hacking, a much more powerful CPU and slightly more programmer-accessible (but still largely dumb framebuffer) video hardware.
One can choose to store the data in weird ways that map to the screen, or use table lookups. From there, unrolled code can go damn fast too, and in all those scenarios, RAM is used to cope with goofy addressing.
Then comes 7 pixels per byte!
On the plus side, doing that, plus the high bit shifting pixels a little to provide a basic color attribute bit, meant getting a 6 color display instead of a 4 color one.
6 colors is enough to do basically anything. 4 is not quite enough.
But that also means having to either shift data prior to blitting it to the screen (slower) or preshift it (faster), and more preshifted copies of the data are needed because the color-by-bit-position pattern repeats every two bytes, not every byte as on pretty much all the other 1bpp + artifact or 2bpp systems.
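Here is a rough sketch of the 7-pixels-per-byte packing and of preshifting, assuming the usual hi-res layout where bits 0-6 hold pixels (bit 0 leftmost) and bit 7 is the color/shift bit. The helper names are made up, and this version only produces the 7 monochrome shifts; keeping colors stable would need copies across the two-byte color period as described above.

```python
def pixel_location(x: int) -> tuple[int, int]:
    """Map a screen x coordinate to (byte column, bit within the byte)."""
    return divmod(x, 7)


def pack_row(pixels: list[int], shift: int = 0, high_bit: int = 0) -> list[int]:
    """Pack 0/1 pixels into hi-res bytes, moved right by `shift` pixels."""
    padded = [0] * shift + list(pixels)
    n_bytes = (len(padded) + 6) // 7  # round up to whole 7-pixel bytes
    out = []
    for b in range(n_bytes):
        value = high_bit << 7  # bit 7 carries the color attribute
        for bit in range(7):
            i = b * 7 + bit
            if i < len(padded) and padded[i]:
                value |= 1 << bit
        out.append(value)
    return out


def preshift(pixels: list[int]) -> list[list[int]]:
    """Build one packed copy of a sprite row for each of the 7 bit offsets."""
    return [pack_row(pixels, s) for s in range(7)]


if __name__ == "__main__":
    print(pixel_location(279))
    print([[hex(b) for b in row] for row in preshift([1] * 7)])
```

At draw time, the blitter would pick the copy at index x % 7 and write it starting at byte column x // 7, trading RAM for the shifting work.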
If one has the RAM, the fact is the Apple did not have screen DMA wait states slowing the CPU down. That meant getting the best of 1 MHz.
It also made the system easy to accelerate too.
A 2 to 4 MHz Apple performs very well on software sprites.
But even the stock machine could deliver quite a bit more than one might expect, given things could fit into RAM.
4 MHz can deliver a side scroller on par, BTW. But that would still be a RAM challenge to hold the diversity of images seen in something like Keen.
True. 2600 sprites work like this too. "Specky" has a Z80 anyway. I can't imagine realistically targeting something like this or being nostalgic for that XOR'd sprite look. Most later machines had more / varied graphical capabilities; even the new Commander X16, which uses a 'c02, has its own graphics processor and several formats.
The 2600 was a whole wacky world of its own. The system didn't actually have any native support for sprites or bitmapped graphics! The graphics chip was designed to implement games similar to Pong, and only natively supported drawing a few blocks of color; games which wanted to display more complex graphics (which is to say, almost all of them) had to abuse this hardware by modifying its registers on a per-scanline (or even an intra-scanline) basis.
The Atari 2600 supported sprites: two "players", two "missiles", and one "ball", that could be moved freely over the low resolution playfield. However, their bitmaps were only one pixel high; you had to change them on each scanline in order to get taller images.
Rectangular bitmap sprites may have debuted in consumer hardware with the TI-99/4 in 1979; the term "sprite" comes from the technical documentation for that computer.
Right, all you could do with the "player" objects was point them at a different byte every scanline, write the X value into the register, and it would paint the "1's" as "pixels" in that byte on that particular scanline with that X value. It was up to the programmer to multiplex them on a per scanline basis. No framebuffer whatsoever. Fun stuff!
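A toy simulation of that per-scanline multiplexing, as a sketch only: the function name and parameters are made up, it models a single player object with no timing, and it assumes the most significant bit of the graphics byte is drawn leftmost on a 160-pixel-wide line.

```python
def scanline_pixels(
    sprite_rows: list[int], x: int, top: int, scanline: int, width: int = 160
) -> list[int]:
    """Pixels one player object contributes to a single scanline.

    sprite_rows holds one graphics byte per scanline of the sprite. The
    program is responsible for loading the right byte on every line.
    """
    row_index = scanline - top
    # Outside the sprite's vertical range the graphics byte is just 0.
    grp = sprite_rows[row_index] if 0 <= row_index < len(sprite_rows) else 0
    line = [0] * width
    for bit in range(8):
        if grp & (0x80 >> bit) and 0 <= x + bit < width:
            line[x + bit] = 1  # each 1 bit becomes a lit pixel
    return line


if __name__ == "__main__":
    smiley = [0b00111100, 0b01000010, 0b10100101, 0b10000001,
              0b10100101, 0b10011001, 0b01000010, 0b00111100]
    for sl in range(8):
        row = scanline_pixels(smiley, x=0, top=0, scanline=sl, width=8)
        print("".join("#" if p else "." for p in row))
```

Each call stands in for what the game code has to set up before the beam reaches that line; nothing is stored between lines, which is the "no framebuffer" part.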
The emulator doesn't work in Firefox because of a syntax error: await at the top level isn't supported. Remember to test your code in more than one browser.