Elite's crazy tokenized string routine (xania.org)
88 points by luu on Jan 27, 2015 | 17 comments



Memory used to be precious. Last year I ported some text adventure games from the TRS-80 machines of the early 1980s to Android. While working on one of the games (Bedlam), I went through a process very similar to the one in the article. I looked for strings that I knew had to be there in some form, and ended up having to use a step debugger (the one in MESS) to trace through the code and discover the print routines and then the magic behind them.

It turned out Bedlam packed strings into contiguous bytes of RAM, but using 5 bits per character. Byte 1 had all of character 1 in the lower 5 bits and the first 3 bits of character 2 in the upper 3 bits, and so on. Of course with only 32 values the character set was a bit odd. As I recall you had most letters but not Z or Q, the numbers 0, 1, and 3, a period, comma, and space. Something like that. The routine to unpack the characters was quite small; it just used shifts to build an offset into a table containing the ASCII values.
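For anyone curious what that kind of packing looks like, here is a rough Python sketch. It is not Bedlam's actual routine: the 32-entry TABLE is made up to match the character set as remembered above (the last two entries are pure filler), the names pack and unpack are hypothetical, and it assumes the "first 3 bits" of the next character are its low 3 bits.

```python
# Hypothetical 32-entry character table: space, letters without Q and Z,
# period, comma, the digits 0, 1 and 3, plus two filler symbols.
TABLE = " ABCDEFGHIJKLMNOPRSTUVWXY.,013'?"
assert len(TABLE) == 32


def pack(text):
    """Pack characters 5 bits each into contiguous bytes, low bits first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently in acc
    out = bytearray()
    for ch in text:
        acc |= TABLE.index(ch) << nbits
        nbits += 5
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial byte, zero padded
    return bytes(out)


def unpack(data, count):
    """Recover `count` characters by shifting 5-bit indexes out of the bytes."""
    acc = 0
    nbits = 0
    chars = []
    for byte in data:
        acc |= byte << nbits
        nbits += 8
        while nbits >= 5 and len(chars) < count:
            chars.append(TABLE[acc & 0x1F])
            acc >>= 5
            nbits -= 5
    return "".join(chars)
```

Twelve characters take 60 bits, so pack("HELLO WORLD.") fits in 8 bytes instead of 12, and unpack(packed, 12) gives the text back. The character count is passed in separately here; the real game presumably had its own way of marking the end of a string.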

The games also ran in a somewhat ingenious "virtual machine", with all the game logic being expressed through sets of very tightly encoded high-level rules rather than implemented directly in assembly. In the Android port, I literally load the original ROM into an array and then just process these same rules verbatim using a Java implementation of the VM. Kind of amazing to me how portable yet efficient the design is.


Using a VM was common for text adventures of the time; check out the Z-Machine used by Infocom games: http://inform-fiction.org/zmachine/standards/

Like this game you describe here, the Z-Machine also stored characters in less than one byte, packing three characters (plus an extra control bit) into two bytes (http://inform-fiction.org/zmachine/standards/z1point1/sect03...)
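A minimal Python sketch of that word layout, as I understand it from the standard: each big-endian 16-bit word holds an end-of-string flag in the top bit followed by three 5-bit Z-characters. The function names are hypothetical, and the text helpers only handle the lowercase alphabet (Z-characters 6 to 31) and space (Z-character 0), ignoring shifts and abbreviations entirely.

```python
PAD = 5  # Z-character 5 is a shift code, conventionally used as padding


def encode_zwords(zchars):
    """Pack 5-bit Z-characters three to a 16-bit big-endian word."""
    padded = list(zchars)
    while len(padded) % 3:
        padded.append(PAD)
    out = bytearray()
    for i in range(0, len(padded), 3):
        a, b, c = padded[i:i + 3]
        word = (a << 10) | (b << 5) | c
        if i + 3 >= len(padded):
            word |= 0x8000  # top bit marks the last word of the string
        out += word.to_bytes(2, "big")
    return bytes(out)


def decode_zwords(data):
    """Unpack words into Z-characters, stopping after the flagged word."""
    zchars = []
    for i in range(0, len(data), 2):
        word = int.from_bytes(data[i:i + 2], "big")
        zchars += [(word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F]
        if word & 0x8000:
            break
    return zchars


def text_to_zchars(text):
    """Simplified mapping: space -> 0, 'a'..'z' -> 6..31."""
    return [0 if ch == " " else ord(ch) - ord("a") + 6 for ch in text]


def zchars_to_text(zchars):
    """Inverse of text_to_zchars; padding characters are dropped."""
    return "".join(
        " " if z == 0 else chr(z - 6 + ord("a")) for z in zchars if z != PAD
    )
```

With this, "hello" becomes four bytes (hex 3551c685): two words, the second padded with one Z-character 5 and carrying the end flag.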


Directly related to the planet name generation routine, if you look closely - the same tokens are used there (although that's not all of it). Part of this legacy carries over to the sequels, even the recently-released Elite: Dangerous, although the "old worlds" are patched in by hand along with a bunch of discovered stars.

A small side-effect, however: Bell & Braben had to try several galaxy seeds in Elite before they happened upon one that didn't generate the planet Arse!

Lots of things of the era did things like this, of course. 8-bit BASICs would often tokenise on input to reduce memory consumption and the amount of lexing needed during the interpreting.
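As a rough illustration of tokenising on input, here is a toy Python sketch. The keyword list and token byte values are invented for the example and do not match any particular BASIC; real interpreters also had more careful rules about where keywords may start.

```python
# Hypothetical keyword table; token values are made up (any byte >= 0x80).
KEYWORDS = {"PRINT": 0x80, "GOTO": 0x81, "IF": 0x82, "THEN": 0x83}
# Try longer keywords first so a short one never shadows a longer one.
ORDERED = sorted(KEYWORDS.items(), key=lambda kv: -len(kv[0]))


def tokenise(line):
    """Replace keywords outside string literals with one-byte tokens."""
    out = bytearray()
    i = 0
    in_string = False
    while i < len(line):
        ch = line[i]
        if ch == '"':
            in_string = not in_string
        if not in_string:
            for word, token in ORDERED:
                if line.startswith(word, i):
                    out.append(token)
                    i += len(word)
                    break
            else:
                out.append(ord(ch))  # ordinary character, stored as ASCII
                i += 1
        else:
            out.append(ord(ch))      # inside quotes: never tokenised
            i += 1
    return bytes(out)
```

The 22-character line IF X THEN PRINT "GOTO" shrinks to 14 bytes, and the GOTO inside the quotes is left alone. The interpreter then only has to dispatch on a single byte instead of re-lexing the keyword every time the line runs.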

Even as late as the PlayStation, similar things were still very common practice. The English translations of Final Fantasy 7 through 9, or Chrono Cross, are excellent examples. If I recall correctly (it's been a few years!), they don't use ASCII but map to offsets in the tilesets, use control characters to handle colours and the like, and sometimes have digraphs. In the case of Chrono Cross, a lack of disc space (caused by English text being bigger than Japanese text of an equivalent meaning) led the localisation team to get highly creative: they made an accent engine for the 44 or so different characters so that quite a lot of the lines could be reused and changed on-the-fly (as the 'developer ending' documents).


This link was worth it just to find out jsbeeb exists.

It always surprises me when you see such extreme space-saving efforts, especially on machines where you would expect the processing overhead to be high. I've noticed that as the tech has improved this concept has remained important, because I/O bandwidth is so often the real limiting factor. Things like piping input/output through gzip can actually speed batch processes up dramatically.
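A small Python sketch of why that can pay off, using the standard gzip module. The sample log line and the helper name compressed_fraction are made up; real data will compress less well than this deliberately repetitive input, and whether it ends up faster depends on how the CPU cost compares with the I/O saved.

```python
import gzip


def compressed_fraction(data):
    """Return compressed size divided by original size for `data`."""
    packed = gzip.compress(data)
    # Round-trip check: decompressing must give back the original bytes.
    assert gzip.decompress(packed) == data
    return len(packed) / len(data)


# Hypothetical, highly repetitive batch input (about 480 KB of log lines).
sample = b"2015-01-27 12:00:00 INFO request served in 12ms\n" * 10000
fraction = compressed_fraction(sample)
```

If only a small fraction of the bytes has to cross a slow disk or network link, the time spent compressing and decompressing can be more than paid back.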


Yeah, even in-memory processes can be sped up by compression/decompression. That is, when you're not running entirely in cache, which many old games can do.

Doom 1+2 (including all the data) can be entirely loaded into modern high end CPU cache.


Weird seeing this here, have been playing a lot of Elite: Dangerous lately, if you are a fan of the original Elite it is a true successor: https://www.elitedangerous.com/

Here is a video with interviews with the original developers, who discuss the extreme need for byte savings: https://www.youtube.com/watch?v=Rapa3VfUWfs

Of course you guys know David Braben is one of the principal founders of the Raspberry Pi Foundation.


It isn't that crazy when you realize that, on those old systems, every byte counted. If he could save 50-100 bytes with that weird method, it was worth it.


The entire universe was procedurally generated because you couldn't hold all the data in the memory of an early microcomputer.

I have a write up on how it worked here: http://blog.rabidgremlin.com/2015/01/14/procedural-content-g...


I wonder if, as well as saving space and obfuscating the code for hackers, the tokens helped the random name generator produce more pronounceable results, perhaps more so than simply alternating consonants and vowels?


About a third of a percent of the memory available in a Model B.


My first computer was a BBC B Microcomputer when I was about 7 or 8 years old. I played Elite with my brother for about one and a half years. Great memories.

Any other Acorners remember Repton? Also a cool game. You had to wait 10 minutes for the game to load off a cassette though!



That's awesome. Thanks for sharing!


Ex-Acorner here. Repton 2 was my favourite game at the time. Elite actually took a back seat to that, to inline assembly programming in BASIC, and to crudely wiring Lego cranes and trains to it. A friend and I were the two-man demo team on school opening evenings.

The beeb was and still is the peak of computing for me. I capitalised on lots of them being chucked out from my school in the early to mid 1990s. Must have had about 30 of them and piles of disk drives and hundreds of disks but alas they're all gone via ebay now.

Occasionally I fire up BeebEm but that's it now :(


Still have my 31 year old beeb working. That said, the PSU has had a lot of work on it and there's a ton of heatsinks added, but it's stable for 1-2 hours before crashing.

Can't get disks or cassettes to load (still got all the Acornsoft and Superior stuff), but I plan on linking it up to something like a Pi and adding a custom file system to be able to load stuff up again. Interested to hear if anyone has already managed this.


Check out GoSDC: http://web.inter.nl.net/users/J.Kortink/home/hardware/gosdc/...

Also you can now get ethernet! (BBC Master 128 only) http://www.sprow.co.uk/bbc/masternet.htm


The C64 did not use ASCII. It has upper and lower exchanged.

My operating system uses LZW compression.

http://www.templeos.org/Wb/Kernel/Compress.html



