Let’s write a Chip8 emulator in Kotlin [Part 1: A simple Disassembler]

Disclaimer

This series is a means for me to learn Kotlin. As such I might misuse some features of Kotlin, not follow best practices or simply do silly things. Please always check the comment section for feedback from more knowledgeable people. I’ll also have follow-up articles where I refactor the code to be more idiomatic. So, make sure to check back.

Goals

Last time we set up our Kotlin project properly. This time we are going to write a simple disassembler for Chip8 ROMs!

What is Chip8?

Chip8 is a sort of virtual machine/interpreter for a simplistic instruction set from the mid-1970s, developed by Joseph Weisbecker. I won’t bore you with the history of Chip8; Wikipedia has a pretty good summary. There are a handful of games for the Chip8 which come in the form of ROMs containing the raw instructions and data. One example is Brix, a simple Breakout clone.

Chip8 is conceptually composed of a few parts:

  • Memory: Chip8 has 4096 bytes of memory, linearly addressable. Bytes 0x000-0x1FF contained the original interpreter and bitmaps for rendering text. Program code is stored from 0x200-0xE9F. Memory from 0xEA0-0xEFF is reserved for the call stack, internal use and other variables. Memory from 0xF00-0xFFF is reserved for the display buffer.
  • Registers: Chip8 has 16 8-bit general purpose registers called V0 to VF. The VF register is also used as a carry flag for operations such as additions or shifts. Additionally there’s a 16-bit address register called I and a program counter called PC which are manipulated indirectly. There’s also a stack pointer called SP which cannot be directly accessed.
  • Stack: only used to store return addresses when subroutines are called. The stack is not stored in memory, but in the form of a 16-element array of 16-bit values. This limits Chip8 to a call stack depth of 16.
  • Timers: Chip8 has two timers called delay timer and sound timer. Both get decremented at 60Hz (roughly every 16.7ms) until they reach 0. The delay timer is used to time events in the application. A sound is played for as long as the sound timer is non-zero.
  • Keyboard: Chip8 uses a 16-key hexadecimal keypad (keys 0-F). Special instructions exist to read the key states and wait for key presses.
  • Display: Chip8 has a 64×32 pixel monochrome display, mapped to memory area 0xF00-0xFFF which I’ll refer to as display buffer. Each pixel is represented as one bit in the display buffer. Graphics are drawn via sprites. A sprite is a group of bytes, each bit representing a pixel (0->off, 1->on). All sprites are 8 pixels wide and 1-15 pixels high. Each row in a sprite is hence one byte wide. Special instructions exist to clear the screen and draw sprites referenced via memory addresses. Sprites are drawn in XOR mode. When an existing bit in the display buffer gets erased by drawing a sprite, VF is set to 1. This is used for collision detection in many games.
  • Instructions: Chip8 has 35 instructions (opcodes), each two bytes (16-bit) in size. The most significant byte is stored first. They allow simple control flow, 8-bit arithmetic and bit manipulation and interacting with timers, the keypad and the display.
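To make the XOR drawing rule from the Display bullet a bit more concrete, here's a minimal sketch. It is not code from this series: it assumes a simplified display stored as one Boolean per pixel instead of the packed 0xF00-0xFFF buffer, the names drawSprite, WIDTH and HEIGHT are made up, and wrapping around the screen edges is an assumption as well.

```kotlin
const val WIDTH = 64
const val HEIGHT = 32

// Draws sprite rows (one byte per row, most significant bit = leftmost pixel)
// in XOR mode. Returns 1 if any lit pixel got erased, otherwise 0; this is
// the value that would end up in VF.
// Assumption: pixels that fall off the screen wrap around.
fun drawSprite(display: BooleanArray, x: Int, y: Int, sprite: ByteArray): Int {
    var collision = 0
    for (row in sprite.indices) {
        val bits = sprite[row].toInt() and 0xFF
        for (col in 0 until 8) {
            // Skip sprite bits that are off; XOR with 0 changes nothing.
            if (((bits shr (7 - col)) and 1) == 0) continue
            val index = ((y + row) % HEIGHT) * WIDTH + (x + col) % WIDTH
            if (display[index]) collision = 1
            display[index] = !display[index]
        }
    }
    return collision
}

fun main() {
    val display = BooleanArray(WIDTH * HEIGHT)
    val sprite = byteArrayOf(0xF0.toByte(), 0x90.toByte())
    println(drawSprite(display, 0, 0, sprite)) // 0: nothing was lit before
    println(drawSprite(display, 0, 0, sprite)) // 1: the same pixels get erased
}
```

Drawing the same sprite twice at the same position restores the screen, which is how many games move objects around.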

Cowgod put together an excellent reference for the Chip8 architecture which I’ll use as the gold standard. You can find emulators and ROMs around the web.

Storing the Chip8 VM State

Before we dive into any kind of opcodes, we should think about how we want to keep the state of our little Chip8 VM. We need:

  • Memory. We’ll initialize this with built-in font data and copy the contents of a ROM file there starting at address 0x200.
  • Registers. We have 16 general-purpose registers V0-VF, each 8 bits wide. In addition we have PC and I, each 16 bits wide. Only the lower 12 bits are used, as the address space is limited to 0x000-0xFFF. We also have a stack pointer which points at the current stack location.
  • Stack. We need 16 stack elements, each 16 bits wide, to store return addresses.
  • Timers. We need to store the delay and sound timer values, both 8 bits wide.
  • Keypad. We need to store the state of each of the 16 keys; one byte per key will do.

This can be easily translated to a data class in Kotlin. We won’t add any behaviour to that class, it will only store state (VmState.kt):
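A minimal sketch of what such a state class could look like. The property names (index, delay, sound, keys and so on) and the default value of pc are assumptions on my part, and the font data initialization is left out to keep it short.

```kotlin
// State-only container for the Chip8 VM. All property names are assumptions.
data class VmState(
    val ram: ByteArray = ByteArray(4096),      // 4096 bytes of linear memory
    val registers: ByteArray = ByteArray(16),  // V0..VF, 8 bits each
    var pc: Int = 0x200,                       // program counter (assumed to start at 0x200)
    var index: Int = 0,                        // address register I
    var sp: Int = 0,                           // stack pointer
    val stack: IntArray = IntArray(16),        // 16 return addresses
    var delay: Int = 0,                        // delay timer
    var sound: Int = 0,                        // sound timer
    val keys: ByteArray = ByteArray(16)        // one byte per key
)

fun main() {
    val state = VmState()
    state.pc += 2                    // var properties can be reassigned
    state.registers[0] = 42          // val arrays can still be mutated in place
    println(state.pc.toString(16))   // 202
    println(state.ram.size)          // 4096
}
```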

Short and sweet. Notice the use of val for reference types like the ram or registers, and var for things that we need to mutate, like pc. Kotlin will automatically generate properties with appropriate getters (for var and val) and setters (for var properties) based on the parameters of the constructor. We also use default values to initialize our properties.

We also use Ints rather than narrower types for all registers and timers, except the general purpose registers and the keypad states.

Loading a ROM

For loading a ROM file, we simply read the contents of the file into our VmState#ram, starting at address 0x200. We create a top-level function in our chip8 package (main.kt):
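Here's a minimal sketch of such a function. VmState is cut down to just the ram so the example stands on its own, and the temporary two-byte "ROM" written in main is purely illustrative.

```kotlin
import java.io.BufferedInputStream
import java.io.File
import java.io.FileInputStream

// Reduced stand-in for the state class, just enough for this sketch.
class VmState(val ram: ByteArray = ByteArray(4096))

// Reads the ROM file into ram starting at 0x200. The stream is closed by
// `use` whether or not read throws; the block's last expression is returned.
fun loadRom(file: String): VmState {
    return BufferedInputStream(FileInputStream(file)).use { input ->
        val state = VmState()
        input.read(state.ram, 0x200, state.ram.size - 0x200)
        state
    }
}

fun main() {
    // A temporary two-byte "ROM" so the sketch runs without external files.
    val rom = File.createTempFile("demo", ".rom")
    rom.writeBytes(byteArrayOf(0x60, 0x00))
    val state = loadRom(rom.path)
    println(state.ram[0x200]) // 96
    rom.delete()
}
```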

Kotlin Note: is there a better way to do this?

This function uses the Closeable#use extension method defined by the Kotlin standard library. It will automatically close the Closeable it is used on, whether an exception is thrown in the provided code block or not. This is kinda similar to Java’s try-with-resources. Note that the last expression in the block makes up the return value of the use method, which is then returned by loadRom. Very concise.

To make this code run, I had to copy a few ROMs to the project under the roms/ folder. Executing this app leaves us with a VmState that has the ROM contents loaded at address 0x200 within the state’s ram. The app simply outputs the first byte of the loaded ROM, which is 96.

Decoding the program

Once our ROM is loaded, we can start dissecting it. In this article we are going to write a simple disassembler that outputs the opcodes to the console. Let’s take a look at the instruction set (taken from http://mattmik.com/chip8.html by Matthew Mikolay).

Opcode  Description
0NNN    Execute machine language subroutine at address NNN
00E0    Clear the screen
00EE    Return from a subroutine
1NNN    Jump to address NNN
2NNN    Execute subroutine starting at address NNN
3XNN    Skip the following instruction if the value of register VX equals NN
4XNN    Skip the following instruction if the value of register VX is not equal to NN
5XY0    Skip the following instruction if the value of register VX is equal to the value of register VY
6XNN    Store number NN in register VX
7XNN    Add the value NN to register VX
8XY0    Store the value of register VY in register VX
8XY1    Set VX to VX OR VY
8XY2    Set VX to VX AND VY
8XY3    Set VX to VX XOR VY
8XY4    Add the value of register VY to register VX
        Set VF to 01 if a carry occurs
        Set VF to 00 if a carry does not occur
8XY5    Subtract the value of register VY from register VX
        Set VF to 00 if a borrow occurs
        Set VF to 01 if a borrow does not occur
8XY6    Store the value of register VY shifted right one bit in register VX
        Set register VF to the least significant bit prior to the shift
8XY7    Set register VX to the value of VY minus VX
        Set VF to 00 if a borrow occurs
        Set VF to 01 if a borrow does not occur
8XYE    Store the value of register VY shifted left one bit in register VX
        Set register VF to the most significant bit prior to the shift
9XY0    Skip the following instruction if the value of register VX is not equal to the value of register VY
ANNN    Store memory address NNN in register I
BNNN    Jump to address NNN + V0
CXNN    Set VX to a random number with a mask of NN
DXYN    Draw a sprite at position VX, VY with N bytes of sprite data starting at the address stored in I
        Set VF to 01 if any set pixels are changed to unset, and 00 otherwise
EX9E    Skip the following instruction if the key corresponding to the hex value currently stored in register VX is pressed
EXA1    Skip the following instruction if the key corresponding to the hex value currently stored in register VX is not pressed
FX07    Store the current value of the delay timer in register VX
FX0A    Wait for a keypress and store the result in register VX
FX15    Set the delay timer to the value of register VX
FX18    Set the sound timer to the value of register VX
FX1E    Add the value stored in register VX to register I
FX29    Set I to the memory address of the sprite data corresponding to the hexadecimal digit stored in register VX
FX33    Store the binary-coded decimal equivalent of the value stored in register VX at addresses I, I+1, and I+2
FX55    Store the values of registers V0 to VX inclusive in memory starting at address I. I is set to I + X + 1 after operation
FX65    Fill registers V0 to VX inclusive with the values stored in memory starting at address I. I is set to I + X + 1 after operation

Opcodes are given in hexadecimal and are 2 bytes (16-bit) wide. Here’s a legend for the non-hexadecimal characters in the opcodes:

NNN  A 12-bit address, 0x000-0xFFF
NN   An 8-bit value, 0x00-0xFF
X    A 4-bit register index, 0x0-0xF
Y    A 4-bit register index, 0x0-0xF, usually the second register used by an opcode

The most significant nibble (4 bits) of a 16-bit opcode signifies the operation (or operation group, e.g. the 8xxx instructions). That nibble is followed by one or more arguments, such as addresses to jump to or registers to be used (identified by the register’s index).

A few decoding examples (the mnemonics are made up and not standard):

1200  jmp 0x200     Jump to address 0x200
6523  str v5, 0x23  Store value 0x23 in register V5
9340  jneqr v3, v4  Skip the next instruction if V3 and V4 are not equal

Not all bytes in a ROM represent instructions. Bytes can also represent data in a ROM, such as sprites. We are going to ignore this possibility for now.

As we can see from the decoding examples above, we need to be able to operate on 4-bit, 8-bit and 16-bit values. Let's define some extension methods in a new file called Extensions.kt for Int and Byte:
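A minimal sketch of what these helpers could look like. Only the names high and low appear in the text; hex and address are names I'm assuming here, and the bodies are one plausible way to write them rather than the definitive versions.

```kotlin
// Hex string of an Int, e.g. 0x200.hex() == "200"
fun Int.hex(): String = Integer.toHexString(this)

// Hex string of a Byte. Bytes are signed in Kotlin, so mask to 0..255 first.
fun Byte.hex(): String = Integer.toHexString(this.toInt() and 0xFF)

// Upper nibble (bits 4-7) of a byte
fun Byte.high(): Int = (this.toInt() and 0xF0) shr 4

// Lower nibble (bits 0-3) of a byte
fun Byte.low(): Int = this.toInt() and 0x0F

// Combines the lower nibble of msb with all of lsb into a 12-bit address
fun address(msb: Byte, lsb: Byte): Int =
    ((msb.toInt() and 0x0F) shl 8) or (lsb.toInt() and 0xFF)

fun main() {
    // The two bytes of an ANNN opcode: A21E
    val msb = 0xA2.toByte()
    val lsb = 0x1E.toByte()
    println(msb.high().hex())        // a
    println(msb.low().hex())         // 2
    println(address(msb, lsb).hex()) // 21e
}
```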

The first two extension methods will be attached to Int and Byte. They let us convert those types to a hex string. The Byte#high and Byte#low methods allow us to access the upper and lower nibble of a byte. This will be useful when we want to extract register indices and other nibbles from an opcode. The final function allows us to concatenate two bytes (msb=most significant byte, lsb=least significant byte) of an opcode and get a 12-bit wide address out of it, e.g. for the ANNN opcode.

How to structure the decoding? We shouldn't really care where the two bytes of an opcode came from, so let's create a function that can decode a single opcode given as msb and lsb:

What should this function do? Its main purpose is to "parse" the opcode and perform an action dependent on that opcode. For a disassembler, we want to print the mnemonics to the console or a string. For an interpreter we want to advance the VM state in accordance with the instruction being executed.

The function should thus merely decipher the instruction and its parameters and delegate the actual operation to something else. Let's call that something else a Decoder. For each of our opcodes, the Decoder has a corresponding method that will get invoked by the decode function. Let's make the Decoder a trait (see Traits) (Decoder.kt):

The first two methods are special: before is called before the opcode is decoded, passing in the opcode (as an Int, only the lower 16-bit are used) and the address of the opcode in VmState.ram. unknown is called when the decoder encounters an opcode it does not know.

The rest of the methods map to the opcodes in the table above. The decode function parses the opcode and passes it to the Decoder (main.kt):
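To illustrate how the two pieces fit together, here's a trimmed-down sketch that only covers a handful of opcodes. Apart from before and unknown, the method names are assumptions, the Recorder class exists only for demonstration, and the Decoder is written as an interface, which is what current Kotlin versions call a trait.

```kotlin
// Trimmed-down stand-in for the Decoder; only a few opcodes are covered.
interface Decoder {
    fun before(opCode: Int, address: Int)
    fun unknown(opCode: Int, address: Int)
    fun clear()                      // 00E0
    fun ret()                        // 00EE
    fun jmp(address: Int)            // 1NNN
    fun set(reg: Int, value: Int)    // 6XNN
    fun jneqr(reg1: Int, reg2: Int)  // 9XY0
}

fun decode(decoder: Decoder, address: Int, msb: Byte, lsb: Byte) {
    val hi = msb.toInt() and 0xFF  // bytes are signed in Kotlin, so mask
    val lo = lsb.toInt() and 0xFF
    val opCode = (hi shl 8) or lo
    val nnn = opCode and 0x0FFF    // 12-bit address argument
    val x = hi and 0x0F            // first register index
    val y = lo shr 4               // second register index
    decoder.before(opCode, address)
    when (hi shr 4) {              // the top nibble selects the operation
        0x0 -> when (opCode) {
            0x00E0 -> decoder.clear()
            0x00EE -> decoder.ret()
            else -> decoder.unknown(opCode, address)
        }
        0x1 -> decoder.jmp(nnn)
        0x6 -> decoder.set(x, lo)
        0x9 -> decoder.jneqr(x, y)
        else -> decoder.unknown(opCode, address)
    }
}

// Tiny Decoder that records what it was asked to do, for demonstration only.
class Recorder : Decoder {
    val calls = mutableListOf<String>()
    override fun before(opCode: Int, address: Int) {}
    override fun unknown(opCode: Int, address: Int) { calls += "unknown" }
    override fun clear() { calls += "clear" }
    override fun ret() { calls += "ret" }
    override fun jmp(address: Int) { calls += "jmp $address" }
    override fun set(reg: Int, value: Int) { calls += "set $reg $value" }
    override fun jneqr(reg1: Int, reg2: Int) { calls += "jneqr $reg1 $reg2" }
}

fun main() {
    val recorder = Recorder()
    decode(recorder, 0x200, 0x12, 0x00)          // 1200
    decode(recorder, 0x202, 0x65, 0x23)          // 6523
    decode(recorder, 0x204, 0x93.toByte(), 0x40) // 9340
    println(recorder.calls) // [jmp 512, set 5 35, jneqr 3 4]
}
```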

We can use Kotlin's when expression to perform a switch on the opcode. Notice the use of our extension methods on Int and Byte to easily access the information we want to pass on to the Decoder.

Kotlin note: can this be done better?

A simple Disassembler

Writing our disassembler is easy: we simply implement the Decoder trait and then loop over all opcodes in the VmState.ram. Here's the Disassembler (Disassembler.kt):
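A minimal sketch of such a class. To keep it self-contained it repeats a cut-down Decoder with only four methods, and the method names as well as the output format are assumptions.

```kotlin
// Cut-down Decoder so this sketch compiles on its own (names are assumptions).
interface Decoder {
    fun before(opCode: Int, address: Int)
    fun unknown(opCode: Int, address: Int)
    fun jmp(address: Int)
    fun set(reg: Int, value: Int)
}

// Collects a textual form of every decoded opcode in a StringBuilder.
class Disassembler : Decoder {
    private val builder = StringBuilder()

    private fun hex(value: Int): String = Integer.toHexString(value)

    // Prefix each line with the address and the raw opcode.
    override fun before(opCode: Int, address: Int) {
        builder.append("0x${hex(address)}: 0x${hex(opCode)} ")
    }

    override fun unknown(opCode: Int, address: Int) {
        builder.append("unknown\n")
    }

    override fun jmp(address: Int) {
        builder.append("jmp 0x${hex(address)}\n")
    }

    override fun set(reg: Int, value: Int) {
        builder.append("set v${hex(reg)}, 0x${hex(value)}\n")
    }

    override fun toString(): String = builder.toString()
}

fun main() {
    val disassembler = Disassembler()
    disassembler.before(0x1200, 0x200)
    disassembler.jmp(0x200)
    disassembler.before(0x6523, 0x202)
    disassembler.set(5, 0x23)
    print(disassembler)
}
```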

We append the textual translation of the opcode to a StringBuilder. Once we have decoded all opcodes, we can call Disassembler#toString to get the textual representation of the program. Notice the use of Kotlin’s string templating.

I added an extension method for StringBuilder to our Extensions.kt file to make appending new lines easier (Extensions.kt):
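Such an extension could be as small as the following sketch. The name line is an assumption.

```kotlin
// Appends the text followed by a newline and returns the builder for chaining.
fun StringBuilder.line(text: String): StringBuilder = append(text).append('\n')

fun main() {
    val address = 0x200
    val out = StringBuilder()
    out.line("jmp 0x${Integer.toHexString(address)}")
    out.line("ret")
    print(out) // prints "jmp 0x200" and "ret" on separate lines
}
```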

The only thing left to do is to loop through the program’s code and let the Decoder do its thing. But there’s one problem: when we read the ROM into a VmState, we lose the information about the program’s size. If we simply iterated over all opcodes in VmState.ram starting at 0x200, we’d output a lot of garbage. Let’s fix this by adding the program size to the VmState and setting it in loadRom:

(VmState.kt)

(main.kt)

Note the use of Kotlin’s named arguments feature in line 4 of loadRom.

Now we are ready to write our disassemble function. I decided to put it in Disassembler.kt as a top-level function, much like our loadRom function in main.kt. We could of course just add it to Disassembler as well.

(Disassembler.kt)
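A minimal sketch of the loop. To keep it self-contained, the decode function is replaced by a lambda parameter, VmState is reduced to ram plus the program size, and the names programSize and decodeOne are assumptions.

```kotlin
// Reduced stand-in for VmState with the added program size (name assumed).
class VmState(val ram: ByteArray = ByteArray(4096), var programSize: Int = 0)

// Walks the loaded program two bytes at a time, starting at 0x200, and hands
// each opcode's address and bytes to decodeOne (a stand-in for decode).
// A trailing odd byte, if any, is ignored.
fun disassemble(state: VmState, decodeOne: (address: Int, msb: Byte, lsb: Byte) -> Unit) {
    var address = 0x200
    val end = 0x200 + state.programSize
    while (address + 1 < end) {
        decodeOne(address, state.ram[address], state.ram[address + 1])
        address += 2
    }
}

fun main() {
    val state = VmState(programSize = 4) // named argument
    byteArrayOf(0x12, 0x00, 0x65, 0x23).copyInto(state.ram, 0x200)
    disassemble(state) { address, msb, lsb ->
        println("%03x: %02x%02x".format(address, msb, lsb))
    }
    // prints:
    // 200: 1200
    // 202: 6523
}
```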

And for good measure, let’s try our disassembler in main (main.kt):

Which gives us the following output for maze.rom:

Thankfully we also have the source code of maze.rom, so we can compare the output of the disassembler to it:

That seems to match up pretty nicely, except for the opcodes starting at address 0x21e. In the original source, those bytes are actually data (sprites to be exact). Our disassembler doesn’t know that those bytes aren’t instructions, so it tries its best to decode them.

This is a problem we’ll have to resolve next time. It turns out that when writing an interpreter, we’ll never really encounter those bytes as instructions. Due to jumps in the code, the interpreter will never reach those bytes. Pretty simple fix.

However, if we wanted to write an ahead-of-time compiler that translates the Chip8 instructions to Java bytecode or machine code, this is going to be a problem. For that we’ll need to figure out the basic blocks of our program. And even then, we can’t be 100% sure, as Chip8 has non-absolute jump instructions based on register contents.

Up Next

We can now load a ROM and disassemble it (naively). Next time we are going to refactor things a little to be more idiomatic.

Happy coding!

Previously

Let’s write a Chip8 emulator in Kotlin [Part 0: Motivation & Setup]

Following Along

  1. Install the required tools (JDK, IDEA, IDEA Kotlin plugin, Git)
  2. Clone the repo: git clone https://github.com/badlogic/chip8.git
  3. Checkout the tag for the article you want to work with: git checkout part1
  4. Import the project into IDEA (Open Project, select build.gradle file, select "Use customizable gradle wrapper")

Let’s write a Chip8 emulator in Kotlin [Part 0: Motivation & Setup]

Disclaimer

This series is a means for me to learn Kotlin. As such I might misuse some features of Kotlin, not follow best practices or simply do silly things. Please always check the comment section for feedback from more knowledgeable people. I’ll also have follow-up articles where I refactor the code to be more idiomatic. So, make sure to check back.

Why?

I recently started looking into alternative JVM languages. After a short interaction with Scala, I looked into JetBrains’ Kotlin. When learning a new language/platform, I usually write a simple application that allows me to iteratively explore new concepts and features. For my adventure in Kotlin I chose to write a Chip8 emulator.

My goals with this little series are as follows:

  • Learn a bit of Kotlin
  • Document my journey
  • Get feedback from Kotlin users
  • Show how to write a simple emulator

My non-goals are:

  • Teach anyone Kotlin; use the official docs if you want to learn it all
  • Write the most efficient and precise Chip8 emulator

Let’s get going.

Setting up a Kotlin project

As stated earlier, I’m not going to try and teach anyone Kotlin given my knowledge level. Instead I’ll comment on the tooling surrounding Kotlin. What I present may not be best practices but it’s what has worked for me.

The project I’m going to set up has to fulfil these requirements:

  1. Build management and packaging
  2. Dependency management
  3. IDE integration
  4. Version control

Requirements 1 and 2 can be achieved by choosing one or combining multiple of the following tools: Ant, Ivy, Maven, Gradle, SBT and so on. I’m going to use Gradle as there exists a first-class Gradle plugin for Kotlin. It’s also quite a bit less verbose than Maven and hence suitable for a textual series like this.

Being a brain-child of JetBrains, Kotlin has first-class support in IntelliJ IDEA, so that’s what I’m going to use. Quite a change for an Eclipse guy like me.

On the version control front I’m going to go with Git, hosting a repository on GitHub.

Given these choices, I need to install the following:

  • Latest JDK (make sure the bin/ folder is in your PATH)
  • A Git client (TortoiseGit on Windows, on Linux use your package manager, on Mac OS X use homebrew to get the latest and greatest)
  • IntelliJ IDEA 13 (community edition will do)
  • Gradle (make sure the bin/ folder is in your PATH)

Project Structure, Build and Dependency Management

Let’s start by setting up a few folders and files for our project. We’ll add a build.gradle file describing our Gradle build first:

The buildscript block is required to pull in the Kotlin Gradle plugin (line 6). Lines 10 and 11 specify that this project uses both the Kotlin and Java plugin, allowing us to mix and match the two languages if required. Next we define the repositories from where dependencies are fetched. For now, the only required dependency is the Kotlin standard library (line 18). Finally I added two tasks (run, dist) to run the project and package it into a JAR for distribution. Note line 21 where I define the main class. That class name is used by the run and dist tasks to execute our app and package it as a runnable JAR respectively.

The wrapper task pulls the Gradle wrapper files into our project structure. This will allow other people to work with the project without installing Gradle themselves.

The last missing bit is a source folder and main entry point. Per convention, the Kotlin source files are placed under src/main/kotlin, Java source files are put into src/main/java. Tests go into src/test/kotlin and src/test/java respectively. I put a very simple Hello-World style Kotlin app in src/main/kotlin/chip8/main.kt:
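A Hello-World style main.kt can be as small as the sketch below. The greeting text and the greeting helper are made up, and in the project the file would additionally start with a package chip8 declaration.

```kotlin
// A top-level function; Kotlin does not require an enclosing class.
fun greeting(): String = "Hello, Chip8!"

// Entry point of the app, also a top-level function.
fun main(args: Array<String>) {
    println(greeting())
}
```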

Where is the Chip8Package class we defined as the main class in the build.gradle file? Turns out that Kotlin will put all top-level functions of a package into a synthetic class called <Packagename>Package. We therefore have to specify that synthetic class for the JVM to run our main function. (Newer Kotlin versions instead generate one class per file, e.g. MainKt for main.kt.)

Before moving on, we’ll invoke the wrapper task to pull the Gradle wrapper into our project:

The wrapper is composed of two script files (gradlew for *nix and gradlew.bat for Windows) and a folder called gradle/ which contains a tiny JAR and a properties file. From now on, we’ll invoke the script files (gradlew, gradlew.bat) in our project directory.

Running the app on the command line is as simple as calling the run task:

Note that we now use the gradlew script instead of our local Gradle installation. The first time the wrapper is invoked, it downloads the Gradle version we specified in the wrapper task and installs it. This may take some time but won’t be repeated on subsequent invocations.

We can also package our app as a runnable JAR via the dist task and run that JAR:

The project layout now looks like this:

IDE Integration

Before we can import our project into IntelliJ IDEA, we need to install the IDEA Kotlin plugin. Fire up IDEA, on the welcome screen go to Configure->Plugins, click on Install JetBrains plugin, search for Kotlin and install the plugin.

We can now import the project by clicking on Open Project on the welcome screen, navigating to the project folder and selecting the build.gradle file. IDEA will ask us some specifics about our Gradle project. Make sure to select Use customizable gradle wrapper, so IDEA uses our wrapper properly.

Once loaded, we can run our main function. The simplest way to do this is to open up the main.kt file in IDEA, right click and select Run 'chip8.Chip8Package':

This will also create a configuration for us which we can use to launch and debug the app later on. Speaking of debugging, it works as you expect. Simply create a breakpoint in a source file in the line number margin on the left and start your app in debug mode:

Version Control

Time to push our masterpiece to GitHub. I created a GitHub repository called chip8 which you can use to follow along. Since we already have a folder, we can initialize the repository with a few magic incantations on the command line:

We should also create a .gitignore file so we don’t commit the build directory, IDEA project files and so forth:

Let’s add our initial set of files, commit and push them to the remote repository:

Let’s also tag this state:

And we are done!

Up Next

You can now simply clone the repository and follow along with the rest of the series, provided you installed all the tools we talked about earlier.

In the next article we’ll write a simple Chip8 ROM disassembler.

Following Along

  1. Install the required tools (JDK, IDEA, IDEA Kotlin plugin, Git)
  2. Clone the repo: git clone https://github.com/badlogic/chip8.git
  3. Checkout the tag for the article you want to work with: git checkout part0
  4. Import the project into IDEA (Open Project, select build.gradle file, select "Use customizable gradle wrapper")

Happy coding!

libGDX wins Duke’s Choice Award

Damn right. The Duke’s Choice Award is an annual thing given out by a jury of judges as well as the community to projects from the Java realm based on various criteria. It’s one of the biggest honors you can get in the Java world. Here’s the Java Magazine article on the winners, including my ugly face, Robotality’s Halfway and Interrupt’s Delver (which you should both buy and play the hell out of):

The story behind that photo is actually quite funny. Oracle sent a photographer from the Netherlands to my place in Graz, via car. He took about 500 images. I felt very special that day…

Thanks to all the contributors, this award goes out to all of you (you can put it on your resume :D).