Disclaimer
This series is a means for me to learn Kotlin. As such, I might misuse some features of Kotlin, not follow best practices, or simply do silly things. Please always check the comment section for feedback from more knowledgeable people. I’ll also have follow-up articles where I refactor the code to be more idiomatic. So, make sure to check back.
Goals
Last time we set up our Kotlin project properly. This time we are going to write a simple disassembler for Chip8 ROMs!
What is Chip8?
Chip8 is a sort of virtual machine/interpreter for a simplistic instruction set from the mid-1970s, developed by Joseph Weisbecker. I won’t bore you with the history of Chip8; Wikipedia has a pretty good summary. There are a handful of games for the Chip8, which come in the form of ROMs containing the raw instructions and data, e.g. a simple Breakout clone called Brix.

Chip8 is conceptually composed of a few parts:
- Memory: Chip8 has 4096 bytes of memory, linearly addressable. Bytes `0x000-0x1FF` contained the original interpreter and bitmaps for rendering text. Program code is stored from `0x200-0xE9F`. Memory from `0xEA0-0xEFF` is reserved for the call stack, internal use and other variables. Memory from `0xF00-0xFFF` is reserved for the display buffer.
- Registers: Chip8 has 16 8-bit general purpose registers called `V0` to `VF`. The `VF` register is also used as a carry flag for operations such as additions or shifts. Additionally there’s a 16-bit address register called `I` and a program counter called `PC`, which are manipulated indirectly. There’s also a stack pointer called `SP`, which cannot be directly accessed.
- Stack: only used to store return addresses when subroutines are called. The stack is not stored in memory, but in the form of a 16-element array of 16-bit values. This limits Chip8 to a call stack depth of 16.
- Timers: Chip8 has two timers called delay timer and sound timer. Both get decremented every 16ms (60Hz) until they reach 0. The delay timer is used to time events in the application. The sound timer plays a sound for as long as its value is non-zero.
- Keyboard: Chip8 uses a 16-key keypad (see here). Special instructions exist to read the key states and wait for key presses.
- Display: Chip8 has a 64×32 pixel monochrome display, mapped to memory area `0xF00-0xFFF`, which I’ll refer to as the display buffer. Each pixel is represented as one bit in the display buffer. Graphics are drawn via sprites. A sprite is a group of bytes, each bit representing a pixel (0->off, 1->on). All sprites are 8 pixels in width and 1-15 pixels in height. Each row in a sprite is hence one byte wide. Special instructions exist to clear the screen and draw sprites referenced via memory addresses. Sprites are drawn in XOR mode. When an existing bit in the display buffer gets erased by drawing a sprite, the `VF` register is set to 1. This is used for collision detection in many games.
- Instructions: Chip8 has 35 instructions (opcodes), each two bytes (16-bit) in size. The most significant byte is stored first. They allow simple control flow, 8-bit arithmetic and bit manipulation, and interacting with timers, the keypad and the display.
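The XOR drawing and collision rule from the Display item can be sketched in a few lines. This is only an illustration under assumptions, not code from this series: the name `drawSprite`, the buffer layout (one bit per pixel, most significant bit is the leftmost pixel) and the choice to clip at the screen edges instead of wrapping are all made up for the example.

```kotlin
/**
 * XORs a sprite into a 1-bit-per-pixel display buffer (256 bytes for 64x32).
 * Returns 1 if any lit pixel was turned off (the value VF would receive), else 0.
 * Assumptions: x and y are non-negative, pixels beyond the screen are clipped.
 */
fun drawSprite(display: ByteArray, sprite: ByteArray, x: Int, y: Int,
               width: Int = 64, height: Int = 32): Int {
    var collision = 0
    for (row in sprite.indices) {
        val py = y + row
        if (py >= height) break
        val bits = sprite[row].toInt() and 0xff        // one sprite row = one byte
        for (col in 0 until 8) {
            val px = x + col
            if (px >= width) break
            if (((bits shr (7 - col)) and 1) == 0) continue  // sprite pixel is off
            val index = (py * width + px) / 8          // byte holding this pixel
            val mask = 0x80 shr (px % 8)               // bit inside that byte
            val old = display[index].toInt() and 0xff
            if ((old and mask) != 0) collision = 1     // a lit pixel gets erased
            display[index] = (old xor mask).toByte()
        }
    }
    return collision
}
```

Drawing the same sprite twice at the same position erases it again and reports a collision on the second call, which is the behaviour games rely on.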
Cowgod put together an excellent reference for the Chip8 architecture which I’ll use as the gold standard. You can find emulators and ROMs around the web.
Storing the Chip8 VM State
Before we dive into any kind of opcodes, we should think about how we want to keep the state of our little Chip8 VM. We need:
- Memory. We’ll initialize this with built-in font data and copy the contents of a ROM file there starting at address `0x200`.
- Registers. We have 16 general-purpose registers `V0-VF`, each 8 bits wide. In addition we have PC and I, each 16 bits wide. Only the lower 12 bits are used, as the address space is limited to `0x000-0xFFF`. We also have a stack pointer which points at the current stack location.
- Stack. We need 16 16-bit wide stack elements to store return addresses.
- Timers. We need to store the delay and sound timer values, both 8 bits wide.
- Keypad. We need to store the state of each of the 16 keys; a byte per key will do.
This can be easily translated to a `data class` in Kotlin. We won’t add any behaviour to that class, it will only store state (`VmState.kt`):

    package chip8

    data class VmState(val ram: ByteArray = ByteArray(4096),
                       val registers: ByteArray = ByteArray(16),
                       var pc: Int = 0x200,
                       var index: Int = 0,
                       var sp: Int = 0,
                       val stack: IntArray = IntArray(16),
                       var delay: Int = 0,
                       var sound: Int = 0,
                       val keys: ByteArray = ByteArray(16))
Short and sweet. Notice the use of `val` for reference types like the `ram` or `registers`, and `var` for things that we need to mutate, like `pc`. Kotlin will automatically generate properties with appropriate getters (for `var` and `val`) and setters (for `var` properties) based on the parameters of the constructor. We also use default values to initialize our properties.
We also use `Int`s for all registers and timers, except the general purpose registers and the keypad states.
Loading a ROM
For loading a ROM file, we simply read the contents of the file into our `VmState#ram`, starting at address `0x200`. We create a top-level function in our chip8 package (`main.kt`):

    package chip8

    import java.io.FileInputStream
    import java.io.BufferedInputStream
    import java.io.DataInputStream

    fun loadRom(file: String): VmState {
        return DataInputStream(BufferedInputStream(FileInputStream(file))).use {
            val state = VmState()
            val rom = it.readBytes()
            System.arraycopy(rom, 0, state.ram, state.pc, rom.size)
            state
        }
    }

    fun main(args: Array<String>) {
        val vmState = loadRom("roms/maze.rom")
        println(vmState.ram[0x200])
    }
Kotlin Note: is there a better way to do this?
This function uses the `Closeable#use` extension method defined by the Kotlin standard library. It will automatically close the `Closeable` it is used on, whether an exception is thrown in the provided code block or not. This is similar to Java’s try-with-resources. Note that the last expression in the block makes up the return value of the `use` method, which is then returned by `loadRom`. Very concise.
To make this code run, I had to copy a few ROMs to the project under the `roms/` folder. Executing this app leaves us with a `VmState` that has the ROM contents loaded at address `0x200` within the state’s `ram`. The app simply outputs the first byte of the loaded ROM, which is `96`.
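On the "is there a better way" question, one possible variation avoids the manual stream wrapping by using `File.readBytes` and `ByteArray.copyInto` from the Kotlin standard library (the latter assumes Kotlin 1.3 or newer). The helper name `loadRomInto` and the size check are illustrative additions; the function takes a plain `ByteArray` rather than a `VmState` so that the sketch stays self-contained.

```kotlin
import java.io.File

/**
 * Copies the ROM file at `path` into `ram`, starting at `start`,
 * and returns the number of bytes copied.
 * Hypothetical helper, not part of the article's code base.
 */
fun loadRomInto(ram: ByteArray, path: String, start: Int = 0x200): Int {
    val rom = File(path).readBytes()   // opens, reads and closes the file
    require(start + rom.size <= ram.size) { "ROM does not fit into memory" }
    rom.copyInto(ram, destinationOffset = start)
    return rom.size
}
```

Calling `loadRomInto(state.ram, "roms/maze.rom")` would then fill the memory of an existing state in one step.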
Decoding the program
Once our ROM is loaded, we can start dissecting it. In this article we are going to write a simple disassembler that outputs the opcodes to the console. Let’s take a look at the instruction set (taken from http://mattmik.com/chip8.html by Matthew Mikolay).
Opcode | Description |
---|---|
0NNN | Execute machine language subroutine at address NNN |
00E0 | Clear the screen |
00EE | Return from a subroutine |
1NNN | Jump to address NNN |
2NNN | Execute subroutine starting at address NNN |
3XNN | Skip the following instruction if the value of register VX equals NN |
4XNN | Skip the following instruction if the value of register VX is not equal to NN |
5XY0 | Skip the following instruction if the value of register VX is equal to the value of register VY |
6XNN | Store number NN in register VX |
7XNN | Add the value NN to register VX |
8XY0 | Store the value of register VY in register VX |
8XY1 | Set VX to VX OR VY |
8XY2 | Set VX to VX AND VY |
8XY3 | Set VX to VX XOR VY |
8XY4 | Add the value of register VY to register VX. Set VF to 01 if a carry occurs. Set VF to 00 if a carry does not occur |
8XY5 | Subtract the value of register VY from register VX. Set VF to 00 if a borrow occurs. Set VF to 01 if a borrow does not occur |
8XY6 | Store the value of register VY shifted right one bit in register VX. Set register VF to the least significant bit prior to the shift |
8XY7 | Set register VX to the value of VY minus VX. Set VF to 00 if a borrow occurs. Set VF to 01 if a borrow does not occur |
8XYE | Store the value of register VY shifted left one bit in register VX. Set register VF to the most significant bit prior to the shift |
9XY0 | Skip the following instruction if the value of register VX is not equal to the value of register VY |
ANNN | Store memory address NNN in register I |
BNNN | Jump to address NNN + V0 |
CXNN | Set VX to a random number with a mask of NN |
DXYN | Draw a sprite at position VX, VY with N bytes of sprite data starting at the address stored in I. Set VF to 01 if any set pixels are changed to unset, and 00 otherwise |
EX9E | Skip the following instruction if the key corresponding to the hex value currently stored in register VX is pressed |
EXA1 | Skip the following instruction if the key corresponding to the hex value currently stored in register VX is not pressed |
FX07 | Store the current value of the delay timer in register VX |
FX0A | Wait for a keypress and store the result in register VX |
FX15 | Set the delay timer to the value of register VX |
FX18 | Set the sound timer to the value of register VX |
FX1E | Add the value stored in register VX to register I |
FX29 | Set I to the memory address of the sprite data corresponding to the hexadecimal digit stored in register VX |
FX33 | Store the binary-coded decimal equivalent of the value stored in register VX at addresses I, I+1, and I+2 |
FX55 | Store the values of registers V0 to VX inclusive in memory starting at address I. I is set to I + X + 1 after operation |
FX65 | Fill registers V0 to VX inclusive with the values stored in memory starting at address I. I is set to I + X + 1 after operation |
Opcodes are given in hexadecimal and are 2 bytes (16-bit) wide. Here’s a legend for the non-hexadecimal characters in the opcodes:
Symbol | Meaning |
---|---|
NNN | A 12-bit address, 0x000-0xFFF |
NN | An 8-bit value, 0x00-0xFF |
X | A 4-bit register index, 0x0-0xF |
Y | A 4-bit register index, 0x0-0xF, usually the second register used by an opcode |
The most significant nibble (4 bits) of a 16-bit opcode signifies the operation (or operation group, e.g. the 8xxx instructions). That nibble is followed by one or more arguments, such as addresses to jump to or registers to be used (identified by the register's index).
A few decoding examples (the mnemonics are made up and not standard):
Opcode | Mnemonic | Description |
---|---|---|
1200 | jmp 0x200 | Jump to address 0x200 |
6523 | str v5, 0x23 | Store value 0x23 in register V5 |
9340 | jneqr v3, v4 | Skip the next instruction if V3 and V4 are not equal |
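To make the legend concrete, the fields can be pulled out of an opcode with shifts and masks. The sketch below works on the opcode as a single `Int`; the names `Fields` and `fields` are invented for this illustration and do not appear in the project.

```kotlin
/** All argument fields that can be read out of one 16-bit opcode. */
data class Fields(val op: Int, val x: Int, val y: Int, val n: Int, val nn: Int, val nnn: Int)

fun fields(opcode: Int) = Fields(
    op = (opcode shr 12) and 0xf,  // most significant nibble: operation group
    x = (opcode shr 8) and 0xf,    // register index X
    y = (opcode shr 4) and 0xf,    // register index Y
    n = opcode and 0xf,            // 4-bit value N
    nn = opcode and 0xff,          // 8-bit value NN
    nnn = opcode and 0xfff         // 12-bit address NNN
)
```

For `0x6523` this gives `op = 0x6`, `x = 0x5` and `nn = 0x23`, which corresponds to the second row of the examples above; which of the fields are meaningful depends on the operation group.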
Not all bytes in a ROM represent instructions. Bytes can also represent data in a ROM, such as sprites. We are going to ignore this possibility for now.
As we can see from the decoding examples above, we need to be able to operate on 4-bit, 8-bit and 16-bit values. Let's define some extension methods in a new file called `Extensions.kt` for `Int` and `Byte`:

    package chip8

    fun Int.toHex() = Integer.toHexString(this)
    // Mask to 8 bits so that bytes >= 0x80 are not sign-extended to "ffffffXX"
    fun Byte.toHex() = Integer.toHexString(this.toInt() and 0xff)
    fun Byte.high() = (this.toInt() and 0xf0) shr 4
    fun Byte.low() = this.toInt() and 0xf
    fun address(msb: Byte, lsb: Byte) = ((msb.toInt() and 0xf) shl 8) or (lsb.toInt() and 0xff)
The first two extension methods will be attached to `Int` and `Byte`. They let us convert those types to a hex string. The `Byte#high` and `Byte#low` methods allow us to access the upper and lower nibble of a byte. This will be useful when we want to extract register indices and other nibbles from an opcode. The final function allows us to concatenate two bytes (msb=most significant byte, lsb=least significant byte) of an opcode and get a 12-bit wide address out of it, e.g. for the `ANNN` opcode.
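One detail worth keeping in mind with these helpers is that Kotlin's `Byte` is signed, so a plain `toInt()` sign-extends every value of `0x80` and above. A small sketch of the masking pattern follows; `unsigned` and `toHex2` are hypothetical names chosen for this example.

```kotlin
/** Unsigned value (0..255) of a byte; Kotlin's Byte itself ranges from -128 to 127. */
fun Byte.unsigned(): Int = this.toInt() and 0xff

/** Two-digit lower-case hex string of a byte, e.g. "a2" or "05". */
fun Byte.toHex2(): String = unsigned().toString(16).padStart(2, '0')
```

For the byte `0xA2`, `toInt()` yields -94 and `Integer.toHexString(-94)` yields `ffffffa2`, whereas `unsigned()` yields 162 and `toHex2()` yields `a2`.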
How to structure the decoding? We shouldn't really care where the two bytes of an opcode came from, so let's create a function that can decode a single opcode given as msb and lsb:

    fun decode(msb: Byte, lsb: Byte) {
        // TBD
    }
What should this function do? Its main purpose is to "parse" the opcode and perform an action dependent on that opcode. For a disassembler, we want to print the mnemonics to the console or a string. For an interpreter we want to advance the VM state in accordance with the instruction being executed.
The function should thus merely decipher the instruction and its parameters and delegate the actual operation to something else. Let's call that something else a `Decoder`. For each of our opcodes, the `Decoder` has a corresponding method that will get invoked by the `decode` function. Let's make the `Decoder` an interface (called a `trait` in early Kotlin versions, see Traits) (`Decoder.kt`):

    package chip8

    interface Decoder {
        fun before(opCode: Int, address: Int)
        fun unknown(opCode: Int, address: Int)
        fun clear()
        fun ret()
        fun jmp(address: Int)
        fun call(address: Int)
        fun jeq(reg: Int, value: Int)
        fun jneq(reg: Int, value: Int)
        fun jeqr(reg1: Int, reg2: Int)
        fun set(reg: Int, value: Int)
        fun add(reg: Int, value: Int)
        fun setr(reg1: Int, reg2: Int)
        fun or(reg1: Int, reg2: Int)
        fun and(reg1: Int, reg2: Int)
        fun xor(reg1: Int, reg2: Int)
        fun addr(reg1: Int, reg2: Int)
        fun sub(reg1: Int, reg2: Int)
        fun shr(reg1: Int)
        fun subb(reg1: Int, reg2: Int)
        fun shl(reg1: Int)
        fun jneqr(reg1: Int, reg2: Int)
        fun seti(value: Int)
        fun jmpv0(address: Int)
        fun rand(reg: Int, value: Int)
        fun draw(reg1: Int, reg2: Int, value: Int)
        fun jkey(reg: Int)
        fun jnkey(reg: Int)
        fun getdelay(reg: Int)
        fun waitkey(reg: Int)
        fun setdelay(reg: Int)
        fun setsound(reg: Int)
        fun addi(reg: Int)
        fun spritei(reg: Int)
        fun bcd(reg: Int)
        fun push(reg: Int)
        fun pop(reg: Int)
    }
The first two methods are special: `before` is called before the opcode is decoded, passing in the opcode (as an Int, only the lower 16 bits are used) and the address of the opcode in `VmState.ram`. `unknown` is called when the decoder encounters an opcode it does not know.
The rest of the methods map to the opcodes in the table above. The `decode` function parses the opcode and passes it to the `Decoder` (`main.kt`):

    fun decode(decoder: Decoder, address: Int, msb: Byte, lsb: Byte) {
        val opCode = (msb.toInt() shl 8 or lsb.toInt().and(0xff)).and(0xffff)
        // Unsigned value of the low byte; a plain toInt() would sign-extend values >= 0x80
        val value = lsb.toInt() and 0xff
        decoder.before(opCode, address)
        when (msb.high()) {
            0x0 -> {
                when (opCode) {
                    0x00e0 -> decoder.clear()
                    0x00ee -> decoder.ret()
                    else -> decoder.unknown(opCode, address)
                }
            }
            0x1 -> decoder.jmp(address(msb, lsb))
            0x2 -> decoder.call(address(msb, lsb))
            0x3 -> decoder.jeq(msb.low(), value)
            0x4 -> decoder.jneq(msb.low(), value)
            0x5 -> decoder.jeqr(msb.low(), lsb.high())
            0x6 -> decoder.set(msb.low(), value)
            0x7 -> decoder.add(msb.low(), value)
            0x8 -> {
                val reg1 = msb.low()
                val reg2 = lsb.high()
                when (lsb.low()) {
                    0x0 -> decoder.setr(reg1, reg2)
                    0x1 -> decoder.or(reg1, reg2)
                    0x2 -> decoder.and(reg1, reg2)
                    0x3 -> decoder.xor(reg1, reg2)
                    0x4 -> decoder.addr(reg1, reg2)
                    0x5 -> decoder.sub(reg1, reg2)
                    0x6 -> decoder.shr(reg1)
                    0x7 -> decoder.subb(reg1, reg2)
                    0xe -> decoder.shl(reg1)
                    else -> decoder.unknown(opCode, address)
                }
            }
            0x9 -> {
                val reg1 = msb.low()
                val reg2 = lsb.high()
                decoder.jneqr(reg1, reg2)
            }
            0xa -> decoder.seti(address(msb, lsb))
            0xb -> decoder.jmpv0(address(msb, lsb))
            0xc -> decoder.rand(msb.low(), value)
            0xd -> decoder.draw(msb.low(), lsb.high(), lsb.low())
            0xe -> {
                // Match on the masked low byte ("or 0xff" would never match any branch)
                when (value) {
                    0x9e -> decoder.jkey(msb.low())
                    0xa1 -> decoder.jnkey(msb.low())
                    else -> decoder.unknown(opCode, address)
                }
            }
            0xf -> {
                val reg = msb.low()
                when (value) {
                    0x07 -> decoder.getdelay(reg)
                    0x0a -> decoder.waitkey(reg)
                    0x15 -> decoder.setdelay(reg)
                    0x18 -> decoder.setsound(reg)
                    0x1e -> decoder.addi(reg)
                    0x29 -> decoder.spritei(reg)
                    0x33 -> decoder.bcd(reg)
                    0x55 -> decoder.push(reg)
                    0x65 -> decoder.pop(reg)
                    else -> decoder.unknown(opCode, address)
                }
            }
            else -> decoder.unknown(opCode, address)
        }
    }
We can use Kotlin's when expression to perform a switch on the opcode. Notice the use of our extension methods on `Int` and `Byte` to easily access the information we want to pass on to the `Decoder`.
Kotlin note: can this be done better?
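One possible variation, offered only as a sketch and not as the definitive answer, is to combine both bytes into an `Int` first and switch on the masked opcode, which sidesteps the signed-byte handling inside the branches. The function `mnemonic` below is a made-up name and covers just a handful of opcodes; everything else is reported as unknown.

```kotlin
/**
 * Decodes a small subset of opcodes from a single Int and returns a mnemonic.
 * Illustration only: the real decoder would cover the full table.
 */
fun mnemonic(opcode: Int): String {
    val x = (opcode shr 8) and 0xf
    val y = (opcode shr 4) and 0xf
    val nn = opcode and 0xff
    val nnn = opcode and 0xfff
    return when (opcode and 0xf000) {
        0x0000 -> when (opcode) {
            0x00e0 -> "clear"
            0x00ee -> "ret"
            else -> "unknown"
        }
        0x1000 -> "jmp 0x${nnn.toString(16)}"
        0x6000 -> "set v${x.toString(16)}, 0x${nn.toString(16)}"
        0x9000 -> "jneqr v${x.toString(16)}, v${y.toString(16)}"
        0xe000 -> when (nn) {
            0x9e -> "jkey v${x.toString(16)}"
            0xa1 -> "jnkey v${x.toString(16)}"
            else -> "unknown"
        }
        else -> "unknown"
    }
}
```

The opcode itself would be built as `((msb.toInt() and 0xff) shl 8) or (lsb.toInt() and 0xff)` before calling the function.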
A simple Disassembler
Writing our disassembler is easy: we simply implement `Decoder` and then loop over all opcodes in the `VmState.ram`. Here's the `Disassembler` (`Disassembler.kt`):

    package chip8

    class Disassembler(): Decoder {
        val builder = StringBuilder()

        override fun before(opCode: Int, address: Int) {
            builder.append("addr: 0x${address.toHex()}, op: 0x${opCode.toHex()}, ")
        }

        override fun unknown(opCode: Int, address: Int) {
            // Use line() so that an unknown opcode also ends its output line
            builder.line("Unknown opcode addr: 0x${address.toHex()}, op: 0x${opCode.toHex()}")
        }

        override fun clear() { builder.line("clear") }
        override fun ret() { builder.line("ret") }
        override fun jmp(address: Int) { builder.line("jmp 0x${address.toHex()}") }
        override fun call(address: Int) { builder.line("call 0x${address.toHex()}") }
        override fun jeq(reg: Int, value: Int) { builder.line("jeq v${reg.toHex()}, 0x${value.toHex()}") }
        override fun jneq(reg: Int, value: Int) { builder.line("jneq v${reg.toHex()}, 0x${value.toHex()}") }
        override fun jeqr(reg1: Int, reg2: Int) { builder.line("jeqr v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun set(reg: Int, value: Int) { builder.line("set v${reg.toHex()}, 0x${value.toHex()}") }
        override fun add(reg: Int, value: Int) { builder.line("add v${reg.toHex()}, 0x${value.toHex()}") }
        override fun setr(reg1: Int, reg2: Int) { builder.line("setr v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun or(reg1: Int, reg2: Int) { builder.line("or v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun and(reg1: Int, reg2: Int) { builder.line("and v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun xor(reg1: Int, reg2: Int) { builder.line("xor v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun addr(reg1: Int, reg2: Int) { builder.line("addr v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun sub(reg1: Int, reg2: Int) { builder.line("sub v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun shr(reg1: Int) { builder.line("shr v${reg1.toHex()}") }
        override fun subb(reg1: Int, reg2: Int) { builder.line("subb v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun shl(reg1: Int) { builder.line("shl v${reg1.toHex()}") }
        override fun jneqr(reg1: Int, reg2: Int) { builder.line("jneqr v${reg1.toHex()}, v${reg2.toHex()}") }
        override fun seti(value: Int) { builder.line("seti 0x${value.toHex()}") }
        override fun jmpv0(address: Int) { builder.line("jmpv0 0x${address.toHex()}") }
        override fun rand(reg: Int, value: Int) { builder.line("rand v${reg.toHex()}, 0x${value.toHex()}") }
        override fun draw(reg1: Int, reg2: Int, value: Int) { builder.line("draw v${reg1.toHex()}, v${reg2.toHex()}, 0x${value.toHex()}") }
        override fun jkey(reg: Int) { builder.line("jkey v${reg.toHex()}") }
        override fun jnkey(reg: Int) { builder.line("jnkey v${reg.toHex()}") }
        override fun getdelay(reg: Int) { builder.line("getdelay v${reg.toHex()}") }
        override fun waitkey(reg: Int) { builder.line("waitkey v${reg.toHex()}") }
        override fun setdelay(reg: Int) { builder.line("setdelay v${reg.toHex()}") }
        override fun setsound(reg: Int) { builder.line("setsound v${reg.toHex()}") }
        override fun addi(reg: Int) { builder.line("addi v${reg.toHex()}") }
        override fun spritei(reg: Int) { builder.line("spritei v${reg.toHex()}") }
        override fun bcd(reg: Int) { builder.line("bcd v${reg.toHex()}") }
        override fun push(reg: Int) { builder.line("push v0-v${reg.toHex()}") }
        override fun pop(reg: Int) { builder.line("pop v0-v${reg.toHex()}") }

        override fun toString(): String {
            return builder.toString()
        }
    }
We append the textual translation of the opcode to a `StringBuilder`. Once we have decoded all opcodes, we can call `Disassembler#toString` to get the textual representation of the program. Notice the use of Kotlin’s string templating.
I added an extension method for `StringBuilder` to our `Extensions.kt` file to make appending new lines easier (`Extensions.kt`):

    fun StringBuilder.line(line: String) {
        this.append(line); this.append("\n")
    }
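As an aside, and assuming Kotlin 1.4 or newer, the standard library ships a `StringBuilder.appendLine` function that does the same job as the `line` helper. A minimal sketch, with `listing` as a made-up function name:

```kotlin
/** Joins mnemonics into a listing, one per line, using the stdlib's appendLine. */
fun listing(mnemonics: List<String>): String = buildString {
    for (m in mnemonics) appendLine(m)   // appends the text followed by '\n'
}
```

The hand-written extension remains a reasonable choice when targeting an older Kotlin version.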
The only thing left to do is to loop through the program’s code and let the `Decoder` do its thing. But there’s one problem: when we read the ROM into a `VmState`, we lose the information about the program’s size. If we simply iterated over all opcodes in `VmState.ram` starting at `0x200`, we’d output a lot of garbage. Let’s fix this by adding the program size to the `VmState` and setting it in `loadRom`:
(`VmState.kt`)

    package chip8

    data class VmState(val ram: ByteArray = ByteArray(4096),
                       val registers: ByteArray = ByteArray(16),
                       var pc: Int = 0x200,
                       var index: Int = 0,
                       var sp: Int = 0,
                       val stack: IntArray = IntArray(16),
                       var delay: Int = 0,
                       var sound: Int = 0,
                       val keys: ByteArray = ByteArray(16),
                       val programSize: Int)

(`main.kt`)

    fun loadRom(file: String): VmState {
        return DataInputStream(BufferedInputStream(FileInputStream(file))).use {
            val rom = it.readBytes()
            val state = VmState(programSize = rom.size)
            System.arraycopy(rom, 0, state.ram, state.pc, rom.size)
            state
        }
    }
Note the use of Kotlin’s named arguments feature in line 4 of `loadRom`.
Now we are ready to write our `disassemble` function. I decided to put it in `Disassembler.kt` as a top-level function, much like our `loadRom` function in `main.kt`. We could of course just add it to `Disassembler` as well.
(`Disassembler.kt`)

    fun disassemble(vmState: VmState): String {
        val decoder = Disassembler()
        for (addr in 0x200..(0x200 + vmState.programSize - 1) step 2) {
            val msb = vmState.ram[addr]
            val lsb = vmState.ram[addr + 1]
            decode(decoder, addr, msb, lsb)
        }
        return decoder.toString()
    }
And for good measure, let’s try our disassembler in `main` (`main.kt`):

    fun main(args: Array<String>) {
        val vmState = loadRom("roms/maze.rom")
        println(disassemble(vmState))
    }

Which gives us the following output for `maze.rom`:

    addr: 0x200, op: 0x6000, set v0, 0x0
    addr: 0x202, op: 0x6100, set v1, 0x0
    addr: 0x204, op: 0xa222, seti 0x222
    addr: 0x206, op: 0xc201, rand v2, 0x1
    addr: 0x208, op: 0x3201, jeq v2, 0x1
    addr: 0x20a, op: 0xa21e, seti 0x21e
    addr: 0x20c, op: 0xd014, draw v0, v1, 0x4
    addr: 0x20e, op: 0x7004, add v0, 0x4
    addr: 0x210, op: 0x3040, jeq v0, 0x40
    addr: 0x212, op: 0x1204, jmp 0x204
    addr: 0x214, op: 0x6000, set v0, 0x0
    addr: 0x216, op: 0x7104, add v1, 0x4
    addr: 0x218, op: 0x3120, jeq v1, 0x20
    addr: 0x21a, op: 0x1204, jmp 0x204
    addr: 0x21c, op: 0x121c, jmp 0x21c
    addr: 0x21e, op: 0x8040, setr v0, v4
    addr: 0x220, op: 0x2010, call 0x10
    addr: 0x222, op: 0x2040, call 0x40
    addr: 0x224, op: 0x8010, setr v0, v1
Thankfully we also have the source code of `maze.rom`, so we can compare the output of the disassembler to it:

            LD   V0, 0
            LD   V1, 0

    LOOP:   LD   I, LEFT    ; We draw a left line by default, as the random number
                            ; is 0 or 1. If we suppose that it will be 1, we keep
                            ; drawing the left line. If it is 0, we change register
                            ; I to draw a right line.
            RND  V2, 1      ; Load in V2 a 0...1 random number
            SE   V2, 1      ; It is 1 ? If yes, I still refers to the left line
                            ; bitmap.
            LD   I, RIGHT   ; If not, we change I to make it refer the right line
                            ; bitmap.
            DRW  V0, V1, 4  ; And we draw the bitmap at V0, V1.
            ADD  V0, 4      ; The next bitmap is 4 pixels right. So we update
                            ; V0 to do so.
            SE   V0, 64     ; If V0==64, we finished drawing a complete line, so we
                            ; skip the jump to LOOP, as we have to update V1 too.
            JP   LOOP       ; We did not draw a complete line ? So we continue !
            LD   V0, 0      ; The first bitmap of each line is located 0, V1.
            ADD  V1, 4      ; We update V1. The next line is located 4 pixels doan.
            SE   V1, 32     ; Have we drawn all the lines ? If yes, V1==32.
            JP   LOOP       ; No ? So we continue !

    FIN:    JP   FIN        ; Infinite loop...

    RIGHT:                  ; 4*4 bitmap of the left line
            DB $1.......
            DB $.1......
            DB $..1.....
            DB $...1....

    LEFT:                   ; 4*4 bitmap of the right line
                            ; And YES, it is like that...
            DB $..1.....
            DB $.1......
            DB $1.......
            DB $...1....
That seems to match up pretty nicely, except for the opcodes starting at address `0x21e`. In the original source, those bytes are actually data (sprites to be exact). Our disassembler doesn’t know that those bytes aren’t instructions, so it tries its best to decode them.
This is a problem we’ll have to resolve next time. It turns out that when writing an interpreter, we’ll never really encounter those bytes as instructions. Due to jumps in the code, the interpreter will never reach those bytes. Pretty simple fix.
However, if we wanted to write an ahead-of-time compiler that translates the Chip8 instructions to Java bytecode or machine code, this is going to be a problem. For that we’ll need to figure out the basic blocks of our program. And even then, we can’t be 100% sure, as Chip8 has non-absolute jump instructions based on register contents.
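The idea of telling code from data by following control flow can be sketched as a small reachability walk. This is a rough illustration under stated assumptions rather than what the series will end up doing: the name `reachable` is invented, `BNNN` targets are treated as unknown and not followed, and `0NNN` is treated as an ordinary instruction that falls through.

```kotlin
/**
 * Returns the addresses of all opcodes reachable from `start` by following
 * jumps, calls and conditional skips. Bytes that are never reached are
 * probably data (or only reachable through a register-based jump).
 */
fun reachable(ram: ByteArray, programSize: Int, start: Int = 0x200): Set<Int> {
    val end = start + programSize
    val seen = mutableSetOf<Int>()
    val todo = mutableListOf(start)                // work list of addresses to visit
    while (todo.isNotEmpty()) {
        val addr = todo.removeAt(todo.lastIndex)
        if (addr < start || addr + 1 >= end || !seen.add(addr)) continue
        val op = ((ram[addr].toInt() and 0xff) shl 8) or (ram[addr + 1].toInt() and 0xff)
        val nnn = op and 0xfff
        when (op and 0xf000) {
            0x1000 -> todo.add(nnn)                // jump: only the target
            0x2000 -> {                            // call: target and return point
                todo.add(nnn)
                todo.add(addr + 2)
            }
            0x3000, 0x4000, 0x5000, 0x9000, 0xe000 -> {
                todo.add(addr + 2)                 // conditional skip not taken
                todo.add(addr + 4)                 // conditional skip taken
            }
            0xb000 -> {}                           // jump via V0: target unknown
            else -> if (op != 0x00ee) todo.add(addr + 2)  // ret ends the path
        }
    }
    return seen
}
```

Run on the `maze.rom` bytes from the listing above, this marks `0x200` through `0x21c` as code and leaves the sprite bytes from `0x21e` onwards unvisited.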
Up Next
We can now load a ROM and disassemble it (naively). Next time we are going to refactor things a little to be more idiomatic.
Happy coding!
Previously
Let’s write a Chip8 emulator in Kotlin [Part 0: Motivation & Setup]
Following Along
- Install the required tools (JDK, IDEA, IDEA Kotlin plugin, Git)
- Clone the repo: `git clone https://github.com/badlogic/chip8.git`
- Checkout the tag for the article you want to work with: `git checkout part1`
- Import the project into IDEA (Open Project, select build.gradle file, select "Use customizable gradle wrapper")
You may want to turn the decode() function to an extension to Decoder. Thus you won’t need to repeat “decoder.” in front of every method call.
You can use expression bodies for functions in the disassembler, it will look like
override fun setsound(reg: Int) = builder.line("setsound v${reg.toHex()}")
Since StringBuilder.append() returns StringBuilder (and not Unit), you may need to define the following extension:
fun Any?.unit() {}
and use it as follows
override fun before(opCode: Int, address: Int) = builder.append("addr: 0x${address.toHex()}, op: 0x${opCode.toHex()}, ").unit()
This will save quite a few lines occupied by {}
lol, I chose maze rom as a test bunny too, before reading all of this. Was it the smallest?
I’d recommend extension properties instead of extension methods for toHex and high/low:
val Byte.i: Int get() = this.toInt()
val Byte.lo: Int get() = this.i and 0xF
val Byte.hi: Int get() = (this.i and 0xF0) shr 4
val Byte.hex: String get() = Integer.toHexString(this.i)
val Int.hex: String get() = Integer.toHexString(this)
I’d recommend extension properties instead of extension methods for toHex and high/low:
val Int.hex: String get() = Integer.toHexString(this)
val Byte.i: Int get() = this.toInt()
val Byte.lo: Int get() = this.i and 0xF
val Byte.hi: Int get() = (this.i and 0xF0) shr 4
val Byte.hex: String get() = this.i.hex
Wow, thanks for the suggestions! Is there a way to limit the scope of Any?.unit()? To my understanding, that would be applied to everything, and while not detrimental, i’d rather not pollute global ‘scope’.
Great suggestion, thanks!
You can put it inside the decompiler class
dafuq is dat shit? I want fancy 3D stuff 😛
hey guyz , Is android game programming would be there after 6-7 yrs ? or it would become more complicated that one person can’t do .
hi, I am Ender which is from China, thx man. I just use your game platform to develop a game, it is first time for me. you do a great job. keep going!