LOLCODE is best suited to interpretation because of several of its key features (dynamic casting, string manipulation, etc.).
Variables in LOLCODE are dynamically typed: they can change from one type to another at any point during execution. So, a few concessions have to be made in how variables are stored in the 6502's memory.
Each variable in memory is 3 bytes (24 bits). The first byte stores the variable type, and depending on this, the following two bytes are read differently.
Value | Type | Description |
---|---|---|
0x00 | Undefined | Variable has no assigned type, and cannot be used until initialized to some value. |
0x01 | Integer | The second and third bytes are read as a 16-bit integer. |
0x02 | Float | The second byte is the mantissa, and the third byte is the exponent. |
0x03 | Boolean | The next two bytes store data like an integer, but the variable is true if non-zero, and false if zero. |
0x04 | String | The next two bytes hold the address of where the String is stored in the heap (or memory). |
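The layout in the table can be sketched in Python as a pair of helpers that pack and unpack a 3-byte cell. This is only an illustration: the names `pack_var` and `unpack_var` are hypothetical, the low-byte-first order of the payload is an assumption (it matches how the 6502 stores addresses, but the notes above do not state it), integers are treated as unsigned, and the float is returned as raw (mantissa, exponent) bytes because its numeric format is not specified here.

```python
# Type tags from the table above.
T_UNDEFINED, T_INTEGER, T_FLOAT, T_BOOLEAN, T_STRING = 0x00, 0x01, 0x02, 0x03, 0x04


def pack_var(tag, payload=0):
    """Return the 3-byte cell: type tag, then a 16-bit payload.

    Assumption: the payload is stored low byte first (little-endian).
    """
    payload &= 0xFFFF  # keep 16 bits
    return bytes([tag, payload & 0xFF, payload >> 8])


def unpack_var(cell):
    """Decode a 3-byte cell into a (type name, value) pair."""
    tag, lo, hi = cell
    word = lo | (hi << 8)
    if tag == T_UNDEFINED:
        return ("undefined", None)
    if tag == T_INTEGER:
        return ("integer", word)  # treated as unsigned in this sketch
    if tag == T_FLOAT:
        # Second byte is the mantissa, third byte is the exponent.
        return ("float", (lo, hi))
    if tag == T_BOOLEAN:
        return ("boolean", word != 0)  # true if non-zero
    if tag == T_STRING:
        return ("string", word)  # address of the String record
    raise ValueError(f"unknown type tag {tag:#04x}")
```

For example, `pack_var(T_INTEGER, 2468)` yields the bytes `01 A4 09` under the little-endian assumption, since 2468 is `$09A4`.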
Strings need special handling because their length is variable. A String of N bytes is stored in memory as a record of N + 3 bytes.
The first byte is an 8-bit integer holding the size of the String. This also means that the maximum size of a single String record is 255 characters.
The next N bytes are the actual data of the String.
The last two bytes, at offsets N + 1 and N + 2, have special meaning. If both are 0x00, the String terminates here. If they are non-zero, the String continues: it has additional data in another String record elsewhere in memory, whose address is stored in these two bytes, and the program jumps to it accordingly.
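As a rough illustration of the record format, the following Python sketch writes String records into a flat `bytearray` standing in for 6502 memory and reads them back by following continuation links. The helper names `write_string` and `read_string` are hypothetical, and the low-byte-first order of the link address is an assumption.

```python
def write_string(mem, addr, data, next_addr=0x0000):
    """Write one String record at addr: length byte, data bytes, 2-byte link."""
    if len(data) > 255:
        raise ValueError("a single record holds at most 255 bytes")
    mem[addr] = len(data)
    mem[addr + 1 : addr + 1 + len(data)] = data
    end = addr + 1 + len(data)
    mem[end] = next_addr & 0xFF   # assumption: low byte of the link first
    mem[end + 1] = next_addr >> 8
    return end + 2                # first free address after the record


def read_string(mem, addr):
    """Collect the bytes of a String, following continuation links."""
    out = bytearray()
    while True:
        n = mem[addr]
        out += mem[addr + 1 : addr + 1 + n]
        link = mem[addr + 1 + n] | (mem[addr + 2 + n] << 8)
        if link == 0x0000:        # both link bytes zero: String terminates
            return bytes(out)
        addr = link               # otherwise continue in the next record
```

One consequence of this encoding is that a continuation record can never live at address `$0000`, because a zero link is reserved for termination.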
String to Integer Conversion
Suppose we have a string with the data "2468". This string is represented internally as:
Byte | $00 | $01 | $02 | $03 | $04 | $05 | $06 |
---|---|---|---|---|---|---|---|
Value | $04 | $32 | $34 | $36 | $38 | $00 | $00 |
Consider the following powers of ten in hex and binary:
Decimal | Hex | Binary |
---|---|---|
1 | $01 | 0001 |
10 | $0A | 1010 |
100 | $64 | 0110 0100 |
1000 | $03E8 | 0000 0011 1110 1000 |
10000 | $2710 | 0010 0111 0001 0000 |
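The notes do not spell out the conversion routine, so the following is only one plausible sketch: each ASCII digit is reduced to its value by subtracting `$30` and then weighted by the matching power of ten from the table. The function name `string_to_int` is hypothetical, and the sketch assumes a single, unsigned, non-continued String record.

```python
POWERS_OF_TEN = [1, 10, 100, 1000, 10000]  # the values from the table above


def string_to_int(record):
    """Convert one String record of ASCII digits to a 16-bit integer."""
    n = record[0]                      # first byte: length of the String
    digits = record[1 : 1 + n]
    if n == 0 or n > len(POWERS_OF_TEN):
        raise ValueError("expected 1 to 5 digits")
    total = 0
    for i, ch in enumerate(digits):
        if not 0x30 <= ch <= 0x39:     # ASCII '0'..'9'
            raise ValueError(f"not a digit: {ch:#04x}")
        # The leftmost digit gets the largest power of ten.
        total += (ch - 0x30) * POWERS_OF_TEN[n - 1 - i]
    if total > 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return total
```

For the record shown earlier (`$04 $32 $34 $36 $38 $00 $00`), this computes 2 × 1000 + 4 × 100 + 6 × 10 + 8 × 1 = 2468, or `$09A4`.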
Stack Machine
In short, we are simulating a stack-based machine on the 6502 to provide the necessary capability to interpret LOLCODE. Rather than running a VM on the 6502, however, the compiler converts the IR tree into 'macro instructions', each of which is composed of 6502 instructions.
Some key things we will need are:
- Stack Pointer (SP) - memory address of the top of the stack.
- Frame Pointer (FP) - memory address of the beginning of the current frame.
- Heap Pointer (HP) - memory address of the top of the heap.
All LOLCODE operations are performed on the stack: LOAD/STORE, ADD/SUB, CALL, you name it. On the 6502 side, each macro instruction wraps the required manipulation of the A, X, and Y registers and works within the limited memory addressing modes (indirect, for fun).
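To make the idea concrete, here is a toy Python model of such a stack machine operating on 3-byte cells. It is a sketch under several assumptions that the notes above do not state: the stack grows upward, SP points at the next free cell, payloads are stored low byte first, and the base addresses are arbitrary. The class and method names are hypothetical, and only integer ADD and SUB are modeled.

```python
class StackMachine:
    """Toy model: 3-byte cells on a stack inside a flat 64 KiB memory."""

    CELL = 3
    T_INTEGER = 0x01

    def __init__(self, stack_base=0x0200, heap_base=0x4000):
        self.mem = bytearray(0x10000)
        self.sp = stack_base  # SP: next free cell (assumed to grow upward)
        self.fp = stack_base  # FP: beginning of the current frame
        self.hp = heap_base   # HP: top of the heap (unused in this sketch)

    def push(self, tag, word):
        cell = bytes([tag, word & 0xFF, (word >> 8) & 0xFF])
        self.mem[self.sp : self.sp + self.CELL] = cell
        self.sp += self.CELL

    def pop(self):
        if self.sp <= self.fp:
            raise IndexError("stack underflow")
        self.sp -= self.CELL
        tag, lo, hi = self.mem[self.sp : self.sp + self.CELL]
        return tag, lo | (hi << 8)

    def _binary(self, op):
        tag_b, b = self.pop()  # right operand is on top of the stack
        tag_a, a = self.pop()
        if tag_a != self.T_INTEGER or tag_b != self.T_INTEGER:
            raise TypeError("this sketch handles integers only")
        self.push(self.T_INTEGER, op(a, b))  # push wraps to 16 bits

    def add(self):
        self._binary(lambda a, b: a + b)

    def sub(self):
        self._binary(lambda a, b: a - b)
```

A usage example: pushing the integers 2000 and 468 and then calling `add()` leaves a single integer cell holding 2468 on the stack. On the real target, each of these methods would correspond to a macro instruction expanded into 6502 code.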