Off-Topic Discussion > Programming

designing a bytecode interpteter

(1/5) > >>

on my various radio control projects i have always had a problem with on the fly configuration. the job is mostly about taking data from an input (transmitted data from the radio receiver, sensor data, feedback from motor controllers, imu data, etc), doing some basic transformation, and then mapping it to an output (servo or motor controller, led, telemetry output, etc).

my first attempt was pretty much a state machine, it took its configuration from an eeprom, which could be configured over a serial terminal. this worked wonderfully but at a cost, the system was huge. hardly any flash was available for program code to drive more inputs and outputs and other interfaces. it worked with what i had hooked up at the time, but trying to expand on it was proving fruitless. when i tried adding an imu i just couldn't integrate it because there was no code room left.

so time for a paradigm shift. the mcu im working on has 1k of eeprom. i cant put compiled code there, but i can use that to store code for an interpreted language. so i went about designing an instruction format. what i ended up with is a 4-byte bytecode language.  this lets me run a 256 instruction program. this is not a lot but its enough to do some basic mapping and even complex maths. i thought about dropping the instructions down to 3 bytes (using the extra 2 bits of the instruction to address registers) storing results to one of 4 registers. but then i realized you would need additional instructions to move data into and out of registers, making code size bigger, and it would only give me an extra 85 instructions over what i already had. so i decided to stick with the 4-byte layout.

| opcode | param1 | param2 | param3 |

i figure i need at least 64 instructions so opcodes can be represented with 6 bits. but for my initial run im just using a byte, wasting the 2 bits. parameters can be data or "pointers" to 16-bit memory locations. all memory is in a 16-bit integer format. the pointers need to be at least 7 bits. this addresses 64 16-bit words of ram + 64 i/o locations. like with the instructions there is an extra bit that i dont know what to do with. i will probibly reserve it for future expansion to 128 words (this would use up about an 8th of the 2k of ram on the mcu) and 128 i/o locations (i only need 56 right now, but i may have overlooked something).

the interpreter has no stack, and no cpu registers. the parameters can either contain a pointer, or constant. to make decoding a little bit easier i have created union that lets me look up each of the 3 parameters in a dozen or so formats, most of them look like this

| op | destination pointer | source pointer a |  source pointer b |

this takes both sources, does an operation, like addition, and stores the value to the destination. though i have other formats that like this:

| op | destination pointer | data msb | data lsb |

which puts a 16 bit constant into memory,it looks something like this:

--- Code: ---struct instruction{
 uint8_t opcode;
  uint8_t raw[3];
  struct{ interpPtr dptr; interpPtr sptA; interpPtr sptB; }ptr3;
  struct{ interpPtr dptr; int16_t data; }dptr_data;

--- End code ---

the instruction interpreter is just a big switch statement, with each instruction define and one line of code to handle that instruction. i have about 40 instructions so far. this is kind of a large switch so i might break it up into ranges find the range and then the instruction. i will have to experiment with this later to see what is faster. most instructions increment the fake program counter (with the exception of jumps), this allows the  next instruction to be fetched from the eeprom (another performance limiter which i might need to devise a solution).

the cpu loop is just a while statement that will run so long as the program counter is less than 256, which is the maximum number of instructions that can be stored on the eeprom. at the start of the loop the program counter is set to zero and the loop begins. some instructions can terminate the program loop to end the program.

anyway the test compile revealed that the whole interpreter comes into about 4k, which is much smaller than the 20k+ state machine that i was using before. this leaves more space for code to do other functions (like drive more i2c devices and other interfaces).

will let you know when i write some bytecode programs and see how they run. following that i might try to come up with a compiler to make the whole thing easier to work with.

playing around with the interpreter i decided to expand my pointers to have 7 byte addresses, with the 8th bit slicing it between memory and system+i/o. this now gives me 128 words to work with. i also decided to reserve the upper 64 bytes of the eeprom for nonvolatile storage. this still gives me a 240 instruction program. i had intended to just store data in constants in the instructions, but that wastes space. so what i did was map it to the top 32 bits of the i/o address space, using the same 16-bit word format as the memory area. this gives me the following memory map:

--- Code: --------0x00-----
|--reserved--| - 16 words/32 bytess
|-----i/o----| - 80 words/160 bytes
|---eeprom---| - 32 words/64 bytes
|----ram-----| - 128 words/256 bytes

--- End code ---

the reserved section represents the null pointer, and a number of system memory locations for status and control.

i decided that it might be possible that i would need more i/o than 64 words, so i added another 16 words on top of it, should keep my board happy.

the program may now read/write to the top 64 bytes of the eeprom. if the program needs to store any config data. the eeprom is kind of slow to write to, so this will be used sparingly.

extra ram is good too. i also went a head and set up defines so i can remap the memory map at will, such as if i move to a more spacious mcu.

next job is to add some fixed point instuctions. i used a lot of fixed point maths in my state machine. i kinda want to simplify things and stick to a single format, like s10.5 or s9.6. though s7.8 would probibly be the easiest to implement. fixed point is neccisary to get smooth scaling with small values. id like some math functions like trig stuff and sqrt, but those would require allocating a chunk of flash for the neccisary lookup tables. its doable but expensive (probibly adding another k or two to the size of the interpreter).


finished most of the fixed point instructions. i decided to make it a variable format system with conversion. a nibble of the cpu status flags now controls what format the fixed point instructions use. and most of the fixed point instructions will use that. there are a couple special instructions, one sets the format to any thing between 15:0 (same as int) to 0:15 (-almost 1 to almost 1, though its not possible to get to 1thanks to the absent of a ones place).

the other special instruction can convert any fixed point format to any other fixed point format. this means if you need to change formats, you can convert your numbers over. some of the instructions can operate on int and result in fixed point numbers of the current format. though it just occured to me i dont have an easy way to convert them back to int. but i got some space for a couple more instructions in there.

many of  the fixed point operations will loose data if you get too close to the edges. especially things like multiplies and divides. i cast the operations as int32 and convert it back to int16 after they are done, to prevent myself from left shifting off any bits during the operation. since most of the operations shift it back anyway this should prevent some data loss, especially on the divides. i dont count the sign bit and consider another bit no mans land to prevent overflows, but this doesnt work for multiplies.

the whole thing is adequate because most of the data is usually 10 or 12 bit anyway. servo positions are 10 bit, motor controllers are usually 8 bit speed + 2 bit direction, and input from remote joysticks is also 10 bit. the imu might give me some problems, the gyro is 16 bit,  and the magnetometer is 12 bit, and the pressure sensor is the worst at 19 bit pressure, 16 bit temperature, but i might just compute altitude in machine code and then pipe the altitude value to the i/o. the data to and from the radio will also operate in 16 bit chuncks.

thing i like about the radio is i can use the same i/o range for both transmit and receive. writes would be applied to the send buffer, and the reads will come from the receive buffer, which gives me more ios for other functions. i guess the next job will be to work on my radio protocol, i need to move a structure that is about 2x the maximum frame size, and i also need to have a code transfer mode. to write data to the eeprom. before i do that im going to actually verify that my interpreter can interpret code.

now that i got the language i need a way to convert textual codes to byte codes. sorta like an assembler. but i decided i didnt just want an asmalike, i want to include some high level stuff like operators and some basic control structures (loops, conditionals, functions), and variable handling with qualification. i decided to stick with what i know and write the compiler in lua. its string manipulation facilities will no doubt come in handy.  i tried working on the variable parsing code last night but my brain wasnt working. but i think i hammered out the jist of the way it will work.

you can use the var or romvar statement to create a named reference to a variable (in ram/rom respectively), which may or may not be initialized. this statement does a couple things, first thing it does is create a pointer in the appropriate memory range. i just have to count up addresses as vars are created and count down as they are destroyed (such as at the end of a control structure). if you initialize the variable it creates a ASN instruction with the data in place of the statement. initialized or not it can only be used in any instruction parameter that call for a pointer.

there will also be a con statement, while using the same format as var, will be required to be initialized. the usage of constants will copy into instructions by value and not by reference. so it behaves more like a #define statement as far as the compiler is concerned. both cons and vars are type qualified by the compiler somewhat, though the number of types supported by the language is going to be very small.  the variable (or constant) is stored by the compiler in a table which stores the variable's name, its pointer address, its datatype, its value, whether or not it is initialized. its also stored in the table by named keys so its easy to look up. this should make it easier to assemble instructions.

need to code this up tonight and figure out how to qualify my language's data types in lua. im probibly going to need to move up lua 5.2 to gain access to bitwise operatons, or use a library, and i will probibly also need the pack library so i can import/export binary data. lua for windows should have all this stuff. once if finish writing the variable handling code (the hard part is figuring out how to convert variable format fixed point numbers to hex in lua), its just a matter of some basic parsing and i should be able to assemble basic programs. once that works and is tested i can start adding more features to the programming language.

Phantom Hoover:
In bytecode aren't variables basically just glorified pointers anyway?

in mine, yea pretty much. it really depends on the opcode though. each opcode can take up to 3 parameters:

command..parameter1..parameter2..parameter3 (where the command and each parameter is one byte).

most of the byte codes operate on 3 pointers. in the format p1 <- p2 (some operator) p3,  though there are some that take a pointer and a constant, and others that take only constants. consider:


it doesnt make much sense to allocate memory for the 5. you would need another opcode just to load the 5 into memory. it is easier just to encode the data into a parameter (2 in case of a 16-bit constant) in the format:

command, pointer, datamsb, datalsb

this is also an inline operation since data is retrieved from a pointer, operated on, and then returned to the same pointer, so it looks like:

a = a (operator) b, where a is pointed to by p1 and b is the constant encoded into the bytecode.  this is so i can operate on constants without having an extra instruction to load them into memory. oddly enough this is also the format used by the opcode to load data into memory. so it helps keep code size smaller. machine code for actual cpus take a few instructions to load data into memory, you usually load a constant into a register and them move the contents of the register to ram, and thats usually 2 or 3 (or more) instructions depending on architecture, though the other side of it is that machine instructions are more tightly encoded. my bytecodes are twice the width of the avr instructions, so its not as compact.

most of the commands use all the parameters, but there are a couple that dont take data at all. i considered doing variable length codes, but that would make the code size for the interpreter noticeably bigger, since i would need more complex decode logic. i want the whole thing to come in under less than 6k with the lookup tables for the trig and sqrt functions. its 3k now.


[0] Message Index

[#] Next page

Go to full version