Object Representation
I use two types: the vm_data type and the vm_obj struct.
A vm_data value holds a primitive value, a constant, or a reference to an object. Conservative GC looked difficult, so I decided to keep a type tag inside the single word and tell the kinds apart by the low bits. The type itself is simply typedef intptr_t vm_data;
Integer     ....00   Upper 30 bits represent a signed integer
Character   .00000   Upper 8 bits represent the character
true        .00101
false       .01001
nil         .01101
undefined   .10101
Box         ....10   Reference to vm_data
vm_obj      ....11   Reference to vm_obj
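To make the table concrete, here is a minimal sketch of how the low-bit tags might be packed and unpacked in C. The helper and macro names (vm_make_int, vm_tag_ptr, TAG_BOX and so on) are made up for illustration and are not taken from the VM's source. The sketch assumes pointers are at least 4-byte aligned so the low two bits are free, and it leaves characters out to keep things short.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t vm_data;

/* Low two bits, as listed in the table. */
enum { TAG_MASK = 3, TAG_INT = 0, TAG_CONST = 1, TAG_BOX = 2, TAG_OBJ = 3 };

/* Immediate constants, written out from the five-bit patterns in the table. */
#define VM_TRUE      ((vm_data)0x05)  /* 00101 */
#define VM_FALSE     ((vm_data)0x09)  /* 01001 */
#define VM_NIL       ((vm_data)0x0D)  /* 01101 */
#define VM_UNDEFINED ((vm_data)0x15)  /* 10101 */

/* Return the low two tag bits of a value. */
static inline int vm_tag(vm_data d)
{
    return (int)((uintptr_t)d & TAG_MASK);
}

/* Integers live in the upper bits; multiplying by 4 leaves the tag at 00.
   The caller must keep n small enough that n * 4 does not overflow. */
static inline vm_data vm_make_int(intptr_t n)
{
    return (vm_data)(n * 4);
}

/* Exact division undoes the shift and keeps the sign. */
static inline intptr_t vm_int_value(vm_data d)
{
    assert(vm_tag(d) == TAG_INT);
    return d / 4;
}

/* Store TAG_BOX or TAG_OBJ in the low bits of an aligned pointer. */
static inline vm_data vm_tag_ptr(void *p, int tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);  /* assumed alignment */
    return (vm_data)((uintptr_t)p | (uintptr_t)tag);
}

/* Clear the tag bits to get the original pointer back. */
static inline void *vm_untag_ptr(vm_data d)
{
    return (void *)((uintptr_t)d & ~(uintptr_t)TAG_MASK);
}
```

With helpers like these, telling an integer from a reference costs one mask and one compare, and the GC can decide from the word alone whether it needs to follow it.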
The vm_obj struct holds everything else. With it I represent strings, symbols, closures, pairs, and stack objects, like this:
struct vm_obj
{
    unsigned char tag;          /* which member of the union is in use */
    union {
        char *string;
        char *symbol;
        vm_data *closure;
        struct {
            vm_data *p;         /* saved stack contents */
            int size;           /* number of saved slots */
        } stack;
        struct {
            vm_data *car;
            vm_data *cdr;
        } pair;
    } u;
};
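As an illustration of how this struct could be used, here is a small sketch that builds a pair and reads it back. The OBJ_* tag values and the vm_make_pair, vm_car, vm_cdr and vm_free_pair helpers are hypothetical, since the tag values are not listed here. Because car and cdr are pointers to vm_data, the sketch allocates two cells for them to point at; that allocation scheme is my assumption, not necessarily what the VM does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t vm_data;

struct vm_obj {
    unsigned char tag;
    union {
        char *string;
        char *symbol;
        vm_data *closure;
        struct { vm_data *p; int size; } stack;
        struct { vm_data *car; vm_data *cdr; } pair;
    } u;
};

/* Hypothetical tag values for the union members. */
enum { OBJ_STRING, OBJ_SYMBOL, OBJ_CLOSURE, OBJ_STACK, OBJ_PAIR };

/* Allocate a pair plus two cells that car and cdr point at.
   Returns NULL if either allocation fails. */
static struct vm_obj *vm_make_pair(vm_data car, vm_data cdr)
{
    struct vm_obj *o = malloc(sizeof *o);
    vm_data *cells = malloc(2 * sizeof *cells);
    if (o == NULL || cells == NULL) {
        free(o);
        free(cells);
        return NULL;
    }
    cells[0] = car;
    cells[1] = cdr;
    o->tag = OBJ_PAIR;
    o->u.pair.car = &cells[0];
    o->u.pair.cdr = &cells[1];
    return o;
}

static vm_data vm_car(const struct vm_obj *o)
{
    assert(o->tag == OBJ_PAIR);
    return *o->u.pair.car;
}

static vm_data vm_cdr(const struct vm_obj *o)
{
    assert(o->tag == OBJ_PAIR);
    return *o->u.pair.cdr;
}

/* Release a pair made by vm_make_pair (car points at the cell block). */
static void vm_free_pair(struct vm_obj *o)
{
    assert(o->tag == OBJ_PAIR);
    free(o->u.pair.car);
    free(o);
}
```

Checking the tag before touching the union is the important habit here: reading u.pair from an object that is really a string would silently misinterpret the pointer.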
Execution
The compiler is the same one as before. For now, the bytecode is passed through a pipe: compile.scm is a Scheme-to-bytecode compiler written in Scheme, and vm is the VM written in C. It's awkward, but this is how I run things. There is no REPL yet.
echo '(+ 1 2 3)' | ./compile.scm | ./vm
Or,
cat source.scm | ./compile.scm | ./vm
Call Frame
To handle variable arguments, I had to either push the argument count onto the stack or add another register, and I chose the latter. I push an end-of-frame tag at the bottom of each frame because I thought I would need to check the return address when evacuating the stack to the heap.
| Argument 1            | < argp
|     :                 |
| Argument n            |
| Previous frame ptr    | < frame pointer
| Previous arg ptr      |
| Previous closure      |
| Return address        |
| TAG: end-of-frame     |
|-----------------------|
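Here is a rough sketch of how a frame with this layout might be pushed and popped. It follows my own reading of the diagram: the stack grows toward the top of the picture, the five bookkeeping slots go on first, and the arguments are pushed last with argument 1 on top. The struct vm fields, the function names and the END_OF_FRAME value are all hypothetical, and the slots hold raw integers rather than tagged values to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t vm_data;

#define STACK_SIZE 256
#define END_OF_FRAME ((vm_data)0x7F)   /* hypothetical marker value */

struct vm {
    vm_data stack[STACK_SIZE];
    int sp;            /* index of the next free slot */
    int fp;            /* index of the "previous frame ptr" slot */
    int argp;          /* index of argument 1 */
    vm_data closure;   /* current closure */
};

static void push(struct vm *vm, vm_data v)
{
    assert(vm->sp < STACK_SIZE);
    vm->stack[vm->sp++] = v;
}

/* Push the bookkeeping slots, bottom of the diagram first. */
static void push_frame(struct vm *vm, vm_data return_address)
{
    push(vm, END_OF_FRAME);
    push(vm, return_address);
    push(vm, vm->closure);
    push(vm, (vm_data)vm->argp);
    push(vm, (vm_data)vm->fp);
}

/* Call after pushing the arguments (argument n first, argument 1 last). */
static void enter(struct vm *vm, int nargs)
{
    vm->argp = vm->sp - 1;
    vm->fp = vm->argp - nargs;
}

/* With both registers, the argument count needs no extra stack slot. */
static int arg_count(const struct vm *vm)
{
    return vm->argp - vm->fp;
}

/* Fetch argument i, counting from 1. */
static vm_data arg(const struct vm *vm, int i)
{
    assert(i >= 1 && i <= arg_count(vm));
    return vm->stack[vm->argp - (i - 1)];
}

/* Restore the caller's registers and return the saved return address. */
static vm_data pop_frame(struct vm *vm)
{
    int base = vm->fp;
    assert(vm->stack[base - 4] == END_OF_FRAME);
    vm_data ret = vm->stack[base - 3];
    vm->closure = vm->stack[base - 2];
    vm->argp = (int)vm->stack[base - 1];
    vm->fp = (int)vm->stack[base];
    vm->sp = base - 4;
    return ret;
}
```

The point of the extra register shows up in arg_count: the callee can recover how many arguments it received from argp and the frame pointer alone.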
Built-in Functions
We need to decide how far to expand built-in functions at compile time. Currently, a built-in that takes exactly one argument is compiled to an instruction that operates directly on the accumulator, while a built-in that takes two or more arguments goes through the same steps as a normal function call (push all arguments onto the stack, create a closure object).
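The one-argument case can be sketched as a tiny dispatch loop in which the built-in is just another instruction that rewrites the accumulator. The opcode names (OP_CONSTANT, OP_NULLP, OP_NOT, OP_HALT) and the struct insn layout are invented for this example and are not the VM's real instruction set. The multi-argument path is not shown.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t vm_data;

/* Constants from the tag table. */
#define VM_TRUE      ((vm_data)0x05)
#define VM_FALSE     ((vm_data)0x09)
#define VM_NIL       ((vm_data)0x0D)
#define VM_UNDEFINED ((vm_data)0x15)

/* Hypothetical opcodes. */
enum op { OP_CONSTANT, OP_NULLP, OP_NOT, OP_HALT };

struct insn {
    enum op op;
    vm_data operand;   /* only used by OP_CONSTANT */
};

/* Run until OP_HALT and return the accumulator. */
static vm_data run(const struct insn *code)
{
    vm_data acc = VM_UNDEFINED;
    for (;;) {
        switch (code->op) {
        case OP_CONSTANT:           /* load a literal into the accumulator */
            acc = code->operand;
            break;
        case OP_NULLP:              /* (null? acc), no stack traffic */
            acc = (acc == VM_NIL) ? VM_TRUE : VM_FALSE;
            break;
        case OP_NOT:                /* (not acc): only #f counts as false */
            acc = (acc == VM_FALSE) ? VM_TRUE : VM_FALSE;
            break;
        case OP_HALT:
            return acc;
        default:
            assert(0 && "unknown opcode");
            return VM_UNDEFINED;
        }
        code++;
    }
}
```

Under these assumptions, (null? '()) would compile to OP_CONSTANT followed by OP_NULLP, with no frame pushed and no closure created.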
Finally, here is an example of code that works now. Next, I want to implement some cool GC.
(letrec ([reverse (lambda (in out)
                    (if (null? in)
                        out
                        (reverse (cdr in) (cons (car in) out))))]
         [makelist (lambda (x)
                     (if (= x 0)
                         (cons 0 '())
                         (cons x (makelist (- x 1)))))])
  (reverse (quote (1 #t a b c (1 2 "helloworld"))) '()))