Object Representation

Values are represented with two types: the vm_data type and the vm_obj struct.

The vm_data type holds primitive values, constants, and references to objects. Since conservative GC seemed difficult, I decided to fit a type tag into the word itself, so the collector can tell references from immediate values. The type is determined by the lower bits. vm_data itself is simply a typedef: typedef intptr_t vm_data;

Integer    ....00  Upper 30 bits represent signed integer
Character  .00000  Upper 8 bits represent the character code
true       .00101
false      .01001
nil        .01101
undefined  .10101
Box        ....10  Reference to vm_data
vm_obj     ....11  Reference to vm_obj
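
As a rough sketch, the table above might translate into C like this. The macro and helper names (VM_TAG_MASK, vm_make_int, vm_int_value and so on) are made up for illustration and are not taken from the actual source. The character encoding is left out, and the integer helpers assume that right-shifting a negative intptr_t is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in C.

```c
#include <stdint.h>

typedef intptr_t vm_data;

/* The low two bits select the broad category (patterns from the table). */
#define VM_TAG_MASK  ((vm_data)3)
#define VM_TAG_INT   ((vm_data)0)   /* ....00 */
#define VM_TAG_BOX   ((vm_data)2)   /* ....10 */
#define VM_TAG_OBJ   ((vm_data)3)   /* ....11 */

/* Constants, written out from the bit patterns in the table. */
#define VM_TRUE      ((vm_data)0x05)  /* 00101 */
#define VM_FALSE     ((vm_data)0x09)  /* 01001 */
#define VM_NIL       ((vm_data)0x0D)  /* 01101 */
#define VM_UNDEFINED ((vm_data)0x15)  /* 10101 */

/* Hypothetical helpers: shift an integer past the two tag bits and back. */
static inline vm_data vm_make_int(intptr_t n)
{
	return (vm_data)((uintptr_t)n << 2);
}

static inline intptr_t vm_int_value(vm_data d)
{
	return d >> 2;  /* assumes an arithmetic shift for negative values */
}

/* Hypothetical predicates on the low two bits. */
static inline int vm_is_int(vm_data d) { return (d & VM_TAG_MASK) == VM_TAG_INT; }
static inline int vm_is_box(vm_data d) { return (d & VM_TAG_MASK) == VM_TAG_BOX; }
static inline int vm_is_obj(vm_data d) { return (d & VM_TAG_MASK) == VM_TAG_OBJ; }
```

With this layout an integer can be checked with a single mask and compare, and the four constants can be compared for equality directly.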

The vm_obj struct holds every other kind of object. With it I represent strings, symbols, closures, pairs, and stack objects.

struct vm_obj
{
	unsigned char tag;
	union{
		char *string;
		char *symbol;
		vm_data *closure;
		struct {
			vm_data *p;
			int size;
		} stack;
		struct {
			vm_data *car;
			vm_data *cdr;
		} pair;
	} u;
};
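
To show how a vm_obj and a vm_data fit together, here is a minimal sketch of building a pair and wrapping the pointer with the ....11 tag. The tag values in the enum and the functions vm_cons and vm_obj_ptr are hypothetical, since the real tag numbering and allocator are not shown here. It also assumes malloc returns memory aligned to at least 4 bytes, so the low two bits of the pointer are free for the tag.

```c
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t vm_data;

/* Hypothetical values for vm_obj.tag; the actual numbering is not given. */
enum { OBJ_STRING, OBJ_SYMBOL, OBJ_CLOSURE, OBJ_STACK, OBJ_PAIR };

struct vm_obj {
	unsigned char tag;
	union {
		char *string;
		char *symbol;
		vm_data *closure;
		struct { vm_data *p; int size; } stack;
		struct { vm_data *car; vm_data *cdr; } pair;
	} u;
};

/* Allocate a pair; car and cdr each point at a freshly allocated cell.
 * Returns a vm_data whose low two bits are 11 (reference to vm_obj). */
static vm_data vm_cons(vm_data car, vm_data cdr)
{
	struct vm_obj *o = malloc(sizeof *o);
	vm_data *cells = malloc(2 * sizeof *cells);
	if (o == NULL || cells == NULL)
		abort();  /* a real VM would trigger GC or report an error */
	cells[0] = car;
	cells[1] = cdr;
	o->tag = OBJ_PAIR;
	o->u.pair.car = &cells[0];
	o->u.pair.cdr = &cells[1];
	return (vm_data)((uintptr_t)o | 3);
}

/* Strip the tag bits to recover the struct pointer. */
static struct vm_obj *vm_obj_ptr(vm_data d)
{
	return (struct vm_obj *)((uintptr_t)d & ~(uintptr_t)3);
}
```

For example, vm_cons(4, 0x0D) would build the list (1), given that 4 is the integer 1 shifted past the tag bits and 0x0D is nil.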

Execution

I use the same compiler as before. For now, the two programs are connected by a pipe: compile.scm is a Scheme-to-bytecode compiler written in Scheme, and vm is the VM written in C. It’s awkward, but this is how I run it. There’s no REPL yet.

echo '(+ 1 2 3)' | ./compile.scm | ./vm
Or,
cat source.scm | ./compile.scm | ./vm

Call Frame

To handle variable arguments, I had to either push the argument count onto the stack or add another register, and I chose the latter. With the extra register (argp), variable arguments now work. The reason I push an end-of-frame tag at the bottom of each frame is that I thought I’d need to check the return address when evacuating the stack to the heap.

|  Argument 1           | < argp
|   :                   |
|  Argument n           |
|  Previous frame ptr   | < frame pointer
|  Previous arg ptr     |
|  Previous closure     |
|  Return address       |
|  TAG: end-of-frame    |
|-----------------------|

Built-in Functions

We need to decide how far to expand built-in functions inline at compile time. Currently, a built-in that takes only one argument is compiled to an instruction that operates directly on the accumulator, while a built-in that takes two or more arguments goes through the same procedure as a normal function call (push all the arguments onto the stack, create a closure object).
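
For the one-argument case, an instruction that works directly on the accumulator might look something like the sketch below. The opcode names and the exec_unary function are hypothetical, and only the constant bit patterns come from the tag table.

```c
#include <stdint.h>

typedef intptr_t vm_data;

#define VM_TRUE  ((vm_data)0x05)
#define VM_FALSE ((vm_data)0x09)
#define VM_NIL   ((vm_data)0x0D)

/* Hypothetical opcodes for two one-argument built-ins. */
enum opcode { OP_NULLP, OP_NOT };

/* Apply a one-argument built-in to the accumulator and return the new
 * accumulator value. No stack traffic and no closure are involved. */
static vm_data exec_unary(enum opcode op, vm_data acc)
{
	switch (op) {
	case OP_NULLP:
		return acc == VM_NIL ? VM_TRUE : VM_FALSE;
	case OP_NOT:
		return acc == VM_FALSE ? VM_TRUE : VM_FALSE;
	}
	return acc;  /* unknown opcode: leave the accumulator unchanged */
}
```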

Finally, here’s an example of code that works now. Next, I want to implement some cool GC.

(letrec ([reverse (lambda (in out)
                   (if (null? in) 
                    out 
                    (reverse (cdr in) (cons (car in) out))))]
         [makelist (lambda (x) 
           (if (= x 0)
            (cons 0 '())
            (cons x (makelist (- x 1)))))])
 (reverse (quote (1 #t a b c (1 2 "helloworld"))) '()))