1 unstable release
0.1.0 | Dec 18, 2019 |
---|
lasm
A tiny and portable assembly language for complex compilers
Installation
cargo install -f lasm
Documentation
Docs can be found here.
lasm, a minimal and portable assembly language
The spirit of this crate is to make the smallest and most correct assembly language possible. A reduced instruction set is valued above all else. Where possible, speed is also a goal.
purpose
Writing a compiler is very hard. Much of that difficulty comes from managing memory and from representing high level concepts in terms of low level instructions.
So, with these problems in mind, I wrote this assembly language.
features
The highest level feature is the unlimited number of registers. This lets a compiler declare and use variables far more easily. The last time I wrote a compiler, the hardest part by far was managing when variables were allocated and freed, so I wrote this assembly language to take care of that!
procedures
Another high level feature is the handling of procedure declarations. When the assembly is parsed, every procedure is defined before any of them is checked for semantic errors, so procedures can be defined in any order.
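
The two-pass idea can be sketched in a few lines of Python. This is only an illustration, not the crate's actual parser: the `(name, body)` representation and the `check_calls` helper are hypothetical, and the only semantic check shown is that every `call` targets a procedure that exists somewhere in the file.

```python
def check_calls(procs):
    """Return a list of error strings for calls to undefined procedures.

    `procs` is a list of (name, body) pairs, where body is a list of
    instruction strings. This representation is hypothetical.
    """
    # Pass 1: record every procedure name before inspecting any body.
    declared = {name for name, _ in procs}

    # Pass 2: check each `call` against the full set of names.
    errors = []
    for name, body in procs:
        for instr in body:
            parts = instr.split()
            if len(parts) == 2 and parts[0] == "call" and parts[1] not in declared:
                errors.append(f"{name}: call to undefined procedure '{parts[1]}'")
    return errors


# `start` calls `print_num` before `print_num` is defined; this is accepted.
program = [
    ("start", ["push 1", "call print_num"]),
    ("print_num", ["outn", "call nl"]),
    ("nl", ["push 10", "outc"]),
]
print(check_calls(program))
```

Because the names are collected first, the order of the entries in `program` does not affect the result.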
portability
The final and best feature is portability. lasm is extremely compact: the entire C implementation of lasm's instruction set is nearly 150 lines. Because writing an implementation of lasm is so simple, a compiler that emits lasm can target several different programming languages and platforms.
basic instructions
Stack Instruction | Description |
---|---|
push LITERAL | Push the LITERAL argument onto the stack. The LITERAL argument MUST be a character or float |
pop | Pop a value off of the stack and into the ACC register |
ld REGISTER | Push the value stored in REGISTER onto the stack. The REGISTER being loaded MUST be defined before being loaded |
st REGISTER | Pop a value off of the stack into REGISTER. The REGISTER being stored to MUST be declared before being stored |
dup | Duplicate the top item on the stack |
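
As a rough illustration of the stack instructions, here is a minimal Python model. It is a sketch under several assumptions: the `StackMachine` class is hypothetical, registers are modelled as single cells in a dictionary rather than on a memory tape, and a character literal is assumed to be stored as its character code.

```python
class StackMachine:
    """Toy model of lasm's stack instructions (single-cell registers only)."""

    def __init__(self):
        self.stack = []
        self.acc = 0.0        # the ACC register
        self.registers = {}   # name -> value, filled in by define()

    def define(self, name):
        self.registers[name] = 0.0

    def push(self, literal):
        # Assumption: a character literal is stored as its character code.
        value = ord(literal) if isinstance(literal, str) else literal
        self.stack.append(float(value))

    def pop(self):
        self.acc = self.stack.pop()

    def ld(self, name):
        # Raises KeyError if the register was never defined.
        self.stack.append(self.registers[name])

    def st(self, name):
        if name not in self.registers:
            raise KeyError(name)
        self.registers[name] = self.stack.pop()

    def dup(self):
        self.stack.append(self.stack[-1])


machine = StackMachine()
machine.define("a")
machine.push(3)
machine.dup()
machine.st("a")   # a = 3, one copy of 3 stays on the stack
machine.ld("a")
machine.pop()     # ACC = 3
```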
Pointer Instruction | Description |
---|---|
refer REGISTER | Push a pointer to REGISTER onto the stack |
deref_ld | Pop a pointer off of the stack, and push the value stored at where the pointer points. This will only push a single cell onto the stack, not more than one cell |
deref_st | Pop a pointer and a cell off of the stack, and store the cell at the pointer |
alloc REGISTER | Pop a SIZE value off of the stack, and store the address to SIZE free cells in REGISTER |
free REGISTER | Pop a SIZE value off of the stack, and free the memory stored at the pointer stored in REGISTER |
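
The pointer instructions are easiest to picture on a flat tape of cells. The Python sketch below is illustrative only: the helper functions, the `addresses` table, and the choice of cell 2 for register `x` are all hypothetical, and it assumes that `deref_st` finds the pointer on top of the stack with the cell to store beneath it.

```python
def refer(stack, addresses, name):
    # Push the address of the named register as a cell value.
    stack.append(float(addresses[name]))


def deref_ld(stack, tape):
    # Pop a pointer, push the single cell it points to.
    pointer = int(stack.pop())
    stack.append(tape[pointer])


def deref_st(stack, tape):
    # Assumption: the pointer is on top, the cell to store is below it.
    pointer = int(stack.pop())
    value = stack.pop()
    tape[pointer] = value


tape = [0.0] * 8
addresses = {"x": 2}           # hypothetical: register x lives at cell 2
stack = []

stack.append(42.0)             # push 42
refer(stack, addresses, "x")   # refer x
deref_st(stack, tape)          # writes 42 into cell 2

refer(stack, addresses, "x")   # refer x
deref_ld(stack, tape)          # reads cell 2 back onto the stack
```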
Math Instruction | Description |
---|---|
add | Pop two cells off of the stack, and push their sum |
sub | Pop two cells off of the stack, and push the first minus the second |
mul | Pop two cells off of the stack, and push their product |
div | Pop two cells off of the stack, and push the first divided by the second |
cmp | Pop two cells off of the stack, and push -1 if the first is less than the second, 0 if they are equal, and 1 otherwise |
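
Operand order matters for sub, div and cmp. The Python sketch below reads "the first" as the first cell popped, that is, the top of the stack. This reading is an inference from the fibonacci example, where `push 1 ld n sub` is commented as subtracting 1 from n. The `binary_op` helper is hypothetical, and behaviour on division by zero is not specified by this document (the sketch simply lets Python raise).

```python
def binary_op(stack, name):
    """Pop two cells and push the result of the named math instruction."""
    first = stack.pop()    # top of the stack
    second = stack.pop()   # the cell below it
    if name == "add":
        result = first + second
    elif name == "sub":
        result = first - second
    elif name == "mul":
        result = first * second
    elif name == "div":
        result = first / second
    elif name == "cmp":
        if first < second:
            result = -1.0
        elif first == second:
            result = 0.0
        else:
            result = 1.0
    else:
        raise ValueError(f"unknown instruction: {name}")
    stack.append(result)


stack = [1.0, 10.0]        # push 1, then load a register holding 10
binary_op(stack, "sub")    # 10 - 1
print(stack)
```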
IO Instruction | Description |
---|---|
outc | Pop a cell off of the stack and print it as a character |
outn | Pop a cell off of the stack and print it as a float |
inc | Get a character from STDIN and push it into the stack |
inn | Get a float from STDIN and push it into the stack |
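
The implementation notes in this document say that inn and inc return 0 on EOF and on other input errors. A Python sketch of that behaviour might look like the following. The function names mirror the instructions, but the details are assumptions: in particular, reading a whole line for `inn` is a choice made for this sketch, and the document does not say how much input a real implementation consumes.

```python
import io


def inc(stream):
    """Read one character; return its code, or 0.0 at EOF."""
    ch = stream.read(1)
    return float(ord(ch)) if ch else 0.0


def inn(stream):
    """Read one line and parse it as a float; return 0.0 at EOF or on bad input."""
    line = stream.readline()
    try:
        return float(line)
    except ValueError:
        return 0.0


source = io.StringIO("A")
print(inc(source))   # code of 'A'
print(inc(source))   # EOF, so 0.0
```

The returned value is what the instruction would push onto the stack.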
Control Instruction | Description |
---|---|
loop | Marks the start of a loop. At the start of each iteration, a test value is popped from the stack. While the value is not zero, the loop continues. Else, the loop jumps to the matching endloop |
endloop | Marks the end of a loop |
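
The loop semantics can be sketched as a tiny Python interpreter. This is a toy under stated assumptions, not lasm's implementation: `match_loops` and `run` are hypothetical helpers, only `push`, `add`, `dup`, `loop` and `endloop` are handled, and the toy accepts `push -1` even though this document does not say whether lasm's grammar allows negative literals.

```python
def match_loops(program):
    """Map each loop index to its endloop index and back."""
    pairs, open_loops = {}, []
    for index, instr in enumerate(program):
        if instr == "loop":
            open_loops.append(index)
        elif instr == "endloop":
            start = open_loops.pop()
            pairs[start], pairs[index] = index, start
    return pairs


def run(program):
    """Execute the toy program; return (final stack, loop iterations run)."""
    pairs = match_loops(program)
    stack, iterations, pc = [], 0, 0
    while pc < len(program):
        instr = program[pc]
        if instr == "loop":
            if stack.pop() == 0:
                pc = pairs[pc]       # skip to the matching endloop
            else:
                iterations += 1
        elif instr == "endloop":
            pc = pairs[pc]           # go back and pop a new test value
            continue
        elif instr.startswith("push "):
            stack.append(float(instr.split()[1]))
        elif instr == "add":
            stack.append(stack.pop() + stack.pop())
        elif instr == "dup":
            stack.append(stack[-1])
        pc += 1
    return stack, iterations


# Count down from 3: the duplicate of the counter is the test value.
countdown = ["push 3", "dup", "loop", "push -1", "add", "dup", "endloop"]
print(run(countdown))
```

Note that the body must leave a fresh test value on the stack before endloop, which is why the fibonacci example reloads n at the end of its loop.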
examples
This assembly language is a bit simpler than most others because portability and compactness are its two main goals. As a result, the examples are pretty simple.
fibonacci
This simply implements fibonacci by doing arithmetic on three variables `a`, `b`, and `c`.
To simplify outputting the numbers, a few helper procedures are defined.
// comments are C-style
// The `stack_size` flag can ONLY be used at the top of the file.
// Anywhere else, this flag will show up as a syntax error.
// The purpose of the flag is to set the size of memory used
// outside of the statically determined memory. Any loads,
// pushes, allocs, etc. require a bit of memory on the stack.
// If this flag is not present, 256 cells are used by default.
stack_size 1024
// The start procedure is the entry point
proc start
// Declare the registers we will use
define a, 1
// Push 0 and store it in 'a'
push 0 st a
define b, 1
// Push 1 and store it in 'b'
push 1 st b
define c, 1
// Push 0 and store it in 'c'
push 0 st c
// This will determine the number of times to iterate
define n, 1
// Push 10 and store it in 'n'
push 10 st n
// loop while n is not zero
ld n
loop
ld a st c // c = a
ld b st a // a = b
ld a call print_num // print a
ld c ld b add st b // b = c + b
push 1
ld n
// subtract 1 from n
sub
// store the result in n again
st n
// Load n again for the loop test
ld n
endloop
endproc
proc print_num
// the define keyword takes two arguments,
// the name of the register and the size of
// the newly created register.
// This simply tells the assembler to allocate permanent
// space for a register with a given size. It also tells
// the assembler how many cells to pop off of the stack when
// storing a value in this register.
define n, 1
// When we call print_num, we expect a single argument on the
// stack. So, we store this argument in the register n for later
// usage.
st n
// Now we load the value stored in n back onto the stack
// and print the value as a number
ld n outn
// Now we print a newline using the newline procedure
call nl
endproc
proc nl
// 10 (the character code for '\n') is pushed onto the stack
// and printed out as a character
push 10 outc
endproc
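
To check what the lasm program above computes, the same loop can be mirrored in Python. This is a hand translation for illustration only: `fibonacci_trace` is a hypothetical helper, and it collects the values passed to `print_num` without modelling how outn formats a float.

```python
def fibonacci_trace(count):
    """Mirror the lasm loop and return the values it would print."""
    a, b, c = 0.0, 1.0, 0.0
    n = float(count)
    printed = []
    while n != 0:            # loop test: the value of n
        c = a                # ld a st c
        a = b                # ld b st a
        printed.append(a)    # ld a call print_num
        b = c + b            # ld c ld b add st b
        n = n - 1            # push 1 ld n sub st n
    return printed


print(fibonacci_trace(10))
```

With n set to 10, as in the lasm source, the values printed are 1, 1, 2, 3, 5, 8, 13, 21, 34 and 55.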
implementation
lasm's implementation is very simple: there are very few instructions to implement when targeting a new programming language. Additionally, lasm's structure maps easily onto low level languages.
There are a few very important notes for lasm's implementation:
- lasm's memory is implemented using an array of double precision floats, or 64 bit floats
- lasm tracks allocs and frees for each individual cell of the memory array. This is most simply done using an array of booleans with identical length to the data tape
- allocating more than the available amount of memory is undefined behavior (if possible, this should cause the program to exit)
- the implementation should always mark memory reserved for registers as allocated (so that alloc may not return a pointer to register memory)
- memory reserved for registers always lies immediately before the stack
- the accumulator register always lies at address `0`
- the stack pointer register always lies at address `1`
- user defined registers lie between the stack pointer register and the stack
- the `inn` and `inc` instructions return `0` on `EOF` and on other input errors
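
The memory notes above can be sketched in Python as follows. This is a minimal model under assumptions, not the reference implementation: the `Memory` class is hypothetical, searching for free cells from the top of the tape downward is a choice made for this sketch, and raising `MemoryError` stands in for "exit if possible" when an allocation cannot be satisfied.

```python
class Memory:
    """Toy model of lasm's memory layout.

    Cell 0 is the accumulator, cell 1 is the stack pointer, user defined
    registers follow, and the remaining cells are shared by the stack
    and by alloc.
    """

    def __init__(self, register_cells, stack_size=256):
        self.stack_start = 2 + register_cells
        total = self.stack_start + stack_size
        self.tape = [0.0] * total          # every cell is a 64 bit float
        self.allocated = [False] * total   # one flag per cell
        # Register memory is always marked as allocated.
        for address in range(self.stack_start):
            self.allocated[address] = True

    def alloc(self, size):
        # Assumption: scan downward from the top of the tape for `size`
        # consecutive free cells, to stay away from the stack.
        free_run = 0
        for address in range(len(self.tape) - 1, self.stack_start - 1, -1):
            free_run = free_run + 1 if not self.allocated[address] else 0
            if free_run == size:
                for cell in range(address, address + size):
                    self.allocated[cell] = True
                return address
        raise MemoryError("not enough free cells")

    def free(self, address, size):
        for cell in range(address, address + size):
            self.allocated[cell] = False


memory = Memory(register_cells=4, stack_size=8)
block = memory.alloc(3)
print(block, memory.stack_start)
```

Because the register cells are flagged as allocated from the start, `alloc` in this sketch can never hand out a pointer into register memory.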