5 releases (breaking)
0.6.0 | Jan 8, 2025 |
---|---|
0.5.0 | Jan 8, 2025 |
0.4.0 | Jan 8, 2025 |
0.3.0 | Jan 6, 2025 |
0.2.0 | Jan 6, 2025 |
#23 in Caching
587 downloads per month
165KB
3K
SLoC
Hecate Virtual Machine
This project provides an implementation of the Hecate virtual machine.
- Where does the name come from?
- ATTENTION
- Features
- Getting Started
- Usage
- Performance Tracking
- Registers
- Bytecode Instruction Set
- Caches and Memory Access
- Sources
Where does the name come from?
The project is named Hecate, inspired by the ancient Greek goddess associated with magic, crossroads, and guiding transformations. Just as Hecate stood at the intersection of possibilities, this CPU project represents the convergence of computational logic, optimization, and problem-solving, providing participants with a structured yet flexible virtual machine to explore and optimize their solutions.
The name reflects both the mystical allure of programming challenges and the power of guiding complex tasks to completion.
ATTENTION
This project is currently unstable. Changes are to be expected. Use at your own risk!
Features
-
Bytecode Execution:
- Supports a well-defined set of instructions for computation, memory operations, and control flow.
- Instructions include:
- Arithmetic operations (Add, Sub, Mul, Div)
- Memory access (LoadValue, LoadMemory, Store)
- Stack manipulation (PushValue, PushReg, Pop)
- Control flow (Call, Ret, Jmp)
- Debugging (Inspect)
- Halting (Halt)
-
Performance Metrics:
- Tracks the number of cycles executed.
- Measures memory access performance using a multi-level cache scoring system (L1, L2, L3).
-
Debugging and Verbose Mode:
- NOT YET IMPLEMENTED Provides step-by-step debugging modes (StepOver, StepInto, StepOut) with support for code and data breakpoints.
- Outputs detailed logs of executed instructions in verbose mode.
-
Memory Model:
- Flat memory structure with stack management.
- Supports configurable memory size and register count.
Getting Started
Prerequisites
- Rust (for building the project)
Usage
-
Clone the repository: git clone cd
-
Run the demo project: cargo run --release -- run-asm demo.hasm
You can also use the provided assembler to produce bytecode by running cargo run -p hasm <input> <output>
Performance Tracking
The CPU tracks performance using the following stats:
Stat | Description |
---|---|
Cycles | Cumulative score based on average number of cycles[^1]. This includes cycle count for all executed instructions as well as for every memory access. |
Memory Access Count | The total amount memory is requested from an address. This includes values that are already cached. |
Cache hits (L1D, L1I, L2, L3) | How often each cache was hit. |
Registers
When used as a library, the CPU can be instantiated with as many registers as desired.
When using the pre-made binary, the CPU has 31 general purpose registers. R1-R31. There is als a register called R0 which ignores writes and will always return 0.
Bytecode Instruction Set
For simplicity, this implementation operates on 32 bit unsigned integers. This might be changed later, but for now this should be enough.
Important: the cycle count mentioned here does not include memory access and is just for the instruction itself. For memory access cycle counts see the section below
Opcode | Mnemonic | Description | Cycles |
---|---|---|---|
0x00 | Nop | No operation | 1 |
0x01 | LoadValue | Load immediate value into a register | 1 |
0x02 | LoadMemory | Load value from memory into a register | 1 |
0x03 | LoadReg | Load value from one register into another register | 1 |
0x04 | LoadFromRegMemory | Load value from memory into a register, where the address is read from a register | 1 |
0x05 | Store | Store a register's value into memory | 1 |
0x06 | StoreValue | Store an immediate value into memory | 1 |
0x07 | StoreAtReg | Store a register's value into memory, where the address is read from a register | 1 |
0x08 | StoreValueAtReg | Store an immediate value into memory, where the address is read from a register | 1 |
0x09 | PushValue | Push immediate value onto the stack | 1 |
0x0a | PushReg | Push register value onto the stack | 1 |
0x0b | Pop | Pop a value from the stack into a register | 1 |
0x11 | Add | Add two register values | 1 |
0x12 | AddValue | Add an immediate value to a register | 1 |
0x13 | FAdd | Add two floating-point register values | 2 |
0x14 | FAddValue | Add an immediate floating-point value to a register | 2 |
0x15 | Sub | Subtract one register value from another | 1 |
0x16 | SubValue | Subtract an immediate value from a register | 1 |
0x17 | FSub | Subtract one floating-point register value from another | 2 |
0x18 | FSubValue | Subtract an immediate floating-point value from a register | 2 |
0x19 | Mul | Multiply two register values | 4 |
0x1A | MulValue | Multiply a register value by an immediate value | 4 |
0x1B | FMul | Multiply two floating-point register values | 4 |
0x1C | FMulValue | Multiply a register value by an immediate floating-point value | 4 |
0x1D | Div | Divide one register value by another | 27 |
0x1E | DivValue | Divide a register value by an immediate value | 27 |
0x1F | FDiv | Divide one floating-point register value by another | 27 |
0x20 | FDivValue | Divide a register's floating-point value by an immediate value | 27 |
0x70 | And | Binary AND two registers | 1 |
0x71 | AndValue | Binary AND a register and an immediate value | 1 |
0x72 | Or | Binary OR two registers | 1 |
0x73 | OrValue | Binary OR a register and an immediate value | 1 |
0x74 | Xor | Binary XOR two registers | 1 |
0x75 | XorValue | Binary XOR a register and an immediate value | 1 |
0x76 | Not | Binary NOT a register | 1 |
0x77 | ShiftLeft | Shifts the value in the first register to the left by the value in the second register | 1 |
0x78 | ShiftLeftValue | Shifts the value in the first register to the left by an immediate value | 1 |
0x79 | ShiftRight | Shifts the value in the first register to the right by the value in the second register | 1 |
0x7a | ShiftRightValue | Shifts the value in the first register to the right by an immediate value | 1 |
0xB0 | LoadByte | Load a byte from memory into a register | 2 |
0xB1 | StoreByte | Store the least significant byte of a register into memory | 2 |
0xC00 | Jmp | Unconditional jump to a specific address | 1 |
0xC01 | Cmp | Compare two register values | 1 |
0xC02 | CmpValue | Compare a register value with an immediate value | 1 |
0xC03 | FCmp | Compare two floating-point register values | 2 |
0xC04 | FCmpValue | Compare a register's floating-point value with an immediate value | 2 |
0xC05 | Je | Jump if equal (Zero Flag is set) | 2 |
0xC06 | Jne | Jump if not equal (Zero Flag is not set) | 2 |
0xC07 | Jg | Jump if greater (Zero Flag not set and Sign Flag equals Overflow Flag) | 2 |
0xC08 | Jge | Jump if greater or equal (Sign Flag equals Overflow Flag) | 2 |
0xC09 | Jl | Jump if less (Sign Flag not equal to Overflow Flag) | 2 |
0xC0A | Jle | Jump if less or equal (Zero Flag is set or Sign Flag not equal to Overflow Flag) | 2 |
0xC0B | Ja | Jump if above (Carry Flag not set and Zero Flag not set) | 2 |
0xC0C | Jae | Jump if above or equal (Carry Flag not set) | 2 |
0xC0D | Jb | Jump if below (Carry Flag is set) | 2 |
0xC0E | Jbe | Jump if below or equal (Carry Flag is set or Zero Flag is set) | 2 |
0xC0F | Jc | Jump if carry (Carry Flag is set) | 2 |
0xC10 | Jnc | Jump if no carry (Carry Flag is not set) | 2 |
0xC11 | Jo | Jump if overflow (Overflow Flag is set) | 2 |
0xC12 | Jno | Jump if no overflow (Overflow Flag is not set) | 2 |
0xC13 | Js | Jump if sign (Sign Flag is set) | 2 |
0xC14 | Jns | Jump if no sign (Sign Flag is not set) | 2 |
0xCFF | Jxcz | Jump if CX register is zero (ignores flags) | 2 |
0xF0 | Call | Call a function by pushing return address and jumping | 25 |
0xF1 | Ret | Return from a function by popping the return address | 5 |
0xF2 | Syscall | Perform a system call (currently every syscall is a no-op) | Variable (Depends on HostIO) |
0xFFFFFFF0 | Inspect | Output the value of a memory address (special instruction for debugging, therefore: no cycle cost) | 0 |
0xFFFFFFFF | Halt | Halt the program execution | 0 |
Caches and Memory Access
Caches
Caching in this VM is simulated. This means the VM never actually copies values into a cache, it just remembers which memory locations are in which cache.
The VM uses hierarchical caching (L1, L2, L3) and has separate L1 caches for data (L1D) and instructions (L2D).
Memory Access
Write Access
In real hardware the CPU would write to cache and the hardware would eventually write back that value to main memory. As this process does not stall the CPU (AFAIK) the VM just assumes a cycle cost of 1.
Read Access
- L1 Cache Hit: 3 Cycles
- L2 Cache Hit: 11 Cycles
- L3 Cache Hit: 50 Cycles
- Main Memory: 125 Cycles
Prefetching and Branch Prediction
There is currently a simple prefetcher which fetches 2 cache lines of data on access and 10 lines of instructions. There is currently no branch predictor. Once there is, it will affect how prefetching works.
Sources
[^1]: IT Hare: Infographics: Operation Costs in CPU Clock Cycles
Dependencies
~14–25MB
~444K SLoC