Vighnesh Iyer, Bora Nikolic
Safin Singh, Ansh Maroo, Connor Chang, Pramath Krishna, Vighnesh Iyer, Joonho Whangbo
How can we run RTL simulation with fast startup and high throughput?
Don't run the full workload in RTL simulation
Use an instruction set simulator (ISS) and pick samples to run in RTL simulation
The full workload is represented by a selection of sampling units.
A sampling unit is defined by an architectural checkpoint.
The microarchitectural state of the RTL simulation starts at the reset state!
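The flow above can be sketched roughly as follows. This is a toy model under stated assumptions, not the actual tool: `ArchState`, `toy_iss_step`, and `collect_checkpoints` are hypothetical names, and the "ISS" only bumps a counter instead of executing instructions. It shows the ISS running the whole workload while capturing an architectural checkpoint at the start of each selected sampling unit.

```rust
/// Hypothetical architectural state: only what the ISA defines.
/// Caches, predictors, and other microarchitectural state are deliberately
/// absent, so RTL simulation restored from this starts them at reset.
#[derive(Clone, Debug, PartialEq)]
pub struct ArchState {
    pub pc: u64,
    pub regs: [u64; 32],
    pub instret: u64, // instructions retired so far
}

/// Stand-in for one ISS step (assumption: a real ISS would fetch, decode,
/// and execute an instruction here). Toy behavior: advance the PC, bump x1.
pub fn toy_iss_step(s: &mut ArchState) {
    s.pc = s.pc.wrapping_add(4);
    s.regs[1] = s.regs[1].wrapping_add(1);
    s.instret += 1;
}

/// Run `total` instructions in the ISS. The workload is split into sampling
/// units of `unit_len` instructions; every `stride`-th unit is selected and
/// an architectural checkpoint is captured at its first instruction.
pub fn collect_checkpoints(total: u64, unit_len: u64, stride: u64) -> Vec<ArchState> {
    assert!(unit_len > 0 && stride > 0);
    let mut state = ArchState { pc: 0x8000_0000, regs: [0; 32], instret: 0 };
    let mut checkpoints = Vec::new();
    while state.instret < total {
        let unit_index = state.instret / unit_len;
        if state.instret % unit_len == 0 && unit_index % stride == 0 {
            checkpoints.push(state.clone());
        }
        toy_iss_step(&mut state);
    }
    checkpoints
}
```

For example, `collect_checkpoints(100_000, 10_000, 2)` selects units 0, 2, 4, 6, and 8; each resulting checkpoint would seed one RTL simulation.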
wikisort benchmark from Embench, $N = 10000$, $C = 18$, $n_{\text{detailed}} = 2000$
huffbench benchmark from Embench, $N = 10000$, $C = 18$, $n_{\text{detailed}} = 2000$
Problem 1: Embench, CoreMark, ... are not interesting enough.
Problem 2: No systematic methodology for complete checkpointing and injection.
build.rs for programmatic code generation
alloc and collections
no_std crates that work baremetal
Heavily used libraries in the wild + baremetal = good
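As an illustration of programmatic code generation from a build script, here is a minimal sketch. Everything in it is an assumption for illustration: the helper name `generate_input_table`, the idea of generating benchmark inputs, and the file name mentioned in the comment are not taken from the project.

```rust
use std::fmt::Write;

/// Hypothetical helper for a `build.rs`: emit Rust source for a static array
/// of pseudo-random `u32` inputs, so a baremetal benchmark gets its data at
/// compile time without a filesystem.
pub fn generate_input_table(name: &str, len: usize, seed: u64) -> String {
    let mut x = seed;
    let mut src = String::new();
    writeln!(src, "pub static {}: [u32; {}] = [", name, len).unwrap();
    for _ in 0..len {
        // 64-bit linear congruential generator (Knuth's MMIX constants);
        // the high 32 bits are used as the value.
        x = x
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        writeln!(src, "    {},", (x >> 32) as u32).unwrap();
    }
    src.push_str("];\n");
    src
}

// In a real build.rs, `main` would write this string to a file under the
// directory named by the OUT_DIR environment variable, and the crate would
// pull it in with:
//     include!(concat!(env!("OUT_DIR"), "/inputs.rs"));
```

Because the generator is deterministic for a given seed, rebuilding the benchmark reproduces the same input data.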
This is a purely experimental project
struct
use serde::Serialize; // assumed: the derive comes from the serde crate

#[derive(Serialize)]
pub struct Cpu {
    pub regs: [u64; 32],
    pub pc: u64,
    pub csrs: Csrs,
}

#[derive(Serialize)]
pub struct System {
    pub cpus: Vec<Cpu>,
    bus: Bus,
}
Disentangling state and update rules seems obvious, but Spike does not do it.
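One way to picture the separation is a plain-data state type plus update rules written as pure functions over it. The sketch below is an assumption-laden illustration, not the project's code: `CpuState` is a cut-down stand-in for the `Cpu` struct, and `Inst` and `step` are hypothetical names covering only ADDI.

```rust
/// Plain-data architectural state (hypothetical, reduced version of `Cpu`).
#[derive(Clone, Debug, PartialEq)]
pub struct CpuState {
    pub regs: [u64; 32],
    pub pc: u64,
}

/// Hypothetical decoded instruction; only ADDI, to keep the sketch small.
pub enum Inst {
    Addi { rd: usize, rs1: usize, imm: i64 },
}

/// Update rule: a pure function from the old state to the new state.
/// The state type knows nothing about how it is updated, so it can be
/// cloned, compared, or serialized as a checkpoint at any point.
pub fn step(old: &CpuState, inst: &Inst) -> CpuState {
    let mut new = old.clone();
    match *inst {
        Inst::Addi { rd, rs1, imm } => {
            // x0 is hard-wired to zero, so writes to it are dropped.
            if rd != 0 {
                new.regs[rd] = old.regs[rs1].wrapping_add(imm as u64);
            }
        }
    }
    new.pc = old.pc.wrapping_add(4);
    new
}
```

Since `step` never mutates its input, a checkpoint is just a copy of the state value, and restoring one means handing that copy back to `step`.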
ADLs don't need a new language! It's hardware after all.
riscv-rt: crt.S, panic! handlers, interrupt handlers
memory.x: defines the accessible raw address space + program segment mapping
target = "riscv64gc-unknown-none-elf": that's all it takes to pull in a cross compiler!
no_std dependency with a custom allocator
cargo build like any other Rust project!
riscv-tests/benchmarks
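To make the "custom allocator" item concrete, here is a minimal sketch of a bump allocator built only on `core`, so it would also compile in a `no_std` crate. It is not the allocator the project uses: `BumpAlloc`, `Heap`, and `HEAP_SIZE` are hypothetical, and the sketch assumes the target supports atomic compare-and-swap (as the A extension in `riscv64gc` does).

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical heap size; a real project would size this from memory.x.
const HEAP_SIZE: usize = 4096;

#[repr(align(16))]
struct Heap([u8; HEAP_SIZE]);

/// Bump allocator: hands out memory front to back and never reclaims it.
pub struct BumpAlloc {
    heap: UnsafeCell<Heap>,
    next: AtomicUsize, // offset of the first free byte
}

// Safety: offsets are claimed atomically, so allocations never overlap.
unsafe impl Sync for BumpAlloc {}

impl BumpAlloc {
    pub const fn new() -> Self {
        BumpAlloc {
            heap: UnsafeCell::new(Heap([0; HEAP_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

unsafe impl GlobalAlloc for BumpAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.heap.get() as *mut u8;
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment
            // (Layout guarantees the alignment is a power of two).
            let addr = base as usize + cur;
            let aligned = (addr + layout.align() - 1) & !(layout.align() - 1);
            let start = aligned - base as usize;
            let end = match start.checked_add(layout.size()) {
                Some(e) if e <= HEAP_SIZE => e,
                _ => return core::ptr::null_mut(), // out of memory
            };
            match self
                .next
                .compare_exchange_weak(cur, end, Ordering::Relaxed, Ordering::Relaxed)
            {
                Ok(_) => return unsafe { base.add(start) },
                Err(actual) => cur = actual, // lost a race or spurious failure: retry
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // Intentionally empty: a bump allocator never frees.
    }
}

// In a baremetal binary it would be registered with:
//     #[global_allocator]
//     static ALLOC: BumpAlloc = BumpAlloc::new();
```

A bump allocator is the simplest choice that lets `alloc`-based crates run; workloads that free and reallocate heavily would need a real free-list allocator instead.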
no_std Crates
no_std crates on crates.io that are very popular
The missing piece: stimulus!
A path towards representative, high quality baremetal benchmarks
All these components tie into a robust sampled simulation flow.
Our proposal: Combine SimPoint-style representative sampling with SMARTS-style small intervals
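A rough sketch of the representative-selection step in such a flow is shown below. It assumes the workload has already been cut into small intervals, that each interval has a feature vector (for instance a basic block vector), and that some clustering step has already assigned a label to each interval. The names `pick_representatives` and `estimate_metric` are hypothetical, and the clustering itself is left out.

```rust
/// Squared Euclidean distance between two feature vectors.
fn dist2(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// For each of the `k` clusters, pick the interval closest to the cluster
/// centroid and return `(interval_index, weight)`, where the weight is the
/// fraction of all intervals that fall in that cluster.
pub fn pick_representatives(
    features: &[Vec<f64>],
    labels: &[usize],
    k: usize,
) -> Vec<(usize, f64)> {
    assert_eq!(features.len(), labels.len());
    let dim = features.first().map_or(0, |f| f.len());
    let mut reps = Vec::new();
    for c in 0..k {
        let members: Vec<usize> = (0..labels.len()).filter(|&i| labels[i] == c).collect();
        if members.is_empty() {
            continue; // empty clusters contribute nothing
        }
        // Centroid = mean of the member feature vectors.
        let mut centroid = vec![0.0; dim];
        for &i in &members {
            for (acc, v) in centroid.iter_mut().zip(&features[i]) {
                *acc += v / members.len() as f64;
            }
        }
        let mut best = members[0];
        for &i in &members {
            if dist2(&features[i], &centroid) < dist2(&features[best], &centroid) {
                best = i;
            }
        }
        reps.push((best, members.len() as f64 / labels.len() as f64));
    }
    reps
}

/// Whole-workload estimate: weighted sum of the metric measured (for example
/// in RTL simulation) on each representative interval only.
pub fn estimate_metric(reps: &[(usize, f64)], measured: impl Fn(usize) -> f64) -> f64 {
    reps.iter().map(|&(i, w)| w * measured(i)).sum()
}
```

Only the representative intervals would need an architectural checkpoint and an RTL run; the weights extrapolate their measurements to the full workload.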