PLARCH Workshop @ ISCA 2023

New Embedded DSLs for Hardware Design and Verification

Vighnesh Iyer, Kevin Laeufer, Young-Jin Park, Rohit Agarwal, Lixiang Yin, Bryan Ngo, Oliver Yu, Koushik Sen, Borivoje Nikolić

UC Berkeley

HDL Implementation Techniques

  • Freestanding DSLs
  • Custom compilers for existing languages
    • "Reflection-based" AST analysis
  • Embedded DSLs (eDSLs)

Freestanding DSLs

  • A custom language specialized for hardware design
  • Examples: Verilog/VHDL, Pyrope, Bluespec SystemVerilog, Veryl

Freestanding DSLs

  • Full control over syntax and compiler

typedef union tagged {
    bit  [4:0] Register;
    bit [21:0] Literal;
    struct {
        bit  [4:0] regAddr;
        bit  [4:0] regIndex;
    } Indexed;
} InstrOperand;

// orand has type InstrOperand; pattern variables are written with a leading dot
case (orand) matches
    tagged Register .r : x = rf[r];
    tagged Literal .n : x = n;
    tagged Indexed { regAddr: .ra, regIndex: .ri } : x = mem[rf[ra]+ri];
endcase

Ergonomic runtime tagged unions in Bluespec SystemVerilog

Freestanding DSLs


class Packet;
    rand bit [3:0] data [];

    constraint size { data.size() > 5; data.size() < 10; }

    constraint values {
        foreach(data[i]) {
            data[i] == i + 1;
            data[i] inside {[0:8]};
        }
    }
endclass

Ergonomic declarative constrained random API in SystemVerilog

Freestanding DSLs

  • Eventually, the need for general-purpose programming constructs becomes apparent
    • Functions, data structures, iteration, type system, FFI, stdlib

Two directions:

  • Build a metaprogramming layer (e.g. Perl for Verilog)
  • Augment the DSL with more features (e.g. SystemVerilog)

Custom Compilers

  • Take an existing language and its frontend, and design a custom backend
  • Examples: Clash, SystemC HLS, MyHDL*
  • Advantages: Language reuse, direct simulation
  • Disadvantages: Implementation burden, limited to a subset of the language, fine hardware control may be difficult

Embedded DSLs (eDSLs)

  • Embed hardware primitives and operators in a general-purpose language
  • Examples: Lava, Chisel, PyMTL3, Amaranth
  • Leverage existing libraries, build tools, IDEs, testing frameworks, language features
  • Disadvantages: syntax limitations, arbitrary code generators, preserving semantics is hard

eDSL Construction

  • eDSLs provide ADTs and APIs
  • A regular program written in the host language is run to construct a description
  • An interpreter turns the description into some final output

For HDLs, the eDSL primitives are hardware components, and the interpreter turns a netlist description into FIRRTL, CIRCT IR, etc.
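The three steps above can be sketched in plain Scala. This is a toy illustration, not Chisel's API: `TinyHdl`, `Expr`, `Input`, `Lit`, and `emit` are hypothetical names, and the "interpreter" emits a Verilog-like expression string rather than FIRRTL or CIRCT IR.

```scala
// Hypothetical miniature eDSL: every name here is illustrative, not Chisel's API.
object TinyHdl {
  // ADT: the "description" that a regular Scala program constructs.
  sealed trait Expr {
    def +(that: Expr): Expr = Add(this, that)
    def &(that: Expr): Expr = And(this, that)
  }
  final case class Input(name: String, width: Int) extends Expr
  final case class Lit(value: BigInt, width: Int) extends Expr
  final case class Add(a: Expr, b: Expr) extends Expr
  final case class And(a: Expr, b: Expr) extends Expr

  // Interpreter: walks the description and emits a Verilog-like expression.
  def emit(e: Expr): String = e match {
    case Input(name, _)    => name
    case Lit(value, width) => s"$width'd$value"
    case Add(a, b)         => s"(${emit(a)} + ${emit(b)})"
    case And(a, b)         => s"(${emit(a)} & ${emit(b)})"
  }

  // Host-language features (functions, collections) act as the generator layer.
  def sum(xs: Seq[Expr]): Expr = xs.reduce(_ + _)

  val inputs: Seq[Expr] = (0 until 3).map(i => Input(s"in$i", 8))
  val circuit: Expr = sum(inputs) & Lit(15, 8)
}
```

Running `TinyHdl.emit(TinyHdl.circuit)` only happens after the Scala program has finished building the `Expr` tree, which is the split between description and interpretation that the slide describes.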

Why eDSLs?

HDLs implemented as eDSLs open the door for more eDSLs targeting other aspects of hardware design and verification

We should expand the horizons of eDSLs beyond RTL design into other complementary domains

We present three eDSLs that augment and use Chisel

SimCommand: an eDSL for High-Performance Testbenches in Scala

Testbench APIs in General Purpose Languages

  • Scala: chiseltest
  • Python: cocotb

Both provide all the benefits of being in a general-purpose language, while having fork/join primitives

However, their fork/join functionality is slow

We shouldn't have to compromise on performance

SimCommand

  • Testbench API embedded in Scala
  • Uses chiseltest as the simulator interface
  • Purely functional: testbench description and interpretation are split

def enqueue(data: T): Command[Unit] = for {
    _ <- poke(io.bits, data)
    _ <- poke(io.valid, true.B)
    _ <- waitUntil(io.ready, true.B)
    _ <- step(1)
    _ <- poke(io.valid, false.B)
} yield ()
    

Fork/Join


val pushNPop: Command[Boolean] = for {
    enqThread <- fork(enqueue(100.U))
    deqThread <- fork(dequeue())
    _         <- join(enqThread)
    data      <- join(deqThread)
} yield data.litValue == 100

test(new Queue(UInt(8.W), 4)) { c =>
    val allGood = run(pushNPop, c.clock)
    assert(allGood)
}

Interpreter / Scheduler

  • On each timestep
    • Run every thread until a step, join, or return
    • Collect any new threads spawned
    • Repeat until a fixpoint is reached
  • Step the clock
  • Repeat until the main thread returns
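A minimal sketch of such a scheduler, assuming a free-monad style `Command` ADT. None of this is SimCommand's actual implementation: `MiniSim`, `Exec`, `Handle`, and `Task` are hypothetical names, `Exec` stands in for poke/peek calls into a simulator, and the "clock" is just a counter.

```scala
// Illustrative sketch only: names and structure are assumptions, not SimCommand's API.
object MiniSim {
  final class Handle[R] { var result: Option[R] = None } // filled in when a thread returns

  sealed trait Command[+R] {
    def flatMap[S](f: R => Command[S]): Command[S] = Bind[R, S](this, f)
    def map[S](f: R => S): Command[S] = Bind[R, S](this, r => Return(f(r)))
  }
  final case class Return[R](value: R) extends Command[R]
  final case class Exec[R](thunk: () => R) extends Command[R] // stand-in for poke/peek
  final case class Step(cycles: Int) extends Command[Unit]    // assumes cycles >= 1
  final case class Fork[R](body: Command[R]) extends Command[Handle[R]]
  final case class Join[R](handle: Handle[R]) extends Command[R]
  final case class Bind[A, R](cmd: Command[A], f: A => Command[R]) extends Command[R]

  private final class Task(var cmd: Command[Any], val handle: Handle[Any]) { var sleep = 0 }

  // Run one thread until a step, an unresolved join, or a return.
  // Returns true if the thread executed at least one command.
  private def advance(t: Task, spawn: Task => Unit): Boolean = {
    var progressed = false
    var blocked = false
    while (!blocked) t.cmd match {
      case Return(v) => t.handle.result = Some(v); blocked = true; progressed = true
      case b: Bind[_, _] =>
        val k = b.f.asInstanceOf[Any => Command[Any]]
        b.cmd match {
          case Return(v)   => t.cmd = k(v); progressed = true
          case Exec(thunk) => t.cmd = k(thunk()); progressed = true
          case Step(n)     => t.sleep = n; t.cmd = k(()); blocked = true; progressed = true
          case fk: Fork[_] =>
            val h = new Handle[Any]
            spawn(new Task(fk.body, h)); t.cmd = k(h); progressed = true
          case j: Join[_] => j.handle.result match {
            case Some(v) => t.cmd = k(v); progressed = true
            case None    => blocked = true
          }
          case inner: Bind[_, _] => // re-associate nested binds to the right
            val g = inner.f.asInstanceOf[Any => Command[Any]]
            t.cmd = Bind[Any, Any](inner.cmd, x => Bind[Any, Any](g(x), k))
        }
      case other => t.cmd = Bind[Any, Any](other, x => Return(x))
    }
    progressed
  }

  // Returns the main thread's result and the number of clock steps taken.
  // No deadlock detection: a join on a thread that never returns loops forever.
  def run[R](main: Command[R]): (R, Int) = {
    val mainTask = new Task(main, new Handle[Any])
    var tasks = Vector(mainTask)
    var cycle = 0
    while (mainTask.handle.result.isEmpty) {
      var progress = true
      while (progress) { // repeat until a fixpoint is reached
        progress = false
        var i = 0
        while (i < tasks.length) { // `tasks` may grow as threads are forked
          val t = tasks(i)
          if (t.sleep == 0 && t.handle.result.isEmpty && advance(t, nt => tasks = tasks :+ nt))
            progress = true
          i += 1
        }
      }
      if (mainTask.handle.result.isEmpty) { // step the clock
        cycle += 1
        tasks = tasks.filter(_.handle.result.isEmpty)
        tasks.foreach(t => if (t.sleep > 0) t.sleep -= 1)
      }
    }
    (mainTask.handle.result.get.asInstanceOf[R], cycle)
  }

  // Usage: two forked workers that finish on different cycles.
  val log = scala.collection.mutable.ArrayBuffer.empty[String]
  def worker(name: String, cycles: Int): Command[Int] = for {
    _ <- Step(cycles)
    _ <- Exec(() => log += s"$name done")
  } yield cycles

  val program: Command[Int] = for {
    a <- Fork(worker("a", 3))
    b <- Fork(worker("b", 1))
    x <- Join(a)
    y <- Join(b)
  } yield x + y
}
```

Because threads are plain data advanced by a loop, forking needs no OS threads or coroutines. That is one plausible reason such a design can be fast, though this sketch makes no performance claim.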

The SimCommand eDSL

  • Core ADT type is a Command[R] which describes a testbench operation that terminates with a value of type R
  • Leverage Chisel for RTL IO datatypes
  • Leverage Scala's for-comprehension syntax for monadic composition of Commands
  • 10-20x faster than cocotb and chiseltest

An eDSL For Imperative and Declarative Parametric Stimulus Generation

Hybrid Stimulus Generators

  • Two types of generators
    • Imperative generators (QuickCheck's Gen)
    • Declarative constraint solvers (SystemVerilog constrained random)
  • We propose a hybrid API that:
    • Can mix both generator types
    • Leverages Chisel for hardware datatypes and as a constraint language

Imperative Generation eDSL API


val intGen: Gen[Int] = Gen[Int].range(0, 100)

// lazy: seqGen refers to itself, so it must not be evaluated eagerly
lazy val seqGen: Gen[Seq[Int]] = for {
  lit <- Gen.range(1, 100)
  tailGen <- Gen.oneOf(Gen(Seq()) -> 0.1, seqGen -> 0.9)
  seqn <- tailGen.map(t => lit +: t)
} yield seqn

Use Scala's for-comprehensions for monadic composition
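One way to see why for-comprehensions work here: a generator only needs `map` and `flatMap`. The sketch below is an assumption about how such a `Gen` could be built, not the authors' implementation. `MiniGen`, `sample`, and `const` are hypothetical names, `range` is assumed to be half-open, and `oneOf` is simplified to sample directly from the chosen generator.

```scala
import scala.util.Random

// Illustrative sketch: a generator is a function from a randomness source to a value.
object MiniGen {
  final case class Gen[A](sample: Random => A) {
    def map[B](f: A => B): Gen[B] = Gen(r => f(sample(r)))
    // Sequencing: draw a value, then use it to choose the next generator.
    def flatMap[B](f: A => Gen[B]): Gen[B] = Gen(r => f(sample(r)).sample(r))
  }

  object Gen {
    def const[A](a: A): Gen[A] = Gen(_ => a)

    // Uniform integer in [lo, hi); the half-open range is an assumption of this sketch.
    def range(lo: Int, hi: Int): Gen[Int] = Gen(r => lo + r.nextInt(hi - lo))

    // Weighted choice: pick one generator by weight, then sample from it.
    def oneOf[A](choices: (Gen[A], Double)*): Gen[A] = Gen { r =>
      val total = choices.map(_._2).sum
      val x = r.nextDouble() * total
      val cumulative = choices.scanLeft(0.0)(_ + _._2).tail
      val idx = cumulative.indexWhere(x < _)
      choices(if (idx >= 0) idx else choices.length - 1)._1.sample(r)
    }
  }

  // Recursive generator: a `def` delays the self-reference until sampling time.
  def seqGen: Gen[Seq[Int]] = for {
    lit  <- Gen.range(1, 100)
    tail <- Gen.oneOf(Gen.const(Seq.empty[Int]) -> 0.1, seqGen -> 0.9)
  } yield lit +: tail
}
```

Calling `MiniGen.seqGen.sample(new Random(10))` yields a non-empty sequence whose length follows a geometric distribution, since each step stops with probability 0.1.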

Generating Chisel Datatypes


val hwUIntGen: Gen[UInt] = Gen[UInt].range(0, 100)

// Read is an assumed member; only Write is named in these slides
object MemOp extends ChiselEnum { val Read, Write = Value }
class MemTx extends Bundle {
  val addr = UInt(32.W)
  val data = UInt(64.W)
  val op = MemOp()
}
val memTxGen: Gen[MemTx] = Gen[MemTx].uniform

Leveraging Chisel for Constraints


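// Read is an assumed member; only Write is used below
object MemOp extends ChiselEnum { val Read, Write = Value }
class MemTx extends Bundle {
  val addr = UInt(32.W)
  val data = UInt(64.W)
  val op = MemOp()
}
// Constraint: write transactions whose address is 8-byte aligned
val memTxGen: Gen[MemTx] = Gen[MemTx].constrained { memTx =>
  (memTx.op === MemOp.Write) && (memTx.addr(2, 0) === 0.U)
}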

Since Chisel is an eDSL, it can be leveraged for other hardware eDSLs

Parametric Fuzzing


Gen[MemTx].generate(ScalaRandom(seed=10))
Gen[MemTx].generate(ParametricRandom(Seq[Byte](...)))

Unify imperative generators and declarative constraint solvers, and introduce parametric control over the source of randomness
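A rough sketch of what "parametric" can mean: the generator draws its choices from an abstract source, which is either a seeded PRNG or a byte sequence that a fuzzer can mutate. `RandomSource`, `SeededSource`, `ByteSource`, and the plain-Scala `MemTx` below are hypothetical stand-ins, not the API shown on the slide.

```scala
// Illustrative sketch: the same generator driven by a PRNG or by raw bytes.
object ParametricDemo {
  trait RandomSource { def nextInt(bound: Int): Int } // value in [0, bound)

  final class SeededSource(seed: Long) extends RandomSource {
    private val rng = new scala.util.Random(seed)
    def nextInt(bound: Int): Int = rng.nextInt(bound)
  }

  // Each draw consumes one byte (so bounds above 256 are not fully covered);
  // once the bytes run out, every draw yields 0.
  final class ByteSource(bytes: Seq[Byte]) extends RandomSource {
    private val it = bytes.iterator
    def nextInt(bound: Int): Int =
      if (it.hasNext) (it.next() & 0xff) % bound else 0
  }

  final case class Gen[A](generate: RandomSource => A) {
    def map[B](f: A => B): Gen[B] = Gen(src => f(generate(src)))
    def flatMap[B](f: A => Gen[B]): Gen[B] = Gen(src => f(generate(src)).generate(src))
  }
  def range(lo: Int, hi: Int): Gen[Int] = Gen(src => lo + src.nextInt(hi - lo))

  // Plain Scala stand-in for a hardware transaction type.
  final case class MemTx(addr: Int, write: Boolean)

  val memTxGen: Gen[MemTx] = for {
    word <- range(0, 64)
    w    <- range(0, 2)
  } yield MemTx(word * 8, w == 1)
}
```

With this structure, `memTxGen.generate(new ByteSource(Seq[Byte](5, 1)))` yields `MemTx(40, true)`, and changing the first byte to 6 moves the address to 48. Small mutations of the byte sequence map to small, structured changes in the generated stimulus, which is what makes the byte sequence a useful fuzzing target.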

The Stimulus Generation eDSL

  • Core ADT type is a Gen[A] which describes a generator of values of type A
  • Leverage Chisel for datatypes and as a constraint language
  • Leverage Scala's for-comprehension syntax for monadic composition of Gens

Chisel-Recipes: A Cycle-Level Imperative Control Flow eDSL

Writing Control Flow Logic

Often, hardware designers manually convert imperative control flow to an explicit FSM

This process is repetitive, mechanical, and error-prone

We can design an eDSL to directly express cycle-level control flow and an interpreter to turn it into RTL

Chisel-Recipes eDSL

  • tick(): advance a cycle
  • action { block }: perform the assignments in the block now
  • whileLoop (cond) { recipe }: loop until cond is false
  • when (cond) { recipe }: if cond is true, execute sub-recipe

def waitUntil(c: Bool) = whileLoop(!c, tick())
def forever(r: Recipe) = whileLoop(true.B, r)
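To make the cycle-level semantics of these primitives concrete, here is a software model in plain Scala. It is only a sketch under assumptions: the real eDSL compiles recipes to RTL, whereas this interpreter just executes them and counts cycles. Conditions are Scala thunks instead of Chisel `Bool`s, and `MiniRecipe`, `Sequential`, and `run` are hypothetical names.

```scala
// Illustrative software model of the primitives; not the Chisel-Recipes implementation.
object MiniRecipe {
  sealed trait Recipe
  case object Tick extends Recipe                                 // advance one cycle
  final case class Action(body: () => Unit) extends Recipe       // assignments happen now
  final case class Sequential(steps: List[Recipe]) extends Recipe
  final case class While(cond: () => Boolean, body: Recipe) extends Recipe
  final case class When(cond: () => Boolean, body: Recipe) extends Recipe

  def recipe(steps: Recipe*): Recipe = Sequential(steps.toList)
  def waitUntil(c: () => Boolean): Recipe = While(() => !c(), Tick)

  // Executes the recipe and returns the number of cycles it consumed.
  // `onTick` models the clock edge, e.g. advancing a simulated environment.
  // A While whose body takes zero cycles and whose condition never changes will not terminate.
  def run(r: Recipe, onTick: () => Unit): Int = r match {
    case Tick              => onTick(); 1
    case Action(body)      => body(); 0
    case Sequential(steps) => steps.map(run(_, onTick)).sum
    case While(c, body)    => var n = 0; while (c()) n += run(body, onTick); n
    case When(c, body)     => if (c()) run(body, onTick) else 0
  }

  // Usage: wait for a condition, act in that same cycle, then spend one more cycle.
  var clock = 0
  var seen = -1
  val prog: Recipe = recipe(
    waitUntil(() => clock >= 3),
    Action(() => seen = clock),
    Tick
  )
  val cycles: Int = run(prog, () => clock += 1)
}
```

Here `waitUntil` ticks three times, the action records the clock value without consuming a cycle, and the final `Tick` adds one more, so the recipe takes four cycles in total.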

Example

Reading a memory from an AXI-Lite port


val readOnce = recipe(
  waitUntil(axi.ar_valid === 1.B, active=axi.ar_ready),
  action {
    axi.r_data := RegNext(mem.read(axi.ar_addr))
    axi.r_valid := 1.B
  },
  waitUntil(axi.r_ready === 1.B, active=axi.r_valid),
  tick()
).compile()

Leveraging Chisel and Scala for Debug

Use Scala macros to inject source line instrumentation into eDSL primitives

Use Chisel printf and naming APIs to inject source info into RTL

Implementation

Each primitive is a go-done circuit

Opportunity to use a lightweight HLS IR (e.g. Calyx) to produce optimized FSMs
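The go-done idea can be modeled in a few lines of plain Scala. This is a toy model under assumptions, not the circuits Chisel-Recipes generates: each unit receives a `go` pulse and reports `done`, a tick delays the pulse by one register, an action passes it through in the same cycle, and sequencing wires one unit's `done` to the next unit's `go`. Loops are omitted, and all names are hypothetical.

```scala
// Toy cycle-by-cycle model of go-done composition; names and structure are assumptions.
object GoDoneModel {
  // One simulated clock cycle: takes this cycle's `go`, returns this cycle's `done`.
  trait GoDone { def cycle(go: Boolean): Boolean }

  // tick(): `done` is `go` delayed by one cycle (a single register).
  final class TickUnit extends GoDone {
    private var reg = false
    def cycle(go: Boolean): Boolean = { val done = reg; reg = go; done }
  }

  // action { ... }: fires in the same cycle as `go` (purely combinational).
  final class ActionUnit(body: () => Unit) extends GoDone {
    def cycle(go: Boolean): Boolean = { if (go) body(); go }
  }

  // Sequential composition: each stage's `done` drives the next stage's `go`.
  final class SeqUnit(stages: Seq[GoDone]) extends GoDone {
    def cycle(go: Boolean): Boolean = stages.foldLeft(go)((g, s) => s.cycle(g))
  }

  // Pulse `go` on cycle 0 and return the first cycle on which `done` is high.
  def cyclesUntilDone(unit: GoDone, limit: Int = 100): Option[Int] =
    (0 until limit).find(c => unit.cycle(c == 0))
}
```

For the sequence tick, action, tick, the action fires on cycle 1 and `done` rises on cycle 2. A chain like this is one-hot by construction, which is simple but not minimal, consistent with the slide's remark that an HLS IR could produce more optimized FSMs.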

The Chisel-Recipes eDSL

  • Core ADT type is a Recipe which describes a control flow machine
  • Leverage Chisel for RTL design and generation
  • Leverage Scala's implicits for source instrumentation and the eDSL's frontend API

Host Languages

What makes a good host language for an eDSL?

  • Algebraic data types
  • Flexible and extensible + familiar syntax
  • Monadic composition sugar (or direct style alternatives)
  • Strong macro system for source transformation and instrumentation
  • Good IDE support, stdlib, library ecosystem

Scala (3) is quite good!

Conclusion

  • HDLs implemented as eDSLs provide an extensible foundation for other eDSLs
  • Many areas of hardware design and verification would be well served by specialized eDSLs

What new eDSLs should we work on?