OR-1 dataflow CPU sketch

Assembler (asm/)

Last verified: 2026-03-07

Purpose

Translates dfasm graph assembly source into emulator-ready configurations. Bridges the gap between human-authored dataflow programs and the emulator's PEConfig/SMConfig/token structures.

Contracts

  • Exposes: assemble(source) -> AssemblyResult, assemble_to_tokens(source) -> list, run_pipeline(source) -> IRGraph, serialize_graph(IRGraph) -> str, round_trip(source) -> str
  • Guarantees: Pipeline is parse -> lower -> expand -> resolve -> place -> allocate -> codegen. Each pass returns a new IRGraph (immutable pass pattern). Errors accumulate in IRGraph.errors rather than failing fast. AssemblyResult contains valid PEConfig/SMConfig lists and seed MonadTokens.
  • Expects: Valid dfasm source conforming to dfasm.lark. Raises ValueError if any pipeline stage reports errors.
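The immutable pass pattern and error accumulation described above can be sketched as follows. This is a minimal illustration, not the code in __init__.py: the two-field IRGraph, the toy passes check_labels and check_nonempty, and the helpers run_passes and assemble_sketch are all hypothetical. The sketch also assumes errors are checked once at the end of the pipeline; the contract above does not say whether the real check happens per stage.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


# Hypothetical stand-in for the real IRGraph in ir.py; only the fields
# needed to show the pattern are modelled.
@dataclass(frozen=True)
class IRGraph:
    nodes: Tuple[str, ...] = ()
    errors: Tuple[str, ...] = ()


Pass = Callable[[IRGraph], IRGraph]


def check_labels(graph: IRGraph) -> IRGraph:
    """Toy pass: report every node name that lacks the '&' label sigil."""
    found = tuple(f"bad label: {n}" for n in graph.nodes if not n.startswith("&"))
    # replace() builds a new frozen instance; the input graph is untouched.
    return replace(graph, errors=graph.errors + found)


def check_nonempty(graph: IRGraph) -> IRGraph:
    """Toy pass: report an error when the graph has no nodes."""
    if graph.nodes:
        return graph
    return replace(graph, errors=graph.errors + ("empty graph",))


def run_passes(graph: IRGraph, passes: Tuple[Pass, ...]) -> IRGraph:
    """Thread the graph through each pass; later passes still run after errors."""
    for p in passes:
        graph = p(graph)
    return graph


def assemble_sketch(graph: IRGraph) -> IRGraph:
    """Run all passes, then raise ValueError carrying every accumulated error."""
    result = run_passes(graph, (check_labels, check_nonempty))
    if result.errors:
        raise ValueError("\n".join(result.errors))
    return result
```

Because every pass returns a fresh graph, a caller that wants diagnostics without an exception can call run_passes directly and inspect the errors field.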

Pipeline Passes

  1. Lower (lower.py): Lark CST -> IRGraph. Creates IRNodes, IREdges, IRRegions (function/location scopes), IRDataDefs, and SystemConfig from the @system pragma. Qualifies names with function scope (e.g., $main.&add). The lowered IR may still contain IRMacroCall nodes and MacroDef regions.
  2. Expand (expand.py): Macro expansion and function call wiring. Clones macro bodies, substitutes parameters (including opcodes via ${op}, placement via |${pe}, ports via :${port}, context slots via [${ctx}]), evaluates const expressions, expands variadic repetition blocks, rewrites @ret/@ret_name macro outputs, and qualifies expanded names with scope prefixes. Processes function call sites with @ret trampolines, free_ctx insertion, and cross-context wiring. After expansion, the IR contains only concrete IRNode/IREdge entries: no ParamRef placeholders, MacroDef regions, or IRMacroCall entries remain.
  3. Resolve (resolve.py): Validates all edge endpoints exist. Detects scope violations (cross-function label refs). Generates Levenshtein "did you mean" suggestions.
  4. Place (place.py): Validates explicit PE placements. Auto-places unplaced nodes via greedy bin-packing with locality heuristic (prefer PE with most connected neighbours).
  5. Allocate (allocate.py): Assigns IRAM offsets (dyadic first, then monadic). Assigns activation IDs (one per function scope per PE). Computes frame layouts with match/const/dest slot allocation. Assigns frame references (fref) for each instruction. Resolves symbolic destinations to FrameDest(target_pe, offset, act_id, port, token_kind).
  6. Codegen (codegen.py): Generates PEConfig/SMConfig with frame layouts, IRAM, and routing tables. Produces seed tokens (MonadToken) for initialization. Supports direct mode (immediate execution) and token stream mode (initialization via IRAM writes).
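The auto-placement step in the Place pass (greedy bin-packing with a locality heuristic) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function auto_place, its plain dict/list inputs, the capacity parameter, and the tie-break on lowest load are not taken from place.py.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def auto_place(
    nodes: Dict[str, Optional[int]],  # node name -> explicit PE, or None
    edges: List[Tuple[str, str]],     # (source, dest) node names
    pe_count: int,
    capacity: int,                    # hypothetical per-PE instruction limit
) -> Dict[str, int]:
    """Keep explicit placements, then greedily place the remaining nodes."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    # Validate explicit placements and count the load they create.
    placed = {name: pe for name, pe in nodes.items() if pe is not None}
    load = [0] * pe_count
    for name, pe in placed.items():
        if not 0 <= pe < pe_count:
            raise ValueError(f"{name}: explicit placement on unknown PE {pe}")
        load[pe] += 1

    for name in nodes:
        if name in placed:
            continue
        candidates = [pe for pe in range(pe_count) if load[pe] < capacity]
        if not candidates:
            raise ValueError(f"no PE has room for {name}")

        def score(pe: int) -> Tuple[int, int, int]:
            # Locality: count neighbours already placed on this PE.
            local = sum(1 for nb in neighbours[name] if placed.get(nb) == pe)
            # Most neighbours first, then lightest load, then lowest PE id.
            return (-local, load[pe], pe)

        best = min(candidates, key=score)
        placed[name] = best
        load[best] += 1
    return placed
```

For example, with two PEs of capacity 2, a node connected to one neighbour on PE 0 lands on PE 0 while room remains, and the next node spills to PE 1 once PE 0 is full.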

Dependencies

  • Uses: cm_inst (Port, MemOp, ALUOp, Instruction, FrameDest, OutputStyle, TokenKind), tokens (MonadToken, SMToken, PELocalWriteToken, FrameControlToken), sm_mod (Presence, SMCell), emu/types (PEConfig, SMConfig), lark (parser)
  • Used by: Test suite, user programs, dfgraph/ (pipeline, graph_json use ir, lower, resolve, place, allocate, errors, opcodes), monitor/ (backend uses run_pipeline and generate_direct; graph_json uses ir, opcodes)
  • Boundary: emu/ and root-level modules must NEVER import from asm/
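One way to guard the import boundary is a test that scans module source for imports reaching into asm/. The sketch below assumes the package is imported under the absolute name asm; the helper asm_imports is hypothetical and not part of the repository.

```python
import ast
from typing import List


def asm_imports(source: str) -> List[str]:
    """Return the imported module names in `source` that reach into asm/."""
    hits: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute import; relative ones are skipped.
            names = [node.module or ""]
        else:
            continue
        hits.extend(n for n in names if n == "asm" or n.startswith("asm."))
    return hits
```

A boundary test could read each .py file under emu/ and the repository root, pass its text to asm_imports, and assert that the result is empty.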

Key Decisions

  • Frozen dataclasses for IR types: follows existing tokens.py/cm_inst.py patterns
  • TypeAwareOpToMnemonicDict and TypeAwareMonadicOpsSet in opcodes.py: required because IntEnum subclasses share numeric values across types (e.g., ArithOp.ADD == 0 == MemOp.READ), so plain dict/set lookups would collide
  • Errors use IRGraph.errors accumulation: all issues are reported rather than stopping at the first error
  • # sigil for macro namespace: avoids collision with other sigils ($, &, @)
  • @ret reserved prefix for return markers: in function bodies, creates trampolines with cross-context routing and free_ctx; in macro bodies, rewrites edges to call-site destinations (no context management)
  • Per-call-site activation ID allocation: each function call site gets its own activation ID on the target PE, managed by CallSite metadata
  • Opcode parameters (${op}) resolved via MNEMONIC_TO_OP: enables generic macros like #reduce_2 add
  • Parameterized qualifiers (|${pe}, :${port}) resolved during expansion via PlacementRef, PortRef
  • Built-in macros prepended to user source: #loop_counted, #loop_while, #permit_inject (variadic), #reduce_2/_3/_4 (parameterized opcode)
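The IntEnum collision behind the type-aware containers is easy to reproduce. The miniature enums and the TypeAwareDict class below are hypothetical stand-ins, not the definitions in cm_inst.py or opcodes.py; they only show why keying on the numeric value alone is not enough and what keying on (enum class, value) looks like.

```python
from enum import IntEnum


# Hypothetical miniature enums; the real ones live in cm_inst.py.
class ArithOp(IntEnum):
    ADD = 0
    SUB = 1


class MemOp(IntEnum):
    READ = 0
    WRITE = 1


# IntEnum members compare and hash as plain ints, so ArithOp.ADD and
# MemOp.READ are the same dict key: the second entry overwrites the first.
plain = {ArithOp.ADD: "add", MemOp.READ: "read"}


class TypeAwareDict:
    """Minimal read-only mapping keyed on (enum class, value)."""

    def __init__(self, items):
        self._data = {(type(op), int(op)): name for op, name in items}

    def __getitem__(self, op):
        return self._data[(type(op), int(op))]

    def __contains__(self, op):
        return (type(op), int(op)) in self._data

    def __len__(self):
        return len(self._data)


typed = TypeAwareDict([(ArithOp.ADD, "add"), (MemOp.READ, "read")])
```

With the plain dict, looking up ArithOp.ADD returns the mnemonic stored for MemOp.READ; the type-aware mapping keeps both entries apart.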

Invariants

  • Each pass returns a new IRGraph; IRGraphs are never mutated after construction
  • Names inside function regions are always qualified: $funcname.&label
  • Macro scopes (#macro_N) don't consume activation IDs: they're inlined label namespaces
  • Expanded names are qualified: #macroname_N.&label for global macros, $func.#macro_N.&label for function-scoped macros
  • Double-scoped names in function call bodies: $func.#macro_N.&label when macro is expanded inside a function call site
  • CallSite metadata drives per-call-site activation ID allocation: each unique call location gets one activation ID on the target PE
  • After expansion, IR contains only concrete IRNode/IREdge entries; no ParamRef, MacroDef, or IRMacroCall entries remain
  • After placement, every IRNode has a PE assigned (pe is not None)
  • After allocation, every IRNode has iram_offset, act_id, fref, and mode set; destinations are ResolvedDest with concrete FrameDest
  • Frame layouts are computed per activation with match/const/dest/sink slot regions
  • Token stream order is always: SM init -> IRAM writes -> seed tokens
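The naming invariants above compose mechanically. The helper below is a hypothetical illustration of the three qualified-name shapes ($func.&label, #macro_N.&label, $func.#macro_N.&label); the function qualify and its parameters are not part of ir.py or expand.py.

```python
from typing import Optional


def qualify(
    label: str,
    func: Optional[str] = None,
    macro: Optional[str] = None,
    expansion: Optional[int] = None,  # the N in #macro_N
) -> str:
    """Build a qualified name from optional function and macro scopes."""
    if macro is not None and expansion is None:
        raise ValueError("a macro scope needs an expansion index")
    parts = []
    if func is not None:
        parts.append(f"${func}")               # function scope, '$' sigil
    if macro is not None:
        parts.append(f"#{macro}_{expansion}")  # macro scope, '#' sigil
    parts.append(f"&{label}")                  # the label itself, '&' sigil
    return ".".join(parts)


print(qualify("add", func="main"))
print(qualify("acc", macro="reduce_2", expansion=0))
print(qualify("acc", func="main", macro="loop_counted", expansion=3))
```

The three calls print $main.&add, #reduce_2_0.&acc, and $main.#loop_counted_3.&acc respectively.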

Key Files

  • __init__.py -- Public API and pipeline orchestration
  • ir.py -- All IR type definitions (IRNode, IREdge, IRGraph, IRRegion, IRDataDef, SystemConfig, SourceLoc, NameRef, ResolvedDest, MacroDef, IRMacroCall, CallSite, etc.)
  • errors.py -- Structured error types with source context (ErrorCategory, AssemblyError, format_error)
  • opcodes.py -- Mnemonic-to-opcode mapping and arity (monadic vs dyadic) classification
  • expand.py -- Macro expansion and function call wiring pass
  • builtins.py -- Built-in macro library (BUILTIN_MACROS string constant)
  • codegen.py -- AssemblyResult dataclass and both code generation modes

Gotchas

  • MemOp.WRITE arity depends on const: monadic when const is set (cell_addr from const), dyadic when const is None (cell_addr from left operand)
  • RoutingOp.FREE_FRAME (frame deallocation) and MemOp.FREE (SM free) are disambiguated by opcode type: RoutingOp vs MemOp
  • Frame layouts are computed per activation, not per instruction: all nodes in an activation share the same slot map but use different frame slots
  • fref (frame reference offset) is instruction-specific and points to different slot regions depending on instruction mode
  • Dyadic instructions must have iram_offset < matchable_offsets to use matching hardware; a dyadic instruction placed at or beyond that limit generates a warning (AC5.8)
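The WRITE arity rule and the matchable-offset limit interact with the "dyadic first, then monadic" offset order from the Allocate pass. The sketch below covers a single PE and is illustrative only: the Node class, the mnemonic strings, the DYADIC_MNEMONICS set, and assign_offsets are hypothetical and far simpler than allocate.py.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:  # hypothetical, much smaller than the real IRNode
    name: str
    opcode: str  # a mnemonic string stands in for the opcode enum
    const: Optional[int] = None


# Illustrative classification only; the real table lives in opcodes.py.
DYADIC_MNEMONICS = {"add", "sub"}


def is_dyadic(node: Node) -> bool:
    """WRITE is dyadic only when no const supplies the cell address."""
    if node.opcode == "write":
        return node.const is None
    return node.opcode in DYADIC_MNEMONICS


def assign_offsets(
    nodes: List[Node], matchable_offsets: int
) -> Tuple[Dict[str, int], List[str]]:
    """Give dyadic nodes the lowest IRAM offsets, then the monadic ones."""
    dyadic = [n for n in nodes if is_dyadic(n)]
    monadic = [n for n in nodes if not is_dyadic(n)]
    offsets: Dict[str, int] = {}
    warnings: List[str] = []
    for offset, node in enumerate(dyadic + monadic):
        offsets[node.name] = offset
        if is_dyadic(node) and offset >= matchable_offsets:
            # Accumulate a warning instead of stopping at the first problem.
            warnings.append(
                f"{node.name}: dyadic at offset {offset} >= {matchable_offsets}"
            )
    return offsets, warnings
```

Placing dyadic instructions first keeps as many of them as possible below the limit, so a warning appears only when one PE holds more dyadic instructions than matchable offsets.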