OR-1 dataflow CPU sketch

Token Format Migration Design

Summary

This design plan migrates the OR1 dataflow CPU emulator and assembler from the old 2-bit token type encoding to a simpler 1-bit one. The old scheme used a 2-bit type field to distinguish four categories of token: CM (computation), SM (structure memory), IO, and a "system" catch-all that handled IRAM loading, routing configuration, and other control operations. The system category is being eliminated. IO is folded into the SM address space (the IO module becomes a memory-mapped region on SM0), and IRAM loading becomes a new CM token subtype called IRAMWriteToken. The result is a 1-bit discriminator: a token is headed for either a Structure Memory unit or a Computation Module, with no special cases in the network router.

Alongside the routing simplification, the Structure Memory gains a two-tier address model. Addresses below a configurable boundary (default 256) retain the existing I-structure semantics — presence tracking, deferred reads, write-once enforcement, and atomic operations. Addresses at or above that boundary become raw shared storage (T0): no presence checking, no deferral, and a single store shared across all SM instances. A new EXEC opcode reads pre-formed tokens from T0 and injects them directly into the network, providing both the system bootstrap path (loading programs from ROM at reset) and a runtime mechanism for bulk token emission. The migration is implemented bottom-up — type definitions first, then emulator core, then assembler and tools, then test updates — with each phase building on the previous.

Definition of Done

Migrate the OR1 emulator, assembler, and supporting tools from the old 2-bit type field token format to the 1-bit SM/CM split with prefix encoding, as specified in the updated design notes. The old token hierarchy (SysToken, CfgToken, LoadInstToken, RouteSetToken, IOToken) is eliminated. SM gains memory tier support (T0 shared raw storage, T1 per-SM I-structure) and new opcodes. All existing tests pass (updated as needed) and new tests cover the changed functionality.

Specifically:

  • tokens.py: Remove SysToken, CfgToken, IOToken, LoadInstToken, RouteSetToken. Add IRAMWriteToken as a CMToken subclass. DyadToken retains its wide: bool field.
  • cm_inst.py: Remove CfgOp enum. Extend MemOp with new opcodes (EXEC, SET_PAGE, WRITE_IMM, RAW_READ, EXT) as a flat internal enum (no encoding tier enforcement).
  • emu/sm.py: T0/T1 memory tier split. T0 is a shared list[Token] across all SMs, with a configurable boundary (default: address 256). New SM opcodes implemented (at minimum EXEC for bootstrap, RAW_READ). Presence metadata widened to 4 bits (presence:2 + is_wide:1 + spare:1).
  • emu/pe.py: Handle IRAMWriteToken instead of CfgToken/LoadInstToken. Remove RouteSetToken handling.
  • emu/network.py: Simplify routing to 1-bit SM/CM isinstance check. Remove IOToken/CfgToken routing.
  • asm/: Update codegen to emit IRAMWriteToken. Remove all CfgOp references. Add T0 boundary awareness for SM address allocation.
  • dfgraph/: Update CfgOp category handling.
  • All tests pass.

Acceptance Criteria

token-migration.AC1: Old token types removed

  • token-migration.AC1.1 Success: tokens.py has no SysToken, CfgToken, IOToken, LoadInstToken, or RouteSetToken classes
  • token-migration.AC1.2 Success: cm_inst.py has no CfgOp enum
  • token-migration.AC1.3 Success: No module in the codebase imports any deleted type

token-migration.AC2: IRAMWriteToken works

  • token-migration.AC2.1 Success: IRAMWriteToken routes to target PE via network (isinstance CMToken)
  • token-migration.AC2.2 Success: PE receives IRAMWriteToken and writes instructions to IRAM at the specified offset
  • token-migration.AC2.3 Success: PE executes instructions loaded via IRAMWriteToken correctly
  • token-migration.AC2.4 Failure: IRAMWriteToken with invalid target PE raises or is dropped

token-migration.AC3: MemOp enum updated

  • token-migration.AC3.1 Success: MemOp contains EXEC, EXT, SET_PAGE, WRITE_IMM, RAW_READ, CLEAR with values that follow the documented tier grouping (the grouping is not enforced by the emulator)
  • token-migration.AC3.2 Success: ALLOC, FREE remain in tier 1 (3-bit); CLEAR in tier 2 (5-bit)
  • token-migration.AC3.3 Success: Assembler mnemonic mapping includes all new opcodes

token-migration.AC4: SM T0/T1 tier split

  • token-migration.AC4.1 Success: SM operations on addresses below tier_boundary use I-structure semantics (presence tracking, deferred reads)
  • token-migration.AC4.2 Success: SM WRITE to T0 address stores data without presence checking
  • token-migration.AC4.3 Success: SM READ on T0 address returns immediately (no deferral)
  • token-migration.AC4.4 Success: T0 storage is shared — all SMs reference the same T0 store
  • token-migration.AC4.5 Failure: I-structure ops (CLEAR, ALLOC, FREE, atomics) on T0 address produce error
  • token-migration.AC4.6 Edge: Tier boundary is configurable via SMConfig; default is 256

token-migration.AC5: EXEC opcode

  • token-migration.AC5.1 Success: EXEC reads Token objects from T0 starting at given address and injects them into the network
  • token-migration.AC5.2 Success: Injected tokens are processed normally by target PEs/SMs
  • token-migration.AC5.3 Success: EXEC can load a program (IRAM writes + seed tokens) from T0 that executes correctly
  • token-migration.AC5.4 Edge: EXEC on empty T0 region is a no-op

token-migration.AC6: Presence metadata widened

  • token-migration.AC6.1 Success: SMCell has is_wide field (default False)
  • token-migration.AC6.2 Success: Existing I-structure behaviour unchanged (is_wide=False path)

token-migration.AC7: Assembler updated

  • token-migration.AC7.1 Success: Token stream mode emits IRAMWriteToken (not LoadInstToken)
  • token-migration.AC7.2 Success: Token stream mode does not emit RouteSetToken
  • token-migration.AC7.3 Success: Direct mode (PEConfig/SMConfig) still works
  • token-migration.AC7.4 Success: Assembler round-trip (serialize -> parse -> assemble) works with updated types

token-migration.AC8: All tests pass

  • token-migration.AC8.1 Success: python -m pytest tests/ -v exits with zero failures

Glossary

  • Token: The fundamental unit of data and control in OR1. Carries a value and a destination address; computation fires when all required tokens for an instruction arrive.
  • CM (Computation Module): A processing element in the dataflow array. Receives tokens, matches operand pairs, executes IRAM instructions, and emits result tokens.
  • SM (Structure Memory): A memory controller with I-structure semantics. Manages presence state per cell, handles deferred reads, and coordinates producer-consumer synchronisation without locks.
  • DyadToken / MonadToken: The two standard CM token subtypes. Dyadic instructions require two operands (matched before firing); monadic require one.
  • IRAMWriteToken: New CM token subtype introduced by this migration. Carries a block of instructions and a target IRAM address. Replaces LoadInstToken.
  • IRAM: Instruction RAM. Each PE has a small local memory storing the instructions it executes when a matching pair fires. Loaded at startup or patched at runtime via IRAMWriteToken.
  • SysToken / CfgToken / LoadInstToken / RouteSetToken / IOToken: The old token types being deleted. Together they formed the "system" category (type-11 in the old 2-bit encoding).
  • I-structure: A single-assignment memory cell with four states: EMPTY, RESERVED, FULL, WAITING. A READ on an empty cell defers until a WRITE arrives.
  • T0 (tier 0): Raw shared storage in the SM address space. No presence bits, no deferred reads. Shared across all SM instances. Used for program storage and the EXEC bootstrap region.
  • T1 (tier 1): The per-SM I-structure region of SM address space, below the tier boundary. Full presence tracking and deferred-read semantics.
  • EXEC opcode: A new SM operation that reads Token objects from a T0 region and injects them into the network. Used for system bootstrap and runtime bulk token emission.
  • Tier boundary: A configurable address (default: 256) that divides T0 from T1 within an SM's address space.
  • MemOp: The enum in cm_inst.py listing all SM opcodes. This migration adds EXEC, SET_PAGE, WRITE_IMM, RAW_READ, and EXT.
  • CfgOp: The enum being deleted. Listed configuration opcodes (LOAD_INST, ROUTE_SET) that backed the old CfgToken mechanism.
  • Matching store: The per-PE 2D array that holds one half of a dyadic token pair while waiting for the other. Indexed by context slot and IRAM offset.
  • SimPy: A Python discrete-event simulation library. OR1's emulator is implemented as SimPy processes communicating via SimPy Stores.
  • Frozen dataclass: A Python dataclass with frozen=True, making instances immutable and hashable. All token types use this pattern.
  • Lark: A Python parsing toolkit used to implement the dfasm grammar.
  • dfasm: The assembly language for OR1 dataflow graphs. Translated by the assembler into emulator configuration and token streams.
  • dfgraph: The interactive graph renderer for dfasm files. FastAPI server with TypeScript/Cytoscape.js frontend.

Architecture

Bottom-up migration: change type definitions first (tokens.py, cm_inst.py), then fix all consumers (emulator, assembler, graph renderer, tests).

Token Hierarchy (new)

Token(target: int)
├── CMToken(Token) — offset, ctx, data
│   ├── DyadToken(CMToken) — port, gen, wide
│   ├── MonadToken(CMToken) — inline
│   └── IRAMWriteToken(CMToken) — instructions: tuple[ALUInst | SMInst, ...]
└── SMToken(Token) — addr, op, flags, data, ret

IRAMWriteToken inherits CMToken so network routing (isinstance(token, CMToken)) sends it to PEs without special cases. The offset field serves as iram_addr; ctx and data are set to 0 (unused but satisfying the frozen dataclass contract).

SysToken, CfgToken, LoadInstToken, RouteSetToken, IOToken are deleted. DyadToken retains wide: bool — the prefix distinction (dyadic wide vs narrow) is an encoding detail, not a semantic one at the emulator level.
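The CM side of the hierarchy above can be sketched with frozen dataclasses. This is a minimal illustration, not the project's tokens.py: the field types are assumptions, instructions are modelled as plain strings instead of ALUInst/SMInst, and DyadToken, MonadToken, and SMToken are left out for brevity.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Token:
    target: int


@dataclass(frozen=True)
class CMToken(Token):
    offset: int
    ctx: int
    data: int


@dataclass(frozen=True)
class IRAMWriteToken(CMToken):
    # Assumption: strings stand in for tuple[ALUInst | SMInst, ...].
    instructions: tuple[str, ...] = ()


# offset doubles as the IRAM address; ctx and data are unused and set to 0.
tok = IRAMWriteToken(target=2, offset=16, ctx=0, data=0,
                     instructions=("add", "sub"))
```

Because the token is an ordinary CMToken subclass, any isinstance(token, CMToken) check accepts it without further changes.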

MemOp Enum (new)

Flat sequential enum. The tier 1 / tier 2 grouping documents the intended hardware encoding but is not enforced in the emulator:

Tier 1 (3-bit opcode, 10-bit addr — full 1024-cell range):
  READ=0, WRITE=1, EXEC=2, ALLOC=3, FREE=4, EXT=5

Tier 2 (5-bit opcode, 8-bit payload — 256-cell range):
  CLEAR=6, RD_INC=7, RD_DEC=8, CMP_SW=9, RAW_READ=10,
  SET_PAGE=11, WRITE_IMM=12

EXEC is in the 3-bit tier because it needs full address range to reach T0 at high addresses. CLEAR is in the 5-bit tier because it is purely an I-structure operation (presence state reset) that only applies to T1 cells in lower address space.

CfgOp enum is deleted entirely. IRAM writes are identified by isinstance(token, IRAMWriteToken).
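As a rough sketch, the flat enum and its documentation-only tier grouping could look like the following. The numeric values come from the listing above; the choice of IntEnum and the names TIER1_OPS and encoding_tier are assumptions made for illustration.

```python
from enum import IntEnum


class MemOp(IntEnum):
    # Tier 1: 3-bit opcode, 10-bit address.
    READ = 0
    WRITE = 1
    EXEC = 2
    ALLOC = 3
    FREE = 4
    EXT = 5
    # Tier 2: 5-bit opcode, 8-bit payload.
    CLEAR = 6
    RD_INC = 7
    RD_DEC = 8
    CMP_SW = 9
    RAW_READ = 10
    SET_PAGE = 11
    WRITE_IMM = 12


# Hypothetical helper: records the grouping without enforcing it anywhere.
TIER1_OPS = frozenset(op for op in MemOp if op <= MemOp.EXT)


def encoding_tier(op: MemOp) -> int:
    """Return the intended hardware encoding tier (1 or 2) for an opcode."""
    return 1 if op in TIER1_OPS else 2
```

Keeping the grouping in a helper like this leaves the enum itself flat, which matches the stated intent that the emulator does not enforce encoding tiers.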

SM Memory Tier Model

SM address space splits into two tiers at a configurable boundary (default: 256):

  • T1 (below boundary): Per-SM I-structure cells with presence tracking, deferred reads, atomic ops. SM_id is conceptually part of the address — each SM owns its own T1 space independently. Current list[SMCell] model, unchanged.

  • T0 (at/above boundary): Shared raw storage across all SMs. No presence tracking, no deferred reads. Modelled as a single list[Token] referenced by all SM instances. T0 is the same physical address space regardless of which SM receives the request.

T0 is initially a list of Token objects (not bytes/words) to avoid encoding/decoding in the emulator. EXEC iterates T0 and injects tokens into the network. Future work will swap T0 to list[int] (16-bit words) when modelling programs that access T0 as normal data.

T0 operations: WRITE stores into T0 (no presence check). EXEC reads from T0 and injects tokens. All I-structure-specific operations (CLEAR, ALLOC, FREE, atomics) on T0 addresses are errors. READ on T0 returns immediately (no deferral).

EXEC flow: SM receives SMToken(op=EXEC, addr=start), iterates t0_store[start:], calls system.send(token) for each entry. SM holds a reference to the System object (set by build_topology()).
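The T0 behaviour described above can be sketched without SimPy. SketchSM and its methods are hypothetical names, the sent list stands in for system.send(), and strings stand in for Token objects. The design writes t0_store[start:] without saying whether start is translated relative to the boundary, so the boundary-relative index used here is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SketchSM:
    """Hypothetical stand-in for the SM in emu/sm.py (T0 paths only)."""
    t0_store: list                  # shared: every SM holds the same list
    tier_boundary: int = 256        # default from the design
    sent: list = field(default_factory=list)  # stand-in for system.send()

    def _t0_index(self, addr: int) -> int:
        # Assumption: T0 is indexed relative to the tier boundary.
        if addr < self.tier_boundary:
            raise ValueError(f"address {addr} is in T1, not T0")
        return addr - self.tier_boundary

    def write(self, addr: int, value) -> None:
        index = self._t0_index(addr)
        # Grow the raw store on demand; no presence check is made.
        while len(self.t0_store) <= index:
            self.t0_store.append(None)
        self.t0_store[index] = value

    def read(self, addr: int):
        # Returns immediately; T0 has no deferred reads.
        return self.t0_store[self._t0_index(addr)]

    def exec_from(self, start: int) -> None:
        # Inject every stored entry from start onwards.
        for token in self.t0_store[self._t0_index(start):]:
            if token is not None:
                self.sent.append(token)


shared_t0: list = []
sm0 = SketchSM(t0_store=shared_t0)
sm1 = SketchSM(t0_store=shared_t0)
sm0.write(256, "iram-write")
sm0.write(257, "seed")
sm1.exec_from(256)  # sm1 sees what sm0 wrote, because T0 is shared
```

Passing the same list object to every SM is what makes T0 shared; an EXEC over an empty store iterates nothing and so behaves as a no-op.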

Presence Metadata

SMCell gains is_wide: bool = False for the widened 4-bit presence metadata (presence:2 + is_wide:1 + spare:1). The spare bit is unused. Presence enum values are unchanged (EMPTY/RESERVED/FULL/WAITING).
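To make the 4-bit layout concrete, here is one possible packing. The design gives the field widths but not the bit positions or the numeric Presence values, so both are assumptions here, as are the helper names pack_meta and unpack_meta.

```python
from dataclasses import dataclass
from enum import IntEnum


class Presence(IntEnum):
    # Assumed numbering; the design only lists the four state names.
    EMPTY = 0
    RESERVED = 1
    FULL = 2
    WAITING = 3


@dataclass
class SMCell:
    presence: Presence = Presence.EMPTY
    is_wide: bool = False


def pack_meta(cell: SMCell) -> int:
    """Pack as [presence:2][is_wide:1][spare:1]; the spare bit stays 0."""
    return (int(cell.presence) << 2) | (int(cell.is_wide) << 1)


def unpack_meta(bits: int) -> SMCell:
    """Inverse of pack_meta; the spare bit is ignored."""
    return SMCell(Presence((bits >> 2) & 0b11), bool((bits >> 1) & 1))
```

With is_wide left at its default of False, the packed value differs from a 2-bit presence field only by a shift, which is consistent with the existing I-structure behaviour staying unchanged.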

Network Routing

System._target_store() simplifies to:

SMToken  → sms[token.target].input_store
CMToken  → pes[token.target].input_store

No special cases for CfgToken, IOToken, or any other subtype. IRAMWriteToken routes to PEs automatically as a CMToken subclass.
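A minimal sketch of the two-branch routing follows. The token classes are stripped to the target field, plain lists stand in for SimPy stores, and target_store is a free function standing in for System._target_store(); raising on an out-of-range target is one of the two behaviours AC2.4 permits, chosen here as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    target: int


@dataclass(frozen=True)
class SMToken(Token):
    pass  # addr, op, flags, data, ret omitted in this sketch


@dataclass(frozen=True)
class CMToken(Token):
    pass  # offset, ctx, data omitted in this sketch


@dataclass(frozen=True)
class IRAMWriteToken(CMToken):
    pass  # instructions omitted in this sketch


def target_store(token: Token, pe_stores: list, sm_stores: list) -> list:
    """Pick the input store for a token using only the SM/CM distinction."""
    if isinstance(token, SMToken):
        stores = sm_stores
    elif isinstance(token, CMToken):
        stores = pe_stores
    else:
        raise TypeError(f"unroutable token type: {type(token).__name__}")
    # Explicit range check: a negative index would otherwise wrap around.
    if not 0 <= token.target < len(stores):
        raise ValueError(f"no such target: {token.target}")
    return stores[token.target]


pe_stores = [[], []]
sm_stores = [[]]
target_store(IRAMWriteToken(target=1), pe_stores, sm_stores).append("iram write")
```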

PE Token Handling

The PE's _run() loop replaces:

isinstance(token, CfgToken) → _handle_cfg()

with:

isinstance(token, IRAMWriteToken) → _handle_iram_write()

_handle_iram_write writes token.instructions into IRAM at token.offset. RouteSetToken handling is deleted (route restriction dropped; full mesh is the v0 default). Route restriction will return later via a different mechanism when multi-cluster topologies are needed.
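The write itself is small. The sketch below assumes IRAM is a mapping from address to instruction and uses strings as instructions; both are illustrative choices, and handle_iram_write is a free-function stand-in for the PE method.

```python
def handle_iram_write(iram: dict, offset: int, instructions: tuple) -> None:
    """Write instructions into consecutive IRAM slots starting at offset."""
    for index, inst in enumerate(instructions):
        iram[offset + index] = inst


iram = {0: "old0", 4: "old4"}
handle_iram_write(iram, offset=4, instructions=("add", "sub"))
```

Slots outside the written range are left untouched, which is what allows IRAMWriteToken to patch a running PE as well as load it at startup.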

Assembler Token Emission

asm/codegen.py token stream mode currently emits: SM init → ROUTE_SET → LOAD_INST → seeds

New ordering: SM init → IRAM writes → seeds

RouteSetToken emission is removed. LoadInstToken emission becomes IRAMWriteToken emission with target=pe_id, offset=iram_addr, ctx=0, data=0, instructions=(...).

All CfgOp references removed from asm/opcodes.py, asm/lower.py, and asm/codegen.py. New MemOp members added to mnemonic mapping in opcodes.py.
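The new ordering can be illustrated with a small helper. IRAMWrite is a hypothetical NamedTuple stand-in for IRAMWriteToken, emit_token_stream is not a function from asm/codegen.py, and the shape of pe_programs (PE id mapped to a base address and instruction list) is an assumption.

```python
from typing import NamedTuple


class IRAMWrite(NamedTuple):
    """Hypothetical stand-in for IRAMWriteToken with the fields named above."""
    target: int
    offset: int
    ctx: int
    data: int
    instructions: tuple


def emit_token_stream(sm_init: list, pe_programs: dict, seeds: list) -> list:
    """Order the stream as SM init, then IRAM writes, then seeds."""
    iram_writes = [
        IRAMWrite(target=pe_id, offset=iram_addr, ctx=0, data=0,
                  instructions=tuple(insts))
        for pe_id, (iram_addr, insts) in sorted(pe_programs.items())
    ]
    return [*sm_init, *iram_writes, *seeds]


stream = emit_token_stream(
    sm_init=["sm-init"],
    pe_programs={1: (0, ["add"]), 0: (8, ["mul", "sub"])},
    seeds=["seed"],
)
```

Emitting IRAM writes before seeds means every PE has its instructions in place before the first operand token can fire.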

dfgraph Category Handling

dfgraph/categories.py removes isinstance(op, CfgOp) branch from categorise(). New MemOp values (EXEC, SET_PAGE, etc.) either fall into the existing SM category or a new "control" subcategory.

Existing Patterns

Investigation found the following patterns in the codebase:

  • Token hierarchy follows frozen dataclass inheritance (tokens.py). New IRAMWriteToken follows the same pattern.
  • SM operations use match token.op: dispatch in _run() loop (emu/sm.py). New opcodes follow this pattern.
  • Network routing uses isinstance dispatch (emu/network.py). Simplified routing follows the same pattern with fewer branches.
  • PE config handling uses isinstance dispatch for token subtypes (emu/pe.py). IRAMWriteToken replaces CfgToken in this dispatch.
  • Assembler codegen constructs token objects directly (asm/codegen.py). IRAMWriteToken replaces LoadInstToken/RouteSetToken construction.
  • Config types use frozen dataclasses (emu/types.py). SMConfig gains tier_boundary field following the same pattern.

No new patterns are introduced. All changes follow existing conventions.

Implementation Phases

Phase 1: Type Definitions

Goal: Update token hierarchy and instruction set enums to the new format.

Components:

  • tokens.py — delete SysToken/CfgToken/IOToken/LoadInstToken/RouteSetToken; add IRAMWriteToken
  • cm_inst.py — delete CfgOp enum; update MemOp with new opcodes and sequential values
  • sm_mod.py — add is_wide: bool field to SMCell

Dependencies: None (first phase).

Done when: Type definitions compile. Importing tokens and cm_inst does not error. All downstream breakage is expected and addressed in subsequent phases.

Phase 2: Emulator Core

Goal: Update emulator to use new token types and add SM memory tier support.

Components:

  • emu/types.py — add tier_boundary: int = 256 to SMConfig
  • emu/sm.py — T0/T1 dispatch based on address vs tier_boundary; T0 operations (WRITE, READ, EXEC); accept t0_store and system references; add EXEC implementation; unimplemented new opcodes raise error
  • emu/pe.py — replace CfgToken/LoadInstToken/RouteSetToken handling with IRAMWriteToken; remove _handle_cfg; add _handle_iram_write
  • emu/network.py — simplify _target_store to SMToken/CMToken isinstance; create shared T0 store in build_topology; pass T0 store and System reference to all SMs
  • emu/__init__.py — update exports if needed

Dependencies: Phase 1 (type definitions).

Done when: Emulator processes IRAMWriteToken correctly. SM handles T0/T1 split. EXEC reads from T0 and injects tokens. Existing emulator behaviour for T1 operations is preserved. Tests covering these behaviours pass.

Phase 3: Assembler and Tools

Goal: Update assembler and dfgraph to use new types.

Components:

  • asm/codegen.py — emit IRAMWriteToken instead of LoadInstToken/RouteSetToken; remove ROUTE_SET from token stream ordering; remove CfgOp import
  • asm/opcodes.py — remove CfgOp from MNEMONIC_TO_OP and type-aware containers; add new MemOp mnemonics (exec, set_page, write_imm, raw_read, ext); update clear to its new enum value
  • asm/lower.py — remove CfgOp from opcode() return type
  • dfgraph/categories.py — remove CfgOp isinstance branch; handle new MemOp values

Dependencies: Phase 1 (type definitions).

Done when: Assembler produces IRAMWriteToken in token stream mode. All assembler pipeline stages work with updated types. dfgraph categorises new opcodes. Tests pass.

Phase 4: Test Updates and New Coverage

Goal: Update all broken tests and add coverage for new functionality.

Components:

  • tests/test_pe.py — delete RouteSetToken tests; update LoadInstToken tests to IRAMWriteToken
  • tests/test_codegen.py — update CfgToken/RouteSetToken/LoadInstToken assertions to IRAMWriteToken
  • tests/test_integration.py — update TestTask5CfgTokenLoadInst to use IRAMWriteToken
  • tests/test_e2e.py — update CfgToken isinstance check to IRAMWriteToken
  • tests/test_opcodes.py — remove CfgOp assertions; add new MemOp assertions
  • tests/test_dfgraph_categories.py — remove TestCategoriseCfgOp; add new MemOp tests
  • tests/conftest.py — update Hypothesis strategies if they generate old token types
  • New tests for: T0/T1 tier split, EXEC opcode, IRAMWriteToken routing, T0 boundary enforcement

Dependencies: Phases 2 and 3 (emulator and assembler updated).

Done when: python -m pytest tests/ -v passes with zero failures. New tests cover T0/T1 split, EXEC, and IRAMWriteToken routing.

Phase 5: Documentation

Goal: Update project documentation to reflect new token format.

Components:

  • CLAUDE.md — update Token Hierarchy, Architecture Contracts, and Module Dependency Graph sections
  • asm/CLAUDE.md — update Dependencies and Invariants sections (remove CfgOp/RouteSetToken/LoadInstToken references, add IRAMWriteToken)
  • dfgraph/CLAUDE.md — update category references if present

Dependencies: Phase 4 (all code changes complete and tested).

Done when: CLAUDE.md files accurately describe the current codebase state. No references to deleted types remain in documentation.

Additional Considerations

Route restriction (deferred): RouteSetToken and route restriction are removed in this migration. Route restriction will return in a future design when multi-cluster (4-CM cluster) topologies are needed, likely as a misc-bucket subtype (011+11) or a PE config register mechanism.

T0 evolution: T0 is initially modelled as list[Token] for simplicity. When programs need to access T0 as normal data (framebuffers, lookup tables), T0 will migrate to list[int] (16-bit words) with token serialisation/deserialisation. This is a separate future design.

Unimplemented opcodes: SET_PAGE, WRITE_IMM, RAW_READ, and EXT exist as MemOp enum members but raise NotImplementedError if they reach the SM's run loop. They are placeholders for future SM features and ensure the assembler can parse them.