Emulator Redesign Plan: Frame-Based PE Model

This document describes the changes required in the OR1 emulator code to match the frame-based PE redesign specified in design-notes/pe-design.md (sections 3-11) and design-notes/architecture-overview.md (token format section). It is scoped to the emulator (emu/, tokens.py, cm_inst.py) and immediate downstream consumers (monitor/). Assembler changes are out of scope.

1. Overview

The current emulator models a context/generation-based matching scheme with flat matching_store[ctx][offset] arrays, generation counters for ABA protection, and Addr-based output routing that constructs tokens at emit time. The target design replaces this with:

Frame-based matching: a tag store maps activation_id to frame_id; presence/port metadata is per-frame, per-matchable-offset. Operands, constants, and destinations all live in frame slots (flat SRAM array).
Reversed pipeline: IFETCH runs before MATCH, so the instruction word drives match behaviour.
Unified instruction format: 16-bit [type:1][opcode:5][mode:3][wide:1][fref:6] replaces the separate ALUInst/SMInst dataclasses. Mode encodes output routing (INHERIT/CHANGE_TAG/SINK) and frame access pattern.
Pre-formed destinations: output flit 1 is a 16-bit value read from a frame slot, not constructed from Addr fields at emit time.
New token types: FrameControlToken (ALLOC/FREE) and PELocalWriteToken (IRAM write + frame slot write) replace IRAMWriteToken.
ctx/gen replaced by act_id: 3-bit activation ID with ABA protection via tag store valid bits, not generation counters.

2. Token Hierarchy Changes (`tokens.py`)

Current Hierarchy

Token(target: int)
  CMToken(Token):          offset: int, ctx: int, data: int
    DyadToken(CMToken):    port: Port, gen: int, wide: bool
    MonadToken(CMToken):   inline: bool
    IRAMWriteToken(CMToken): instructions: tuple[ALUInst | SMInst, ...]
  SMToken(Token):          addr: int, op: MemOp, flags: Optional[int],
                           data: Optional[int], ret: Optional[CMToken]

Target Hierarchy

Token(target: int)
  CMToken(Token):              offset: int, act_id: int, data: int
    DyadToken(CMToken):        port: Port, wide: bool
    MonadToken(CMToken):       inline: bool
  FrameControlToken(Token):    pe: int, act_id: int, op: FrameOp, payload: int
  PELocalWriteToken(Token):    pe: int, act_id: int, region: int,
                               slot: int, data: int
  SMToken(Token):              addr: int, op: MemOp, flags: Optional[int],
                               data: Optional[int], ret: Optional[CMToken]

Specific Field Changes

Change	Detail
`CMToken.ctx` renamed to `CMToken.act_id`	3-bit activation ID (0-7)
`DyadToken.gen` removed	ABA protection via tag store valid bit, not generation counters
`IRAMWriteToken` removed	Replaced by `PELocalWriteToken` with `region=0`
`FrameControlToken` added	New token type for ALLOC (`op=0`) and FREE (`op=1`). `payload` carries return routing for ALLOC confirmation.
`PELocalWriteToken` added	Unified IRAM/frame write. `region=0`: IRAM write at `slot` address. `region=1`: frame write at `(act_id, slot)`.
`FrameOp` enum added	`ALLOC = 0`, `FREE = 1`

What Stays the Same

Token base class (unchanged)
SMToken (unchanged; ret field still holds a CMToken template)
Port enum (unchanged)
MonadToken.inline (unchanged)
DyadToken.wide (unchanged)

Notes

FrameControlToken and PELocalWriteToken do not inherit from CMToken because neither has an offset field (FrameControlToken also has no data field; it carries payload instead). They inherit directly from Token. The target field from Token serves as the PE destination, which makes it equivalent to the pe field in the bit layout. One of the two is redundant and should be reconciled: either use target as the PE ID or keep a separate pe field.
CMToken.act_id replaces CMToken.ctx everywhere. All downstream code referencing token.ctx must change to token.act_id.
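As a rough illustration of the target hierarchy, the sketch below declares the new token types as frozen dataclasses. It assumes the first reconciliation option (Token.target doubles as the PE ID, so there is no separate pe field) and gives payload a default of 0; both choices are assumptions made for the example, not decisions taken by this plan.

```python
from dataclasses import dataclass
from enum import IntEnum


class Port(IntEnum):
    L = 0
    R = 1


class FrameOp(IntEnum):
    ALLOC = 0
    FREE = 1


@dataclass(frozen=True)
class Token:
    target: int  # destination PE or SM id


@dataclass(frozen=True)
class CMToken(Token):
    offset: int
    act_id: int  # 3-bit activation ID (0-7), formerly ctx
    data: int


@dataclass(frozen=True)
class DyadToken(CMToken):
    port: Port
    wide: bool = False  # no gen field any more


@dataclass(frozen=True)
class FrameControlToken(Token):
    # Assumption: `target` is the PE id, so no separate `pe` field.
    act_id: int
    op: FrameOp
    payload: int = 0  # assumed default; carries return routing for ALLOC


@dataclass(frozen=True)
class PELocalWriteToken(Token):
    act_id: int
    region: int  # 0 = IRAM write, 1 = frame write
    slot: int
    data: int


alloc = FrameControlToken(target=1, act_id=3, op=FrameOp.ALLOC)
write = PELocalWriteToken(target=1, act_id=3, region=1, slot=5, data=0x1234)
```

Because the two new types derive from Token rather than CMToken, any isinstance(token, CMToken) dispatch will skip them unless it is extended explicitly.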

3. ISA Changes (`cm_inst.py`)

Current Instruction Types

ALUInst(op: ALUOp, dest_l: Optional[Addr], dest_r: Optional[Addr],
        const: Optional[int], ctx_mode: int = 0)

SMInst(op: MemOp, sm_id: int, const: Optional[int] = None,
       ret: Optional[Addr] = None, ret_dyadic: bool = False)

Addr(a: int, port: Port, pe: Optional[int])

Target Instruction Type

@dataclass(frozen=True)
class Instruction:
    type: int           # 0 = CM (ALU), 1 = SM
    opcode: ALUOp | MemOp
    mode: int           # 0-7, see mode table
    wide: bool          # 16-bit vs 32-bit frame values
    fref: int           # frame slot base index (0-63)

This is a unified dataclass matching the 16-bit hardware instruction word [type:1][opcode:5][mode:3][wide:1][fref:6].
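The PE-local write handler later in this plan calls a decode_instruction helper that is not defined here. A minimal sketch of such a helper, together with its inverse, could look like the following. It assumes the fields are packed MSB-first in the order given (type in bit 15, fref in bits 5-0) and uses a raw integer opcode in place of ALUOp | MemOp; encode_instruction is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    type: int    # 0 = CM (ALU), 1 = SM
    opcode: int  # raw 5-bit opcode; the real field would be ALUOp | MemOp
    mode: int    # 0-7
    wide: bool
    fref: int    # 0-63


def encode_instruction(inst: Instruction) -> int:
    """Pack an Instruction into [type:1][opcode:5][mode:3][wide:1][fref:6]."""
    return (
        ((inst.type & 0x1) << 15)
        | ((inst.opcode & 0x1F) << 10)
        | ((inst.mode & 0x7) << 7)
        | (int(inst.wide) << 6)
        | (inst.fref & 0x3F)
    )


def decode_instruction(word: int) -> Instruction:
    """Unpack a 16-bit word, assuming MSB-first field order."""
    return Instruction(
        type=(word >> 15) & 0x1,
        opcode=(word >> 10) & 0x1F,
        mode=(word >> 7) & 0x7,
        wide=bool((word >> 6) & 0x1),
        fref=word & 0x3F,
    )


inst = Instruction(type=1, opcode=0b10101, mode=5, wide=True, fref=42)
word = encode_instruction(inst)
```

If the hardware packs the fields in a different bit order, only the shift amounts change; the round-trip property is what the emulator tests would rely on.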

Mode Enum

class Mode(IntEnum):
    INHERIT_DEST       = 0  # frame[fref] = dest, no const
    INHERIT_CONST_DEST = 1  # frame[fref] = const, frame[fref+1] = dest
    INHERIT_FANOUT     = 2  # frame[fref] = dest1, frame[fref+1] = dest2
    INHERIT_CONST_FAN  = 3  # frame[fref] = const, frame[fref+1..+2] = dest1, dest2
    CHANGE_TAG         = 4  # flit 1 from left operand, no const
    CHANGE_TAG_CONST   = 5  # flit 1 from left operand, frame[fref] = const
    SINK               = 6  # write result -> frame[fref], no output
    SINK_CONST_RMW     = 7  # read frame[fref] as const, write result back

Decode Equations (for PE emulator)

output_enable = mode < 4        # modes 0-3 (INHERIT: destination read from frame)
change_tag    = mode in (4, 5)  # modes 4-5
sink          = mode >= 6       # modes 6-7
has_const     = bool(mode & 1)  # modes 1, 3, 5, 7
has_fanout    = mode in (2, 3)  # modes 2-3
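The decode equations can be collected into one helper so that the PE pipeline reads named flags rather than repeating comparisons. The sketch below restates the Mode enum and wraps the equations in a hypothetical decode_mode function returning a hypothetical ModeFlags tuple; the loop checks that every mode falls into exactly one of the three routing classes.

```python
from enum import IntEnum
from typing import NamedTuple


class Mode(IntEnum):
    INHERIT_DEST = 0
    INHERIT_CONST_DEST = 1
    INHERIT_FANOUT = 2
    INHERIT_CONST_FAN = 3
    CHANGE_TAG = 4
    CHANGE_TAG_CONST = 5
    SINK = 6
    SINK_CONST_RMW = 7


class ModeFlags(NamedTuple):
    output_enable: bool
    change_tag: bool
    sink: bool
    has_const: bool
    has_fanout: bool


def decode_mode(mode: int) -> ModeFlags:
    """Apply the decode equations to a 3-bit mode value."""
    return ModeFlags(
        output_enable=mode < 4,
        change_tag=mode in (4, 5),
        sink=mode >= 6,
        has_const=bool(mode & 1),
        has_fanout=mode in (2, 3),
    )


for m in Mode:
    flags = decode_mode(m)
    # Exactly one of INHERIT / CHANGE_TAG / SINK applies to each mode.
    assert flags.output_enable + flags.change_tag + flags.sink == 1
    print(f"{m.name:<18} {flags}")
```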

What Happens to Existing Types

Type	Disposition
`ALUOp` hierarchy (`ArithOp`, `LogicOp`, `RoutingOp`)	Stays. Opcode values used in `Instruction.opcode` for CM type.
`MemOp`	Stays. Used in `Instruction.opcode` for SM type.
`ALUInst`	Removed. Replaced by `Instruction`.
`SMInst`	Removed. Replaced by `Instruction` with `type=1`. SM-specific parameters (target SM, address, return routing) move to frame slots.
`Addr`	Stays as assembler concept in `asm/`. Not used in `Instruction` or at emulator level. Destinations are pre-formed flit 1 values in frame slots.
`ctx_mode` field on `ALUInst`	Removed. Output context is determined by mode (INHERIT uses frame dest, CHANGE_TAG uses left operand).
`RoutingOp.FREE_CTX`	Renamed to `RoutingOp.FREE_FRAME`. Triggers frame deallocation.
`is_monadic_alu()`	Stays. Still needed for the PE to determine whether a DyadToken at a monadic instruction bypasses matching.

New Addition: EXTRACT_TAG

Add a new RoutingOp.EXTRACT_TAG value that captures the runtime act_id and offset as a data value (a packed flit 1). The operation yields a packed 16-bit value encoding [prefix][port][PE_id][offset][act_id]. Whether the ALU or the PE computes this value is discussed in section 6.

4. PE Emulator Changes (`emu/pe.py`)

This is the largest change. The ProcessingElement class is substantially rewritten.

Constructor Changes

Remove:

iram: dict[int, ALUInst | SMInst] parameter type
ctx_slots: int parameter
offsets: int parameter
matching_store: list[list[MatchEntry]] attribute
gen_counters: list[int] attribute
_ctx_slots, _offsets internal attributes

Add:

iram: dict[int, Instruction] parameter type
frame_count: int = 4 parameter (number of physical frames)
frame_slots: int = 64 parameter (slots per frame)
matchable_offsets: int = 8 parameter (dyadic-capable offsets per frame)
frames: list[list[int]] attribute -- [frame_count][frame_slots] SRAM
tag_store: dict[int, int] attribute -- act_id -> frame_id mapping (models the 670 lookup; an absent key means the act_id is invalid)
presence: list[list[bool]] attribute -- [frame_count][matchable_offsets]
port_store: list[list[Port]] attribute -- [frame_count][matchable_offsets]
free_frames: list[int] attribute -- free frame ID pool
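A possible initialisation of the new attributes is sketched below as a standalone function returning a dict, so it can run without the rest of the PE. The zero-filled frames, the Port.L default in port_store, and the choice to start with every frame in the free pool are assumptions; make_frame_state is a hypothetical name.

```python
from enum import IntEnum


class Port(IntEnum):
    L = 0
    R = 1


def make_frame_state(frame_count: int = 4, frame_slots: int = 64,
                     matchable_offsets: int = 8) -> dict:
    """Build the per-PE frame state with independent rows per frame."""
    return {
        # List comprehensions give each frame its own row; `[[0] * n] * m`
        # would alias one row across all frames.
        "frames": [[0] * frame_slots for _ in range(frame_count)],
        "tag_store": {},  # act_id -> frame_id; absent key = invalid
        "presence": [[False] * matchable_offsets for _ in range(frame_count)],
        "port_store": [[Port.L] * matchable_offsets for _ in range(frame_count)],
        "free_frames": list(range(frame_count)),  # assumed: all frames free
    }


state = make_frame_state()
state["frames"][0][5] = 0xBEEF  # touches frame 0 only
```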

Pipeline Reorder: IFETCH Before MATCH

Current order (_process_token):

Classify token type
Match (dyadic) or pass through (monadic)
Fetch instruction from IRAM
Execute (ALU or SM)
Emit output

Target order (_process_token):

Classify token type; side-path frame control and PE-local writes
IFETCH: read instruction from IRAM at token.offset
Resolve act_id -> frame_id via tag store (parallel with IFETCH in HW; sequential in emulator but within same cycle)
MATCH/FRAME: use instruction to drive match behaviour
- Dyadic + presence set: read stored operand from frame, clear presence
- Dyadic + presence clear: write operand to frame, set presence, consume token
- Monadic: bypass matching; read constant from frame if has_const
EXECUTE: ALU or SM token construction
OUTPUT: read destination(s) from frame slots, emit tokens
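The dyadic half of the MATCH/FRAME stage can be sketched as a small function over the presence, port, and frame arrays. Two details are assumptions made for the example: the first-arriving operand is parked in the frame slot whose index equals the match index (match_idx is a hypothetical parameter; pe-design.md defines the real mapping from instruction to match slot), and the stored port decides which operand is returned as left.

```python
from typing import Optional

L, R = 0, 1  # mirror Port.L = 0 and Port.R = 1


def match_dyadic(frames, presence, port_store, frame_id: int,
                 match_idx: int, port: int, data: int) -> Optional[tuple[int, int]]:
    """Return (left, right) once both operands have arrived, else None."""
    if presence[frame_id][match_idx]:
        # Second arrival: read the parked operand and clear presence.
        stored = frames[frame_id][match_idx]
        stored_port = port_store[frame_id][match_idx]
        presence[frame_id][match_idx] = False
        return (stored, data) if stored_port == L else (data, stored)
    # First arrival: park the operand, remember its port, consume the token.
    frames[frame_id][match_idx] = data
    port_store[frame_id][match_idx] = port
    presence[frame_id][match_idx] = True
    return None


frames = [[0] * 64 for _ in range(4)]
presence = [[False] * 8 for _ in range(4)]
port_store = [[L] * 8 for _ in range(4)]

first = match_dyadic(frames, presence, port_store, 2, 3, R, 20)   # operand parked
second = match_dyadic(frames, presence, port_store, 2, 3, L, 10)  # pair completes
```

The sketch does not treat two arrivals on the same port as an error; the real PE would need a policy for that case.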

Side Paths (New)

Two new token types are handled before the main pipeline:

FrameControlToken handling (replaces nothing; new capability):

def _handle_frame_control(self, token: FrameControlToken) -> None:
    if token.op == FrameOp.ALLOC:
        frame_id = self._alloc_frame()
        self.tag_store[token.act_id] = frame_id
        # Clear presence bits for the new frame
        self.presence[frame_id] = [False] * self.matchable_offsets
        # Optionally emit confirmation token
    elif token.op == FrameOp.FREE:
        frame_id = self.tag_store.pop(token.act_id, None)
        if frame_id is not None:
            self.free_frames.append(frame_id)
            self.presence[frame_id] = [False] * self.matchable_offsets

PELocalWriteToken handling (replaces _handle_iram_write):

def _handle_pe_local_write(self, token: PELocalWriteToken) -> None:
    if token.region == 0:  # IRAM write
        self.iram[token.slot] = decode_instruction(token.data)
    elif token.region == 1:  # Frame write
        frame_id = self.tag_store.get(token.act_id)
        if frame_id is not None:
            self.frames[frame_id][token.slot] = token.data

Method-Level Changes

Current Method	Change
`_run()`	Minimal change: add dispatch for `FrameControlToken`, `PELocalWriteToken`
`_process_token()`	Rewrite: IFETCH before MATCH, mode-driven frame access
`_handle_iram_write()`	Remove. Replaced by `_handle_pe_local_write()`
`_match_monadic()`	Remove. Monadic path integrated into MATCH/FRAME stage
`_match_dyadic()`	Rewrite as `_match_frame()`: uses tag store, presence bits, frame SRAM
`_fetch()`	Stays (same: `self.iram.get(offset)`)
`_is_monadic_instruction()`	Stays (needed to decide whether DyadToken bypasses match)
`_do_emit()`	Rewrite: mode-driven output routing from frame slots
`_build_and_emit_sm()`	Rewrite: SM parameters from frame slots, not from `SMInst` fields
`_deliver()`	Stays (unchanged)
`_output_mode()`	Rewrite: derive from `Instruction.mode` instead of inspecting `ALUInst` fields
`_make_output_token()`	Rewrite: read pre-formed flit 1 from frame slot, not construct from `Addr`
New: `_alloc_frame()`	Allocate next free frame_id from `free_frames` pool
New: `_free_frame()`	Release frame_id back to pool, clear presence
New: `_handle_frame_control()`	Process ALLOC/FREE tokens
New: `_handle_pe_local_write()`	Process IRAM/frame writes
New: `_read_frame_slot()`	Read `frames[frame_id][slot]`
New: `_write_frame_slot()`	Write `frames[frame_id][slot]`
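The new allocation helpers could behave roughly as in the following standalone sketch, where a hypothetical FramePool class stands in for the relevant parts of ProcessingElement. Raising on pool exhaustion or on a duplicate act_id is an assumption; the real PE might instead emit a TokenRejected event or stall. Reusing freed frames in FIFO order is likewise only one possible policy.

```python
class FramePool:
    """Hypothetical stand-in for the PE's tag store and free-frame pool."""

    def __init__(self, frame_count: int = 4, matchable_offsets: int = 8):
        self.matchable_offsets = matchable_offsets
        self.tag_store: dict[int, int] = {}
        self.free_frames = list(range(frame_count))
        self.presence = [[False] * matchable_offsets for _ in range(frame_count)]

    def alloc(self, act_id: int) -> int:
        if act_id in self.tag_store:
            raise ValueError(f"act_id {act_id} already has a frame")
        if not self.free_frames:
            raise RuntimeError("no free frames")
        frame_id = self.free_frames.pop(0)
        self.tag_store[act_id] = frame_id
        self.presence[frame_id] = [False] * self.matchable_offsets
        return frame_id

    def free(self, act_id: int) -> None:
        frame_id = self.tag_store.pop(act_id, None)
        if frame_id is None:
            return  # stale or unknown act_id: nothing to release
        self.presence[frame_id] = [False] * self.matchable_offsets
        self.free_frames.append(frame_id)


pool = FramePool(frame_count=2)
a = pool.alloc(act_id=5)
b = pool.alloc(act_id=6)
pool.free(5)
c = pool.alloc(act_id=7)  # reuses the frame released by act_id 5
```

Removing the tag store entry on free is what provides the ABA protection described in the overview: a late token for act_id 5 no longer resolves to a frame.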

Output Routing: Mode-Driven

Current: _output_mode() inspects inst.op (FREE_CTX, GATE, SW*) and inst.dest_l/inst.dest_r to determine SUPPRESS/SINGLE/DUAL/SWITCH. _make_output_token() constructs a DyadToken from Addr fields.

Target: the Instruction.mode field determines output behaviour:

def _do_output(self, inst: Instruction, result: int, bool_out: bool,
               frame_id: int, left: int) -> None:
    if inst.mode >= 6:  # SINK
        # Write result back to frame[fref]
        self.frames[frame_id][inst.fref] = result & 0xFFFF
        return

    if inst.opcode == RoutingOp.FREE_FRAME:
        self._free_frame(frame_id)  # release frame to pool, clear presence
        return

    if inst.opcode == RoutingOp.GATE and not bool_out:
        return  # suppressed

    if inst.mode in (4, 5):  # CHANGE_TAG
        # flit 1 from left operand (pre-formed destination)
        flit1 = left
    else:  # INHERIT (modes 0-3)
        # Read destination from frame
        if inst.mode & 1:  # has_const: dest is at fref+1
            flit1 = self.frames[frame_id][inst.fref + 1]
        else:
            flit1 = self.frames[frame_id][inst.fref]

    # Decode flit1 to extract target PE, offset, act_id, port
    out_token = self._flit1_to_token(flit1, result)
    self._emit(out_token)

    if inst.mode in (2, 3):  # has_fanout: second destination
        if inst.mode == 3:  # const+fan: dest2 at fref+2
            flit1_2 = self.frames[frame_id][inst.fref + 2]
        else:  # fan: dest2 at fref+1
            flit1_2 = self.frames[frame_id][inst.fref + 1]
        out_token_2 = self._flit1_to_token(flit1_2, result)
        self._emit(out_token_2)

_flit1_to_token(flit1: int, data: int) -> CMToken: decodes a 16-bit pre-formed flit 1 value into a DyadToken or MonadToken based on its prefix bits. This replaces _make_output_token() which constructed tokens from Addr fields.

def _flit1_to_token(self, flit1: int, data: int) -> CMToken:
    """Decode a pre-formed flit 1 value into a CMToken."""
    bit15 = (flit1 >> 15) & 1
    if bit15:
        # SM token -- should not appear as a CM destination
        raise ValueError("SM destination in CM output path")

    bit14 = (flit1 >> 14) & 1
    if bit14 == 0:
        # Dyadic wide: [0][0][port:1][PE:2][offset:8][act_id:3]
        port = Port((flit1 >> 13) & 1)
        pe = (flit1 >> 11) & 0x3
        offset = (flit1 >> 3) & 0xFF
        act_id = flit1 & 0x7
        return DyadToken(target=pe, offset=offset, act_id=act_id,
                         data=data, port=port, wide=False)

    bit13 = (flit1 >> 13) & 1
    if bit13 == 0:
        # Monadic normal: [0][1][0][PE:2][offset:8][act_id:3]
        pe = (flit1 >> 11) & 0x3
        offset = (flit1 >> 3) & 0xFF
        act_id = flit1 & 0x7
        return MonadToken(target=pe, offset=offset, act_id=act_id,
                          data=data, inline=False)

    # Misc bucket: [0][1][1][PE:2][sub:2][...]
    sub = (flit1 >> 9) & 0x3
    if sub == 2:
        # Monadic inline: [0][1][1][PE:2][10][offset:7][spare:2]
        pe = (flit1 >> 11) & 0x3
        offset = (flit1 >> 2) & 0x7F
        return MonadToken(target=pe, offset=offset, act_id=0,
                          data=0, inline=True)

    raise ValueError(f"Unexpected flit1 misc sub={sub} in output path")
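Since destinations are now pre-formed, tests and initial_frames fixtures need a way to build flit 1 values. The helpers below are a sketch of the inverse of the decoder for the two full-width layouts, using the same bit positions as the shifts in _flit1_to_token. The function names are hypothetical.

```python
def pack_dyadic_flit1(port: int, pe: int, offset: int, act_id: int) -> int:
    """Layout: [0][0][port:1][PE:2][offset:8][act_id:3]."""
    return (((port & 0x1) << 13) | ((pe & 0x3) << 11)
            | ((offset & 0xFF) << 3) | (act_id & 0x7))


def pack_monadic_flit1(pe: int, offset: int, act_id: int) -> int:
    """Layout: [0][1][0][PE:2][offset:8][act_id:3]."""
    return ((1 << 14) | ((pe & 0x3) << 11)
            | ((offset & 0xFF) << 3) | (act_id & 0x7))


def unpack_common(flit1: int) -> tuple[int, int, int]:
    """Return (pe, offset, act_id) with the same shifts as _flit1_to_token."""
    return (flit1 >> 11) & 0x3, (flit1 >> 3) & 0xFF, flit1 & 0x7


dest = pack_dyadic_flit1(port=1, pe=2, offset=0x2A, act_id=5)
mono = pack_monadic_flit1(pe=3, offset=0xFF, act_id=7)
```

A round-trip property (pack, decode, compare fields) is a natural fit for the hypothesis strategies mentioned in section 12.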

SWITCH Output (Branch/Switch Routing Ops)

For SWITCH operations (SWEQ, SWGT, SWGE, SWOF), the taken path uses the same mode-driven frame read as any other output, and the not-taken path emits a monadic inline token. In the fan-out modes (2/3), the not-taken destination comes from the second destination frame slot, and bool_out determines which destination is taken and which is not-taken.

SM Operation Changes

SM operations (Instruction.type == 1) read SM parameters from frame slots instead of from SMInst fields:

frame[fref]: SM target slot (packed SM_id + address)
frame[fref+1]: return routing slot (pre-formed CM token flit 1), for ops that produce results

The PE constructs SMToken by unpacking frame slot contents:

def _build_sm_token(self, inst: Instruction, frame_id: int,
                    left: int, right: int | None) -> SMToken:
    target_slot = self.frames[frame_id][inst.fref]
    sm_id = (target_slot >> 14) & 0x3
    addr = (target_slot >> 4) & 0x3FF  # tier 1: 10-bit addr

    ret = None
    if inst.mode & 1:  # has return routing in frame[fref+1]
        ret_flit1 = self.frames[frame_id][inst.fref + 1]
        ret = self._flit1_to_token(ret_flit1, data=0)

    return SMToken(target=sm_id, addr=addr, op=inst.opcode,
                   flags=..., data=..., ret=ret)
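The shifts in _build_sm_token imply that the SM target slot holds sm_id in bits 15-14 and a 10-bit address in bits 13-4. The sketch below packs and unpacks a slot under that inferred layout; the meaning of the low 4 bits is not stated in this plan, so they are left as zero, and both function names are hypothetical.

```python
def pack_sm_target(sm_id: int, addr: int) -> int:
    """Pack [sm_id:2][addr:10][unspecified:4]; low 4 bits left as zero."""
    return ((sm_id & 0x3) << 14) | ((addr & 0x3FF) << 4)


def unpack_sm_target(slot: int) -> tuple[int, int]:
    """Return (sm_id, addr) using the same shifts as _build_sm_token."""
    return (slot >> 14) & 0x3, (slot >> 4) & 0x3FF


slot = pack_sm_target(sm_id=2, addr=0x155)
```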

5. PE Config Changes (`emu/types.py`)

Remove

MatchEntry dataclass (entire class)
PEConfig.ctx_slots field
PEConfig.offsets field
PEConfig.gen_counters field

Change

PEConfig.iram type: dict[int, ALUInst | SMInst] becomes dict[int, Instruction]

Add

@dataclass(frozen=True)
class PEConfig:
    pe_id: int
    iram: dict[int, Instruction]
    frame_count: int = 4
    frame_slots: int = 64
    matchable_offsets: int = 8
    initial_frames: Optional[dict[int, dict[int, int]]] = None
    # act_id -> {slot_index -> value} for pre-loaded frame contents
    initial_tag_store: Optional[dict[int, int]] = None
    # act_id -> frame_id mappings pre-loaded at init
    allowed_pe_routes: Optional[set[int]] = None
    allowed_sm_routes: Optional[set[int]] = None
    on_event: EventCallback | None = None

initial_frames replaces the old pattern of encoding constants and destinations in ALUInst.const / ALUInst.dest_l / ALUInst.dest_r. Frame contents are loaded before simulation starts or via PELocalWriteToken during simulation.

initial_tag_store replaces gen_counters for pre-configuring which act_ids are valid at simulation start.

Keep

DeferredRead dataclass (used by SM, not PE)
SMConfig dataclass (unchanged)

6. ALU Changes (`emu/alu.py`)

Minimal changes. The ALU is a pure function that does not know about frames or tokens.

Specific Changes

Change	Detail
`RoutingOp.FREE_CTX`	Rename to `RoutingOp.FREE_FRAME` in the match case
Add `RoutingOp.EXTRACT_TAG` handling	Returns a packed flit 1 value. The ALU needs PE_id and act_id as additional inputs, or EXTRACT_TAG is handled in the PE before the ALU call.

Const Source

Current: const comes from ALUInst.const (instruction field). Target: const comes from a frame slot (frame[frame_id][inst.fref] when has_const). This is transparent to the ALU -- the PE reads the constant from the frame and passes it to execute() the same way.

The execute() signature stays the same:

def execute(op: ALUOp, left: int, right: int | None, const: int | None) -> tuple[int, bool]:

7. SM Changes (`emu/sm.py`)

Core I-structure semantics are unchanged. Minor adjustments only.

Return Token Construction

Current: _send_result() uses dataclasses.replace(return_route, data=data) to set the data field on the pre-formed return route CMToken.

Target: same pattern. The return route CMToken now has act_id instead of ctx, and DyadToken no longer has a gen field. Because the PE constructs the return route and embeds it in the SMToken.ret field, the SM still only calls replace(return_route, data=data). The rename (ctx -> act_id) is transparent to _send_result, which never names those fields.
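The reason the rename does not reach the SM can be seen in a few lines. This sketch uses a simplified, flattened CMToken (an assumption for brevity; the real class inherits target from Token) and shows that dataclasses.replace touches only the data field.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CMToken:
    target: int
    offset: int
    act_id: int  # formerly ctx; never named by the SM
    data: int


# Return route as the PE would embed it in SMToken.ret, with placeholder data.
return_route = CMToken(target=1, offset=12, act_id=3, data=0)

# What _send_result does: copy the route and fill in the result value.
result = replace(return_route, data=0x00FF)
```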

No Other Changes

Cell states (Presence), deferred reads, atomics, T0/T1 tiers: all unchanged.
StructureMemory constructor, _run() loop, all handlers: unchanged.

8. Network Changes (`emu/network.py`)

Token Routing

Current: _target_store() routes SMToken to SM, CMToken to PE. IRAMWriteToken routes to PE (inherits from CMToken).

Target: _target_store() must also handle:

FrameControlToken -> target PE (route on token.target or token.pe)
PELocalWriteToken -> target PE (route on token.target or token.pe)

Since these new types do not inherit from CMToken, add explicit isinstance checks:

def _target_store(self, token: Token) -> simpy.Store:
    if isinstance(token, SMToken):
        return self.sms[token.target].input_store
    if isinstance(token, (CMToken, FrameControlToken, PELocalWriteToken)):
        return self.pes[token.target].input_store
    raise TypeError(f"Unknown token type: {type(token).__name__}")

`build_topology` Changes

Pass frame_count, frame_slots, matchable_offsets from PEConfig to ProcessingElement constructor
Apply initial_frames and initial_tag_store from PEConfig after PE construction
Remove gen_counters application
Remove ctx_slots and offsets parameters from PE constructor call
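Applying initial_frames and initial_tag_store might look like the sketch below. Since initial_frames is keyed by act_id, the example assumes that every act_id it mentions also appears in initial_tag_store, that pre-mapped frames are removed from the free pool, and that values are masked to 16 bits. apply_initial_state is a hypothetical helper name.

```python
def apply_initial_state(frames, tag_store, free_frames,
                        initial_frames=None, initial_tag_store=None) -> None:
    """Pre-load tag store mappings, then frame contents, before simulation."""
    for act_id, frame_id in (initial_tag_store or {}).items():
        tag_store[act_id] = frame_id
        if frame_id in free_frames:
            free_frames.remove(frame_id)  # assumed: mapped frames are in use
    for act_id, slots in (initial_frames or {}).items():
        frame_id = tag_store[act_id]  # KeyError if the act_id was never mapped
        for slot, value in slots.items():
            frames[frame_id][slot] = value & 0xFFFF


frames = [[0] * 64 for _ in range(4)]
tag_store: dict[int, int] = {}
free_frames = [0, 1, 2, 3]

apply_initial_state(
    frames, tag_store, free_frames,
    initial_frames={3: {0: 0x0007, 1: 0x3155}},  # act_id 3: const, then dest
    initial_tag_store={3: 1},                    # act_id 3 -> frame 1
)
```

Applying the tag store first matters: frame contents cannot be placed until each act_id resolves to a frame_id.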

9. Event Changes (`emu/events.py`)

Modified Events

Matched: replace ctx: int with act_id: int, add frame_id: int:

@dataclass(frozen=True)
class Matched:
    time: float
    component: str
    left: int
    right: int
    act_id: int     # was: ctx
    frame_id: int   # new
    offset: int

New Events

@dataclass(frozen=True)
class FrameAllocated:
    time: float
    component: str
    act_id: int
    frame_id: int

@dataclass(frozen=True)
class FrameFreed:
    time: float
    component: str
    act_id: int
    frame_id: int

@dataclass(frozen=True)
class FrameSlotWritten:
    time: float
    component: str
    frame_id: int
    slot: int
    value: int

@dataclass(frozen=True)
class TokenRejected:
    time: float
    component: str
    token: Token
    reason: str  # e.g. "invalid_act_id", "invalid_iram_page"

Updated Union

SimEvent = (
    TokenReceived | Matched | Executed | Emitted | IRAMWritten
    | CellWritten | DeferredRead | DeferredSatisfied | ResultSent
    | FrameAllocated | FrameFreed | FrameSlotWritten | TokenRejected
)

`IRAMWritten` Stays

IRAMWritten remains for IRAM writes via PELocalWriteToken(region=0). The event semantics are the same; only the source token type changes.

10. Monitor/Dfgraph Downstream Impact

`monitor/snapshot.py`

PESnapshot changes:

@dataclass(frozen=True)
class PESnapshot:
    pe_id: int
    iram: dict[int, Instruction]          # was: dict[int, ALUInst | SMInst]
    frames: tuple[tuple[int, ...], ...]   # replaces matching_store
    tag_store: dict[int, int]             # replaces gen_counters (act_id -> frame_id)
    presence: tuple[tuple[bool, ...], ...]  # new: per-frame presence bits
    port_store: tuple[tuple[int, ...], ...]  # new: per-frame port metadata
    free_frames: tuple[int, ...]          # new: free frame pool
    input_queue: tuple[Token, ...]
    output_log: tuple[Token, ...]

Remove: matching_store: tuple[tuple[dict, ...], ...], gen_counters: tuple[int, ...].

capture() function: rewrite PE snapshot capture to read pe.frames, pe.tag_store, pe.presence, pe.port_store, pe.free_frames instead of pe.matching_store and pe.gen_counters.
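The PE keeps its frame state in mutable lists and a dict, while PESnapshot fields are tuples, so capture() has to copy. A sketch of that conversion, written as a hypothetical standalone helper that returns a dict rather than a PESnapshot:

```python
def freeze_frame_state(frames, tag_store, presence, port_store, free_frames) -> dict:
    """Copy mutable PE frame state into immutable snapshot-friendly values."""
    return {
        "frames": tuple(tuple(row) for row in frames),
        "tag_store": dict(tag_store),  # shallow copy; keys and values are ints
        "presence": tuple(tuple(row) for row in presence),
        "port_store": tuple(tuple(int(p) for p in row) for row in port_store),
        "free_frames": tuple(free_frames),
    }


frames = [[0] * 4 for _ in range(2)]
tag_store = {3: 1}
presence = [[False] * 2 for _ in range(2)]
port_store = [[0] * 2 for _ in range(2)]
free_frames = [0]

snap = freeze_frame_state(frames, tag_store, presence, port_store, free_frames)

# Later simulation activity must not leak into the captured snapshot.
frames[1][0] = 0x1234
tag_store[4] = 0
```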

SMSnapshot, SMCellSnapshot, StateSnapshot: unchanged.

`monitor/graph_json.py`

_serialise_pe_state(): replace matching_store and gen_counters serialisation with frame state serialisation:

def _serialise_pe_state(pe_id: int, snapshot: StateSnapshot) -> dict:
    pe_snap = snapshot.pes.get(pe_id)
    if not pe_snap:
        return {}
    return {
        "iram": ...,  # same as before
        "frames": [...],  # new: per-frame slot contents
        "tag_store": dict(pe_snap.tag_store),  # new
        "presence": [...],  # new
        "free_frames": list(pe_snap.free_frames),  # new
        "input_queue_depth": len(pe_snap.input_queue),
        "output_count": len(pe_snap.output_log),
    }

Remove: matching_store, gen_counters keys from serialised output.

_serialise_event(): update Matched event serialisation to use act_id and frame_id instead of ctx. Add serialisation for new event types (FrameAllocated, FrameFreed, FrameSlotWritten, TokenRejected).

_serialise_node(): remove ctx field from node JSON (or rename to act_id if nodes carry activation information).

graph_to_monitor_json(): update Matched event overlay handler to use event.act_id instead of event.ctx. Add overlay handling for new event types.
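For the new frame events, serialisation can be generic because they contain only scalar fields. The sketch below uses dataclasses.asdict and adds a "type" key with the class name; that key and the overall JSON shape are assumptions, since the existing _serialise_event output format is not shown in this plan.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FrameAllocated:
    time: float
    component: str
    act_id: int
    frame_id: int


def serialise_event(event) -> dict:
    """Hypothetical generic serialiser: class name plus all dataclass fields."""
    return {"type": type(event).__name__, **asdict(event)}


payload = serialise_event(FrameAllocated(time=1.0, component="pe0",
                                         act_id=3, frame_id=2))
```

TokenRejected carries a Token object, so it would need either asdict's recursive conversion or an explicit token serialiser, depending on how tokens are rendered elsewhere in the monitor.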

11. Dependency Order

Modules should be changed in this order, based on import dependencies. Each step should leave the codebase in a testable state.

cm_inst.py -- no dependencies. Add Instruction dataclass, Mode enum, FrameOp enum. Rename FREE_CTX to FREE_FRAME. Add EXTRACT_TAG. Keep ALUInst, SMInst, Addr temporarily for backward compatibility.
tokens.py -- imports cm_inst. Rename ctx to act_id in CMToken. Remove gen from DyadToken. Add FrameControlToken, PELocalWriteToken. Remove IRAMWriteToken.
emu/events.py -- imports cm_inst, tokens, sm_mod. Update Matched event (ctx -> act_id, add frame_id). Add new event types. Update SimEvent union.
emu/types.py -- imports cm_inst. Remove MatchEntry. Update PEConfig (remove ctx_slots/offsets/gen_counters, add frame params, change iram type). Keep DeferredRead, SMConfig unchanged.
emu/alu.py -- imports cm_inst. Rename FREE_CTX to FREE_FRAME in match cases. Add EXTRACT_TAG handler if ALU-level.
emu/pe.py -- imports everything above. Full rewrite of ProcessingElement: constructor, pipeline, matching, output routing, side paths. This is the critical change.
emu/sm.py -- imports cm_inst, emu/events, emu/types, sm_mod, tokens. Minimal changes: the ctx -> act_id rename only affects token construction, and the SM does not construct tokens directly (it uses replace() on the return route), so there is essentially no functional change.
emu/network.py -- imports emu/*. Update _target_store() for new token types. Update build_topology() for new PEConfig fields.
emu/__init__.py -- update exports for new event types.
monitor/snapshot.py -- update PESnapshot, capture().
monitor/graph_json.py -- update PE state serialisation, event serialisation, event overlay handlers.
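The ordering can be sanity-checked mechanically. The sketch below encodes the import relationships stated in the list and verifies that each module appears after everything it imports. Expanding "everything above" and "emu/*" into explicit sets is an interpretation, and sm_mod is left out because it does not change.

```python
# Import dependencies as stated in the list; keys are in the proposed order.
DEPS = {
    "cm_inst.py": set(),
    "tokens.py": {"cm_inst.py"},
    "emu/events.py": {"cm_inst.py", "tokens.py"},
    "emu/types.py": {"cm_inst.py"},
    "emu/alu.py": {"cm_inst.py"},
    "emu/pe.py": {"cm_inst.py", "tokens.py", "emu/events.py",
                  "emu/types.py", "emu/alu.py"},
    "emu/sm.py": {"cm_inst.py", "tokens.py", "emu/events.py", "emu/types.py"},
    "emu/network.py": {"emu/events.py", "emu/types.py", "emu/pe.py", "emu/sm.py"},
}
ORDER = list(DEPS)  # dicts preserve insertion order


def first_violation(order, deps):
    """Return (module, missing_imports) for the first misplaced module, or None."""
    seen = set()
    for module in order:
        missing = deps[module] - seen
        if missing:
            return module, missing
        seen.add(module)
    return None


violation = first_violation(ORDER, DEPS)
```

A valid order only guarantees that imports resolve at each step; whether the test suite passes between steps also depends on keeping the temporary compatibility types mentioned in the first item.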

12. What Stays the Same

The following are explicitly unchanged to prevent scope creep:

SM I-structure semantics (emu/sm.py): cell states, deferred reads, atomic operations, T0/T1 tiers, EXEC.
ALU pure function model (emu/alu.py): execute() signature and dispatch. 16-bit masking. Signed comparison semantics.
SimPy-based simulation: simpy.Environment, simpy.Store, process-per-token model.
Network topology: full mesh, route tables, System.inject(), System.send(), System.load().
SM cell model (sm_mod.py): Presence enum, SMCell dataclass.
Token base class: target: int field.
SMToken: all fields unchanged. ret still holds a CMToken.
Port enum: L = 0, R = 1.
MonadToken.inline: inline monadic token concept unchanged.
DyadToken.wide: wide value concept unchanged.
Event callback mechanism: on_event: EventCallback | None pattern.
Test infrastructure: pytest + hypothesis. Strategies will need updating for new types but the framework is unchanged.
Monitor backend (monitor/backend.py): command/result protocol, threaded architecture. Only the wiring of PEConfig during LoadCmd changes.
Monitor REPL (monitor/repl.py): commands stay the same; output formatting adapts to new snapshot structure.