Macro Enhancements: Opcode Parameters, Qualified Ref Parameters, and @ret Wiring#
Extends the dfasm macro system with three capabilities that reduce the need for per-variant macro definitions and make macros composable in the same way as functions.
Current State#
The macro system (implemented in asm/expand.py, grammar in dfasm.lark) supports:
- Parameter substitution in node names via ${param} (token pasting with prefix/suffix)
- Parameter substitution in edge endpoints via ${param} in qualified_ref
- Parameter substitution in const fields
- Compile-time arithmetic via ConstExpr (${base} + ${_idx} + 1)
- Variadic parameters with @each repetition blocks
- Nested macro invocation (depth limit 32)
Three gaps remain:
- Opcode position is not parameterizable. The grammar defines opcode: OPCODE as a keyword terminal. You cannot pass an opcode as a macro argument. This forces per-opcode variants: #reduce_add_2, #reduce_add_3, etc.
- Placement and port qualifiers are not parameterizable. The grammar defines placement: "|" IDENT and port: ":" PORT_SPEC; neither accepts param_ref. You cannot write &ref:${port} or &ref|${pe} in a macro body to parameterize which port or PE a reference targets.
- Macros have no output wiring convention. Functions use @ret/@ret_name markers in their body, and the call syntax $func args |> outputs auto-wires return paths. Macros have no equivalent: the user must manually wire to expanded internal node names after invocation.
Enhancement 1: Opcode Parameters#
Goal#
Allow macro parameters to appear in the opcode position of inst_def, strong_edge, and weak_edge rules.
Grammar Change#
// Current:
opcode: OPCODE
// Proposed:
opcode: OPCODE | param_ref
This is the only grammar change needed. param_ref (${name}) is already a valid production. Earley parsing handles the ambiguity.
Lower Pass#
The inst_def handler in lower.py currently calls self._resolve_opcode() which maps mnemonic strings to ALUOp/MemOp values. When the opcode is a ParamRef, lowering must defer resolution — store the ParamRef on the IRNode in a new field (or overload the opcode field's type to Union[ALUOp, MemOp, ParamRef]).
The strong_edge and weak_edge handlers need the same treatment: if the opcode token is a ParamRef, create the anonymous node with a deferred opcode.
Expand Pass#
During _clone_and_substitute_node, if node.opcode is a ParamRef:
- Look up the parameter in the substitution map
- The argument value must be a string matching a known opcode mnemonic
- Resolve via MNEMONIC_TO_OP to get the concrete ALUOp/MemOp
- Replace the node's opcode with the resolved value
- Error if the argument is not a valid opcode mnemonic
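The deferral in the lower pass and the resolution steps above can be sketched as follows. This is a minimal illustration, not the actual code in lower.py or expand.py: the ALUOp members, the contents of MNEMONIC_TO_OP, the single-field ParamRef, the MacroError exception, and the helper names lower_opcode and resolve_opcode_param are all stand-ins assumed for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ALUOp(Enum):
    """Stand-in for the real ALUOp enum; members here are illustrative."""
    ADD = "add"
    SUB = "sub"


# Stand-in for the real MNEMONIC_TO_OP table.
MNEMONIC_TO_OP = {"add": ALUOp.ADD, "sub": ALUOp.SUB}


@dataclass(frozen=True)
class ParamRef:
    """Simplified ParamRef; the real class may carry prefix/suffix fields."""
    param: str


class MacroError(Exception):
    """Hypothetical exception standing in for a MACRO-category error."""


def lower_opcode(token):
    """Lower pass: resolve literal mnemonics now, defer ParamRef to expand."""
    if isinstance(token, ParamRef):
        return token
    return MNEMONIC_TO_OP[str(token)]


def resolve_opcode_param(opcode, subst_map):
    """Expand pass: turn a deferred ParamRef into a concrete opcode."""
    if not isinstance(opcode, ParamRef):
        return opcode  # already concrete, nothing to substitute
    if opcode.param not in subst_map:
        raise MacroError(f"no argument bound to parameter '{opcode.param}'")
    mnemonic = subst_map[opcode.param]
    if not isinstance(mnemonic, str) or mnemonic not in MNEMONIC_TO_OP:
        raise MacroError(f"{mnemonic!r} is not a valid opcode mnemonic")
    return MNEMONIC_TO_OP[mnemonic]
```

With these stand-ins, resolve_opcode_param(lower_opcode(ParamRef("op")), {"op": "add"}) yields ALUOp.ADD, while an argument such as "banana" raises the error described in the last step.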
Validation#
Opcode validation (monadic/dyadic arity, valid argument combinations) already happens after expansion in the resolve and allocate passes. No additional validation needed at expansion time beyond confirming the mnemonic exists.
Example#
Before (current — per-opcode variants):
#reduce_add_2 |> { &r <| add }
#reduce_add_3 |> { &r0 <| add; &r1 <| add; &r0 |> &r1:L }
#reduce_sub_2 |> { &r <| sub }
; ... N variants per opcode
After (parameterized):
#reduce_2 op |> {
&r <| ${op}
}
#reduce_3 op |> {
&r0 <| ${op}
&r1 <| ${op}
&r0 |> &r1:L
}
; Usage:
#reduce_2 add
#reduce_3 sub
Argument Syntax#
Opcode arguments are passed as bare identifiers in the macro call. The grammar for macro_call_stmt already accepts argument, which includes qualified_ref, but that does not cover opcodes: an unqualified add in argument position lexes as the OPCODE terminal (priority 2), not as IDENT, and OPCODE is not a valid argument.
Two options:
Option A: Quote opcode arguments. Pass as string literals: #reduce_2 "add". Simple, unambiguous. Expand pass strips quotes and resolves. Slightly ugly.
Option B: Accept OPCODE as a macro argument. Add OPCODE as an alternative in positional_arg:
// Current:
?positional_arg: value | qualified_ref
// Proposed:
?positional_arg: value | qualified_ref | OPCODE
The lower pass wraps the bare opcode token as a string argument in the IRMacroCall. Expand resolves it against MNEMONIC_TO_OP. This reads naturally: #reduce_2 add.
Option B is cleaner. The only risk is if someone has an IDENT that collides with an opcode name as a label/node, but the priority system already handles that (opcodes win at lexer level), and this collision already exists in the language.
Recommendation: Option B.
Enhancement 2: Parameterized Placement and Port Qualifiers#
Goal#
Allow ${param} in the placement (|pe0) and port (:L) positions of a qualified_ref, so macros can parameterize which PE a node targets, which port an edge uses, and (when exposed) which context slot to use.
Current State#
qualified_ref is built from three parts:
qualified_ref: (node_ref | label_ref | ... | param_ref) placement? port?
placement: "|" IDENT
port: ":" PORT_SPEC
PORT_SPEC: IDENT | HEX_LIT | DEC_LIT
${param} can already stand in for the entire ref part (the first element). But the placement and port suffixes only accept literal tokens. So &node:${port} and &node|${pe} don't parse.
In the lower pass, qualified_ref collects its children into a dict:
- The ref part becomes {"name": ...}
- placement returns a string (e.g., "pe0")
- port returns a Port enum (Port.L, Port.R) or raw int
In the IR, IRNode.pe stores placement as Optional[int], and IREdge.port/IREdge.source_port store port as Port. Neither field currently accepts ParamRef.
The expand pass (_clone_and_substitute_node, _clone_and_substitute_edge) only substitutes name, const, source, and dest. It does not touch pe, port, or source_port.
Grammar Changes#
// Current:
placement: "|" IDENT
port: ":" PORT_SPEC
// Proposed:
placement: "|" (IDENT | param_ref)
port: ":" (PORT_SPEC | param_ref)
Lower Pass#
The placement handler currently does return str(token). It needs to handle receiving a ParamRef from the parser and return it as-is:
def placement(self, *args):
    for arg in args:
        if isinstance(arg, ParamRef):
            return arg
    return str(args[-1])
Similarly, the port handler needs to pass through ParamRef instead of resolving to Port:
def port(self, *args):
    for arg in args:
        if isinstance(arg, ParamRef):
            return arg
    # ... existing Port.L / Port.R / int resolution
The qualified_ref handler already iterates over args by type. It needs a new branch to detect ParamRef in placement/port positions (currently it only detects ParamRef in the ref-name position). The disambiguation is based on ordering: the ref-name comes first, placement second (prefixed with |), port third (prefixed with :). Since Lark processes them through their respective rules before qualified_ref sees them, the parser distinguishes them. The qualified_ref handler just needs to accept ParamRef for placement and port:
def qualified_ref(self, *args):
    ref_type = None
    placement = None
    port = None
    for arg in args:
        if isinstance(arg, ParamRef) and ref_type is None:
            ref_type = {"name": arg}
        elif isinstance(arg, ParamRef) and ref_type is not None:
            # Second or third ParamRef: depends on position.
            # But Lark gives us placement/port through their handlers,
            # so we get ParamRef from the placement() or port() handler.
            # Need to distinguish: placement handler adds a marker or
            # we rely on Lark's rule names.
            ...
Actually, this is simpler than it looks. Lark calls placement() and port() before qualified_ref(). So qualified_ref receives:
- A dict or ParamRef (from the ref-name rules)
- A string or ParamRef (from the placement handler)
- A Port/int or ParamRef (from the port handler)
The existing type-based dispatch in qualified_ref needs one addition: if an arg is ParamRef and ref_type is already set, it's either placement or port. We can distinguish by wrapping them — the placement handler returns ("placement", ParamRef(...)) and port returns ("port", ParamRef(...)) when deferring. Or simpler: use a thin wrapper type.
Alternatively, Lark's @v_args(inline=True) on placement/port means the handler already knows which rule matched. The cleanest approach: return a ParamRef tagged with its role:
@dataclass(frozen=True)
class PlacementRef:
    """Deferred placement from macro parameter."""
    param: ParamRef

@dataclass(frozen=True)
class PortRef:
    """Deferred port from macro parameter."""
    param: ParamRef
Then qualified_ref type-dispatches on PlacementRef/PortRef alongside str/Port/int.
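A rough sketch of that dispatch, assuming the wrapper types described above, is shown here. It is written as a free function rather than a transformer method so it runs on its own; the Port enum values, the single-field ParamRef, and the keys of the returned dict are assumptions for illustration and may not match lower.py.

```python
from dataclasses import dataclass
from enum import Enum


class Port(Enum):
    """Stand-in for the real Port enum."""
    L = 0
    R = 1


@dataclass(frozen=True)
class ParamRef:
    param: str  # simplified; the real ParamRef may have more fields


@dataclass(frozen=True)
class PlacementRef:
    param: ParamRef


@dataclass(frozen=True)
class PortRef:
    param: ParamRef


def qualified_ref(*args):
    """Collect already-transformed children into one dict, purely by type."""
    result = {}
    for arg in args:
        if isinstance(arg, dict):
            result.update(arg)            # literal ref part, e.g. {"name": ...}
        elif isinstance(arg, ParamRef):
            result["name"] = arg          # ${param} standing in for the ref
        elif isinstance(arg, (str, PlacementRef)):
            result["placement"] = arg     # literal "pe0" or deferred placement
        elif isinstance(arg, (Port, int, PortRef)):
            result["port"] = arg          # literal port or deferred port
    return result
```

Because a bare ParamRef can now only mean the ref-name position, the handler no longer needs to track argument order to tell the three roles apart.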
IR Changes#
IRNode.pe type becomes Optional[Union[int, ParamRef]].
IREdge.port type becomes Union[Port, ParamRef].
IREdge.source_port type becomes Optional[Union[Port, ParamRef]].
These wider types only appear in macro template bodies. After expansion, all ParamRef values are resolved to concrete types. The resolve, place, and allocate passes never see ParamRef — if one leaks through, it's a bug in expand.
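One way to catch such a leak early is a post-expansion check that scans for leftover ParamRef values. The sketch below is only an illustration: find_param_leaks is a hypothetical helper, and the IRNode/IREdge classes are cut-down stand-ins carrying just the fields named in this section.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass(frozen=True)
class ParamRef:
    param: str  # simplified stand-in


@dataclass(frozen=True)
class IRNode:
    name: str
    pe: Optional[Any] = None


@dataclass(frozen=True)
class IREdge:
    source: str
    dest: str
    port: Any = None
    source_port: Optional[Any] = None


def find_param_leaks(items):
    """Return (item, field_name) pairs whose value is still a ParamRef."""
    leaks = []
    for item in items:
        for field in fields(item):
            if isinstance(getattr(item, field.name), ParamRef):
                leaks.append((item, field.name))
    return leaks
```

Running this over the expanded nodes and edges and asserting that the result is empty would turn a silent leak into an immediate failure at the end of expand.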
Expand Pass#
_clone_and_substitute_node gains:
# Substitute PE placement if it's a ParamRef
new_pe = node.pe
if isinstance(new_pe, ParamRef):
    resolved = _substitute_param(new_pe, subst_map)
    # Must resolve to a PE identifier string like "pe0" or an int
    new_pe = _resolve_pe_placement(resolved)  # parse "pe0" -> 0, or int -> int
_clone_and_substitute_edge gains:
# Substitute port if it's a ParamRef
new_port = edge.port
if isinstance(new_port, ParamRef):
    resolved = _substitute_param(new_port, subst_map)
    new_port = _resolve_port(resolved)  # "L" -> Port.L, "R" -> Port.R, int -> int

new_source_port = edge.source_port
if isinstance(new_source_port, ParamRef):
    resolved = _substitute_param(new_source_port, subst_map)
    new_source_port = _resolve_port(resolved)
Validation#
Invalid port/placement values (e.g., passing "banana" as a port) produce a MACRO error during expansion. Post-expansion, the existing place and allocate passes validate that PE IDs are in range and ports are valid.
Examples#
Parameterized port selection:
; Macro that wires to a caller-selected port
#wire_to_port target, port |> {
&src <| pass
&src |> ${target}:${port}
}
; Usage: wire to left port
#wire_to_port &dest, L
; Usage: wire to right port
#wire_to_port &dest, R
Parameterized PE placement:
; Macro that places its node on a specific PE
#placed_const val, pe |> {
&c <| const, ${val} |${pe}
&c |> @ret
}
; Usage: place on pe0
#placed_const 42, pe0 |> &target
; Usage: place on pe1
#placed_const 42, pe1 |> &target
Combined — a macro that builds a cross-PE relay:
; Route a value from one PE to another
#cross_pe_relay src_pe, dst_pe |> {
&hop <| pass |${src_pe}
&hop |> @ret
}
; Usage:
#cross_pe_relay pe0, pe1 |> &destination
Context Slot Syntax#
Context slots use bracket syntax [N], distinct from all other qualifiers:
&node|pe0[2] ; place on pe0, context slot 2
&node[0] ; context slot 0, auto-placed PE
&node|pe1[0..4] ; reserve context slots 0-4 for this instruction
The bracket syntax avoids overloading : (which already carries port, cell address, and potentially IRAM address semantics). [N] is exclusively context slots.
Grammar#
// New production:
ctx_slot: "[" (DEC_LIT | ctx_range | param_ref) "]"
ctx_range: DEC_LIT ".." DEC_LIT
// Updated qualified_ref:
qualified_ref: (node_ref | label_ref | ... | param_ref) placement? ctx_slot? port?
ctx_slot appears between placement and port in the qualifier chain: &node|pe0[2]:L.
Use Cases#
- Explicit context partitioning: place parallel computations in distinct context slots to avoid matching store collisions
- Debugging: force a known context layout for inspection
- Range reservation ([0..4]): reserve a contiguous block of slots for an instruction that will be targeted by multiple parallel sources wired identically. Not essential, but a natural extension
Parameterization#
Same mechanism as placement/port. [${ctx}] in a macro body, substituted to an integer during expansion:
#placed_op op, pe, ctx |> {
&n <| ${op} |${pe}[${ctx}]
&n |> @ret
}
; Usage:
#placed_op add, pe0, 2 |> &target
Enhancement 3: @ret Wiring for Macros#
Goal#
Allow macros to define output points using @ret / @ret_name markers, and wire them to destinations at the call site using the |> syntax.
Grammar Change#
Add optional output list to macro_call_stmt:
// Current:
macro_call_stmt: "#" IDENT (argument ("," argument)*)?
// Proposed:
macro_call_stmt: "#" IDENT (argument ("," argument)*)? (FLOW_OUT call_output_list)?
This reuses the existing call_output_list and call_output productions from call_stmt. Same syntax: #macro args |> &dest or #macro args |> name=&dest.
Macro Body Convention#
Macro bodies use @ret and @ret_name in edge destinations, same as function bodies:
#loop op, init_val |> {
&counter <| add
&compare <| ${op}
&counter |> &compare:L
&inc <| inc
&compare |> &inc:L
&inc |> &counter:R
; Output edges use @ret convention
&compare |> @ret_body
&compare |> @ret_exit:R
}
Lower Pass#
When lowering macro_call_stmt with a FLOW_OUT and call_output_list:
- Parse the output list the same way call_stmt does (named/positional outputs)
- Store output destinations on the IRMacroCall in a new field: output_dests: tuple
The IRMacroCall dataclass gains:
@dataclass(frozen=True)
class IRMacroCall:
    name: str
    positional_args: tuple
    named_args: tuple
    output_dests: tuple = ()  # New: output wiring destinations
    loc: Optional[SourceLoc] = None
Expand Pass#
After cloning and substituting the macro body, process @ret markers:
- Scan expanded edges for destinations starting with @ret
- For each @ret/@ret_name destination, look up the corresponding output from IRMacroCall.output_dests
- Replace the @ret* destination with the actual target node name
- If a @ret* marker has no matching output dest, report a MACRO error
This is simpler than function call wiring because macros don't need:
- Trampoline nodes (no cross-context routing)
- ctx_override edges (macros inline into the caller's context)
- FREE_CTX nodes (no context allocation)
- Synthetic PASS nodes (direct edge replacement suffices)
The @ret substitution in macros is purely edge rewriting — replace the symbolic @ret_name destination with the concrete node reference from the call site.
Positional @ret Mapping#
Same convention as function calls:
- Bare @ret maps to the first (or only) positional output
- @ret_name maps to the named output name=&dest
- Multiple bare @ret edges to different ports on the same output are valid
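The edge rewriting and the mapping rules above could look roughly like this. The sketch assumes a particular encoding of output_dests that this document does not specify: a plain string for a positional output and a (name, dest) pair for a named one. IREdge is a cut-down stand-in, and MacroError, build_output_map, and rewire_ret_edges are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IREdge:
    """Cut-down stand-in with only the fields this sketch needs."""
    source: str
    dest: str
    port: str = "L"


class MacroError(Exception):
    """Hypothetical exception standing in for a MACRO-category error."""


def build_output_map(output_dests):
    """Map @ret markers to call-site destinations (encoding is assumed)."""
    outputs = {}
    for entry in output_dests:
        if isinstance(entry, tuple):
            name, dest = entry
            outputs[f"@ret_{name}"] = dest      # name=&dest -> @ret_name
        else:
            outputs.setdefault("@ret", entry)   # bare @ret -> first positional
    return outputs


def rewire_ret_edges(edges, output_dests, macro_name):
    """Replace symbolic @ret* destinations with concrete node references."""
    outputs = build_output_map(output_dests)
    rewired = []
    for edge in edges:
        if edge.dest.startswith("@ret"):
            if edge.dest not in outputs:
                raise MacroError(
                    f"macro '#{macro_name}' uses {edge.dest} "
                    "but the call provides no matching output"
                )
            # Only the destination changes; the port on the edge is kept.
            edge = replace(edge, dest=outputs[edge.dest])
        rewired.append(edge)
    return rewired
```

Since each edge keeps its own port, several bare @ret edges aimed at different ports of the same output fall out of this naturally, with no extra nodes inserted.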
Example#
; Define macro with outputs
#loop_counted |> {
&counter <| add
&compare <| brgt
&counter |> &compare:L
&inc <| inc
&compare |> &inc:L
&inc |> &counter:R
&compare |> @ret_body
&compare |> @ret_exit:R
}
; Invoke with output wiring
#loop_counted |> body=&process, exit=&done
&init |> #loop_counted_0.&counter:L
&limit |> #loop_counted_0.&compare:R
Or positionally:
#simple_gate |> {
&g <| gate
&g |> @ret
&g |> @ret:R ; second output port
}
; Invoke — positional @ret maps to first output
#simple_gate |> &body, &exit
Impact on Built-in Macros#
With these enhancements, the built-in library collapses significantly:
Current (11 macros)#
#loop_counted, #loop_while
#permit_inject_1, #permit_inject_2, #permit_inject_3, #permit_inject_4
#reduce_add_2, #reduce_add_3, #reduce_add_4
Proposed (4-5 macros, more capable)#
; Counted loop with output wiring
#loop_counted |> {
&counter <| add
&compare <| brgt
&counter |> &compare:L
&inc <| inc
&compare |> &inc:L
&inc |> &counter:R
&compare |> @ret_body
&compare |> @ret_exit:R
}
; Condition-tested loop
#loop_while |> {
&gate <| gate
&gate |> @ret_body
&gate |> @ret_exit:R
}
; Permit injection — variadic, outputs via @ret
#permit_inject *nodes |> {
$(
&p_${_idx} <| const, 1
&p_${_idx} |> @ret
),*
}
; Binary reduction tree — parameterized opcode + arity
#reduce_2 op |> {
&r <| ${op}
}
#reduce_3 op |> {
&r0 <| ${op}
&r1 <| ${op}
&r0 |> &r1:L
}
#reduce_4 op |> {
&r0 <| ${op}
&r1 <| ${op}
&r2 <| ${op}
&r0 |> &r2:L
&r1 |> &r2:R
}
Usage:
; Old:
!#loop_counted
&init |> #loop_counted_0.&counter:L
&limit |> #loop_counted_0.&compare:R
#loop_counted_0.&compare |> &body:L
#loop_counted_0.&compare |> &exit:R
; New:
#loop_counted |> body=&process, exit=&done
&init |> #loop_counted_0.&counter:L
&limit |> #loop_counted_0.&compare:R
; Old:
!#reduce_add_4
; New:
#reduce_4 add
Note: the #permit_inject example with variadic @ret is aspirational — it requires @ret to work inside repetition blocks, which means the @ret substitution must happen after repetition expansion. This ordering is already correct since repetition expansion happens before edge rewriting in the expand pass.
Implementation Order#
- Opcode parameters: grammar change (opcode: OPCODE | param_ref), argument syntax (positional_arg: ... | OPCODE), expand pass substitution. Smallest diff, immediately useful.
- Qualified ref parameters: grammar changes to placement and port, PlacementRef/PortRef wrapper types, IR type widening, expand pass substitution. Mechanically similar to opcode params, builds on the same _substitute_param infrastructure.
- @ret wiring for macros: grammar change (output list on macro_call_stmt), IRMacroCall.output_dests, expand pass edge rewriting. Builds on existing @ret patterns from function calls.
- Built-in macro rewrite: collapse per-variant macros using the new features. Backwards-incompatible (old macro names removed), but since the built-ins are bundled and the system is pre-1.0, this is acceptable.
Open Questions#
- Should macros with @ret also support |> on inputs? Function calls use $func a=&x |> @output. Currently macro calls use #macro arg1, arg2 for inputs. Adding |> for outputs is proposed above. Should inputs also support named wiring? Probably not needed: macros already have ${param} for inputs, and the input wiring is fundamentally different (parameter substitution vs edge creation).
- Error messages for mismatched @ret counts. If a macro body has @ret_body and @ret_exit but the call site only provides one output, what error? Probably MACRO category: "macro '#loop_counted' defines outputs @ret_body, @ret_exit but call provides 1 output".
- Interaction with nested macros. If macro A calls macro B which has @ret, and A also has @ret, the scoping should work naturally: B's @ret resolves at B's call site (inside A's body), A's @ret resolves at A's call site. The existing scope qualification prevents name collisions.