Reference implementation for the Phoenix Architecture. Work in progress.
aicoding.leaflet.pub/
# Phase B: Canonicalization, Warm Context Hashing & Change Classifier

## Overview

Phase B transforms clauses into a **Canonical Graph** of structured requirement nodes
(Requirements, Constraints, Invariants, Definitions). It also computes warm context
hashes that incorporate canonical graph context, and implements the A/B/C/D change classifier.

## Components

### 1. Canonical Node Model (`src/models/canonical.ts`)

```typescript
enum CanonicalType {
  REQUIREMENT = 'REQUIREMENT',
  CONSTRAINT = 'CONSTRAINT',
  INVARIANT = 'INVARIANT',
  DEFINITION = 'DEFINITION',
}

interface CanonicalNode {
  canon_id: string;            // content-addressed
  type: CanonicalType;
  statement: string;           // normalized canonical statement
  source_clause_ids: string[]; // provenance back to clauses
  linked_canon_ids: string[];  // edges to related canonical nodes
  tags: string[];              // extracted keywords/terms
}
```
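
The `canon_id` field is described as content-addressed. A minimal sketch of one way to derive such an ID follows. It assumes SHA-256 over the node type and statement, which the source does not specify, and `computeCanonId` is a hypothetical helper name.

```typescript
import { createHash } from 'crypto';

// Hypothetical helper: derive a content-addressed ID from a node's type and statement.
// Assumption: only type + statement feed the ID; a real implementation may hash more fields.
function computeCanonId(type: string, statement: string): string {
  return createHash('sha256')
    .update(`${type}\n${statement}`, 'utf8')
    .digest('hex');
}

const idA = computeCanonId('REQUIREMENT', 'sessions must expire after logout');
const idB = computeCanonId('REQUIREMENT', 'sessions must expire after logout');
const idC = computeCanonId('CONSTRAINT', 'sessions must expire after logout');
console.log(idA === idB); // true: identical content yields the same ID
console.log(idA === idC); // false: a different type yields a different ID
```

Because the ID depends only on content, re-extracting an unchanged clause would reproduce the same node identity under this scheme.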

### 2. Canonicalization Engine (`src/canonicalizer.ts`)

Extracts canonical nodes from clauses using rule-based extraction:

**Extraction Rules:**
- Lines containing "must", "shall", "required" → REQUIREMENT
- Lines containing "must not", "forbidden", "prohibited" → CONSTRAINT
- Lines containing "always", "never", "invariant" → INVARIANT
- Lines containing definitions (": ", "is defined as", "means") → DEFINITION
- Headings containing "constraint", "security", "limit" → CONSTRAINT context
- Headings containing "requirement" → REQUIREMENT context
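
The line-level rules above can be sketched as an ordered keyword match. The source does not say how overlapping matches are resolved (a line containing "must not" also contains "must"), so this sketch assumes the more specific CONSTRAINT phrases are checked first. `classifyLine`, the rule order, and plain substring matching are all assumptions, and the heading-context rules are omitted.

```typescript
type CanonicalType = 'REQUIREMENT' | 'CONSTRAINT' | 'INVARIANT' | 'DEFINITION';

// Assumed precedence: the first matching rule wins, most specific phrases first.
const RULES: Array<{ type: CanonicalType; phrases: string[] }> = [
  { type: 'CONSTRAINT', phrases: ['must not', 'forbidden', 'prohibited'] },
  { type: 'INVARIANT', phrases: ['always', 'never', 'invariant'] },
  { type: 'REQUIREMENT', phrases: ['must', 'shall', 'required'] },
  { type: 'DEFINITION', phrases: [': ', 'is defined as', 'means'] },
];

// Hypothetical helper: returns the canonical type for a line, or null if no rule matches.
function classifyLine(line: string): CanonicalType | null {
  const text = line.toLowerCase();
  for (const rule of RULES) {
    if (rule.phrases.some((p) => text.includes(p))) return rule.type;
  }
  return null;
}

console.log(classifyLine('Passwords must not be logged'));         // CONSTRAINT
console.log(classifyLine('The API shall return JSON'));            // REQUIREMENT
console.log(classifyLine('Sessions never outlive the token'));     // INVARIANT
console.log(classifyLine('Tenant: an isolated customer account')); // DEFINITION
console.log(classifyLine('See the appendix'));                     // null
```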

**Linking Rules:**
- Nodes sharing terms/keywords get linked
- Nodes from same clause get linked
- Nodes referencing same entities get linked
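
The first two linking rules can be sketched as a pairwise comparison. This is a sketch under assumptions: `NodeLite` and `linkNodes` are hypothetical names, a single shared tag or source clause is assumed to be enough to create an edge, and the entity-reference rule is left out because the source does not define how entities are detected.

```typescript
// Trimmed-down node shape holding only the fields the linking pass needs.
interface NodeLite {
  canon_id: string;
  source_clause_ids: string[];
  tags: string[];
  linked_canon_ids: string[];
}

const overlaps = (a: string[], b: string[]): boolean => a.some((x) => b.includes(x));

// Hypothetical helper: adds symmetric edges between nodes sharing a tag or a source clause.
function linkNodes(nodes: NodeLite[]): void {
  nodes.forEach((a, i) => {
    nodes.slice(i + 1).forEach((b) => {
      if (overlaps(a.tags, b.tags) || overlaps(a.source_clause_ids, b.source_clause_ids)) {
        a.linked_canon_ids.push(b.canon_id);
        b.linked_canon_ids.push(a.canon_id);
      }
    });
  });
}

const nodes: NodeLite[] = [
  { canon_id: 'n1', source_clause_ids: ['c1'], tags: ['auth', 'token'], linked_canon_ids: [] },
  { canon_id: 'n2', source_clause_ids: ['c2'], tags: ['token'], linked_canon_ids: [] },
  { canon_id: 'n3', source_clause_ids: ['c2'], tags: ['billing'], linked_canon_ids: [] },
];
linkNodes(nodes);
// n1 and n2 share the tag 'token'; n2 and n3 share clause 'c2'.
console.log(nodes.map((n) => `${n.canon_id} -> ${n.linked_canon_ids.join(',')}`));
```

The pairwise loop is quadratic in the number of nodes; an index from tag to node IDs would be the usual next step if the graph grows large.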

### 3. Warm Context Hasher (`src/warm-hasher.ts`)

After canonicalization, compute `context_semhash_warm`:

```
context_semhash_warm = SHA-256(
  normalized_text +
  section_path.join('/') +
  sorted(linked_canon_ids).join(',') +
  sorted(canon_node_types).join(',')
)
```
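
In TypeScript, the formula above might look like the following sketch using Node's built-in `crypto` module. `WarmHashInput` and `computeWarmHash` are hypothetical names. The newline separator between fields is an assumption added here so that adjacent fields cannot run together; the formula above concatenates without one.

```typescript
import { createHash } from 'crypto';

interface WarmHashInput {
  normalized_text: string;
  section_path: string[];
  linked_canon_ids: string[];
  canon_node_types: string[];
}

function computeWarmHash(input: WarmHashInput): string {
  const parts = [
    input.normalized_text,
    input.section_path.join('/'),
    [...input.linked_canon_ids].sort().join(','), // copy first so the caller's array is not mutated
    [...input.canon_node_types].sort().join(','),
  ];
  // Assumption: '\n' separates the fields; the source formula uses plain concatenation.
  return createHash('sha256').update(parts.join('\n'), 'utf8').digest('hex');
}

const base = {
  normalized_text: 'sessions must expire after logout',
  section_path: ['Security', 'Sessions'],
  canon_node_types: ['REQUIREMENT'],
};
const h1 = computeWarmHash({ ...base, linked_canon_ids: ['b', 'a'] });
const h2 = computeWarmHash({ ...base, linked_canon_ids: ['a', 'b'] });
const h3 = computeWarmHash({ ...base, linked_canon_ids: ['a', 'b', 'c'] });
console.log(h1 === h2); // true: link order does not affect the hash
console.log(h1 === h3); // false: a new link changes the warm hash
```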

### 4. Change Classifier (`src/classifier.ts`)

Classifies each change into A/B/C/D:

| Class | Meaning | Criteria |
|-------|---------|----------|
| A | Trivial | `normalized_text` identical, only formatting changed |
| B | Local semantic | `clause_semhash` changed, `context_semhash_cold` unchanged |
| C | Contextual shift | `context_semhash` changed, canonical links affected |
| D | Uncertain | classifier confidence below threshold |
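
One way to read the table is as a decision procedure over precomputed signals. The sketch below is an illustration under several assumptions: `ClassifierInput`, `classifyChange`, the `confidence` field, and the 0.7 threshold are hypothetical; the order in which rows are checked is assumed; a single `contextChanged` flag stands in for both context hashes; and changes matching no row fall back to D.

```typescript
type ChangeClass = 'A' | 'B' | 'C' | 'D';

interface ClassifierInput {
  normTextSame: boolean;      // normalized_text identical
  clauseHashChanged: boolean; // clause_semhash differs
  contextChanged: boolean;    // context hash differs (assumed: one flag for cold and warm)
  canonImpact: number;        // number of affected canonical nodes
  confidence: number;         // hypothetical score in [0, 1]
}

const CONFIDENCE_THRESHOLD = 0.7; // assumed; the source gives no value

function classifyChange(s: ClassifierInput): ChangeClass {
  if (s.confidence < CONFIDENCE_THRESHOLD) return 'D';
  if (s.normTextSame) return 'A';
  if (s.clauseHashChanged && !s.contextChanged) return 'B';
  if (s.contextChanged && s.canonImpact > 0) return 'C';
  return 'D'; // assumption: anything that matches no row is treated as uncertain
}

const formattingOnly = classifyChange({
  normTextSame: true, clauseHashChanged: false, contextChanged: false, canonImpact: 0, confidence: 0.95,
});
const localEdit = classifyChange({
  normTextSame: false, clauseHashChanged: true, contextChanged: false, canonImpact: 0, confidence: 0.9,
});
const contextShift = classifyChange({
  normTextSame: false, clauseHashChanged: true, contextChanged: true, canonImpact: 2, confidence: 0.9,
});
const lowConfidence = classifyChange({
  normTextSame: false, clauseHashChanged: true, contextChanged: true, canonImpact: 2, confidence: 0.4,
});
console.log(formattingOnly, localEdit, contextShift, lowConfidence); // A B C D
```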

**Signals:**
- `norm_diff`: edit distance of normalized texts
- `semhash_delta`: binary (same/different clause_semhash)
- `context_cold_delta`: binary (same/different context_semhash_cold)
- `term_ref_delta`: Jaccard distance of extracted terms
- `section_structure_delta`: section_path changed?
- `canon_impact`: number of affected canonical nodes
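
Two of these signals are standard distance measures. The sketch below shows one plausible way to compute `term_ref_delta` (Jaccard distance) and `norm_diff` (Levenshtein edit distance). The function names are hypothetical, and the choice of Levenshtein as the edit distance is an assumption, since the source does not name a specific metric.

```typescript
// Jaccard distance: 1 - |A ∩ B| / |A ∪ B|.
// Assumption: two empty term sets count as identical (distance 0).
function jaccardDistance(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((term) => {
    if (setB.has(term)) shared++;
  });
  return 1 - shared / (setA.size + setB.size - shared);
}

// Levenshtein edit distance, keeping only the previous row of the DP table.
function editDistance(s: string, t: string): number {
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[t.length];
}

console.log(jaccardDistance(['auth', 'token', 'session'], ['auth', 'token', 'cookie'])); // 0.5
console.log(editDistance('kitten', 'sitting')); // 3
```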

### 5. D-Rate Tracker (`src/d-rate.ts`)

Tracks D-classification rate over a rolling window.

- Target: ≤5%
- Acceptable: ≤10%
- Alarm: >15%
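
A rolling-window tracker along these lines could be sketched as below. The window size, the class shape, and the status names are hypothetical. The source leaves rates between 10% and 15% unlabeled, so this sketch reports them as `ELEVATED`, which is an assumption.

```typescript
type DRateStatus = 'ON_TARGET' | 'ACCEPTABLE' | 'ELEVATED' | 'ALARM';

class DRateTracker {
  private readonly size: number;
  private window: boolean[] = []; // true = the change was classified as D

  constructor(size = 100) { // default window size is an assumption
    this.size = size;
  }

  record(changeClass: 'A' | 'B' | 'C' | 'D'): DRateStatus {
    this.window.push(changeClass === 'D');
    if (this.window.length > this.size) this.window.shift(); // drop the oldest entry
    return this.status();
  }

  rate(): number {
    if (this.window.length === 0) return 0;
    return this.window.filter(Boolean).length / this.window.length;
  }

  status(): DRateStatus {
    const r = this.rate();
    if (r <= 0.05) return 'ON_TARGET';
    if (r <= 0.10) return 'ACCEPTABLE';
    if (r <= 0.15) return 'ELEVATED'; // band not named in the source
    return 'ALARM';
  }
}

const tracker = new DRateTracker(20);
for (let i = 0; i < 19; i++) tracker.record('A');
console.log(tracker.record('D')); // ON_TARGET: 1 of 20 is 5%
console.log(tracker.record('D')); // ACCEPTABLE: 2 of 20 is 10%
```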

### 6. Bootstrap State Machine (`src/bootstrap.ts`)

States: `BOOTSTRAP_COLD` → `BOOTSTRAP_WARMING` → `STEADY_STATE`

Transitions:
- COLD → WARMING: after first canonicalization + warm pass complete
- WARMING → STEADY_STATE: after D-rate stabilizes below acceptable threshold
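
The two transitions can be sketched as a pure function over the current state. `BootstrapInputs` and `nextState` are hypothetical names. "Stabilizes" is simplified here to a single D-rate reading at or below the 10% acceptable threshold; a real implementation would likely require several consecutive readings.

```typescript
type BootstrapState = 'BOOTSTRAP_COLD' | 'BOOTSTRAP_WARMING' | 'STEADY_STATE';

interface BootstrapInputs {
  warmPassComplete: boolean; // first canonicalization + warm pass finished
  dRate: number;             // current rolling D-rate in [0, 1]
}

const ACCEPTABLE_D_RATE = 0.10; // the "Acceptable" threshold from the D-rate tracker

function nextState(state: BootstrapState, inputs: BootstrapInputs): BootstrapState {
  switch (state) {
    case 'BOOTSTRAP_COLD':
      return inputs.warmPassComplete ? 'BOOTSTRAP_WARMING' : state;
    case 'BOOTSTRAP_WARMING':
      // Assumption: one reading at or below the threshold counts as "stabilized".
      return inputs.dRate <= ACCEPTABLE_D_RATE ? 'STEADY_STATE' : state;
    case 'STEADY_STATE':
      return state; // the source lists no outgoing transitions
  }
}

let state: BootstrapState = 'BOOTSTRAP_COLD';
state = nextState(state, { warmPassComplete: true, dRate: 0.3 });  // BOOTSTRAP_WARMING
state = nextState(state, { warmPassComplete: true, dRate: 0.3 });  // still BOOTSTRAP_WARMING
state = nextState(state, { warmPassComplete: true, dRate: 0.08 }); // STEADY_STATE
console.log(state);
```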

## Data Flow

```
Clauses (Phase A)
  → Canonicalizer.extract()     → CanonicalNode[]
  → WarmHasher.computeWarm()    → Clause[] (with warm hashes)
  → Classifier.classify()       → ChangeClassification[]
  → DRateTracker.record()       → DRateStatus
  → BootstrapState.transition()
```

## File Layout (Phase B additions)

```
src/
  models/
    canonical.ts        # CanonicalNode interface + types
    classification.ts   # ChangeClass enum + ChangeClassification
  canonicalizer.ts      # Clause → CanonicalNode extraction
  warm-hasher.ts        # Warm context hash computation
  classifier.ts         # A/B/C/D change classifier
  d-rate.ts             # D-rate tracking
  bootstrap.ts          # Bootstrap state machine
  store/
    canonical-store.ts  # Canonical graph persistence
```