Reference implementation for the Phoenix Architecture. Work in progress. aicoding.leaflet.pub/
# Phase B: Canonicalization, Warm Context Hashing & Change Classifier

## Overview

Phase B transforms clauses into a **Canonical Graph**: structured requirement nodes
(Requirements, Constraints, Invariants, Definitions). It also computes warm context
hashes that incorporate canonical graph context, and implements the A/B/C/D change classifier.

## Components

### 1. Canonical Node Model (`src/models/canonical.ts`)

```typescript
enum CanonicalType {
  REQUIREMENT = 'REQUIREMENT',
  CONSTRAINT = 'CONSTRAINT',
  INVARIANT = 'INVARIANT',
  DEFINITION = 'DEFINITION',
}

interface CanonicalNode {
  canon_id: string;              // content-addressed
  type: CanonicalType;
  statement: string;             // normalized canonical statement
  source_clause_ids: string[];   // provenance back to clauses
  linked_canon_ids: string[];    // edges to related canonical nodes
  tags: string[];                // extracted keywords/terms
}
```

### 2. Canonicalization Engine (`src/canonicalizer.ts`)

Extracts canonical nodes from clauses using rule-based extraction:

**Extraction Rules:**
- Lines containing "must", "shall", "required" → REQUIREMENT
- Lines containing "must not", "forbidden", "prohibited" → CONSTRAINT
- Lines containing "always", "never", "invariant" → INVARIANT
- Lines containing definitions (": ", "is defined as", "means") → DEFINITION
- Headings containing "constraint", "security", "limit" → CONSTRAINT context
- Headings containing "requirement" → REQUIREMENT context

**Linking Rules:**
- Nodes sharing terms/keywords get linked
- Nodes from same clause get linked
- Nodes referencing same entities get linked

### 3. Warm Context Hasher (`src/warm-hasher.ts`)

After canonicalization, compute `context_semhash_warm`:

```
context_semhash_warm = SHA-256(
  normalized_text +
  section_path.join('/') +
  sorted(linked_canon_ids).join(',') +
  sorted(canon_node_types).join(',')
)
```

### 4. Change Classifier (`src/classifier.ts`)

Classifies each change into A/B/C/D:

| Class | Meaning | Criteria |
|-------|---------|----------|
| A | Trivial | normalized_text identical, only formatting changed |
| B | Local semantic | clause_semhash changed, context_semhash_cold unchanged |
| C | Contextual shift | context_semhash changed, canonical links affected |
| D | Uncertain | classifier confidence below threshold |

**Signals:**
- `norm_diff`: edit distance of normalized texts
- `semhash_delta`: binary (same/different clause_semhash)
- `context_cold_delta`: binary (same/different context_semhash_cold)
- `term_ref_delta`: Jaccard distance of extracted terms
- `section_structure_delta`: section_path changed?
- `canon_impact`: number of affected canonical nodes

### 5. D-Rate Tracker (`src/d-rate.ts`)

Tracks D-classification rate over a rolling window.

- Target: ≤5%
- Acceptable: ≤10%
- Alarm: >15%

### 6. Bootstrap State Machine (`src/bootstrap.ts`)

States: `BOOTSTRAP_COLD` → `BOOTSTRAP_WARMING` → `STEADY_STATE`

Transitions:
- COLD → WARMING: after first canonicalization + warm pass complete
- WARMING → STEADY_STATE: after D-rate stabilizes below acceptable threshold

## Data Flow

```
Clauses (Phase A)
  → Canonicalizer.extract() → CanonicalNode[]
  → WarmHasher.computeWarm() → Clause[] (with warm hashes)
  → Classifier.classify() → ChangeClassification[]
  → DRateTracker.record() → DRateStatus
  → BootstrapState.transition()
```

## File Layout (Phase B additions)

```
src/
  models/
    canonical.ts          # CanonicalNode interface + types
    classification.ts     # ChangeClass enum + ChangeClassification
  canonicalizer.ts        # Clause → CanonicalNode extraction
  warm-hasher.ts          # Warm context hash computation
  classifier.ts           # A/B/C/D change classifier
  d-rate.ts               # D-rate tracking
  bootstrap.ts            # Bootstrap state machine
  store/
    canonical-store.ts    # Canonical graph persistence
```
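
## Sketch: D-Rate Tracking

The D-Rate Tracker section gives thresholds and a rolling window but no code, so here is a minimal sketch of how those thresholds might be applied. `DRateTracker`, `record()` and `DRateStatus` are named in the data flow above. Everything else is an assumption made for illustration and is not taken from the repository: the default window size, the fields of `DRateStatus`, the `levelFor` helper, and the `WARNING` label for the band between 10% and 15%, which the thresholds leave unnamed.

```typescript
// Illustrative sketch only; not the contents of src/d-rate.ts.
type ChangeClass = 'A' | 'B' | 'C' | 'D';

// 'WARNING' is an assumed label for the 10%..15% band, which the doc does not name.
type DRateLevel = 'TARGET' | 'ACCEPTABLE' | 'WARNING' | 'ALARM';

// Assumed shape; the doc only names the DRateStatus type.
interface DRateStatus {
  rate: number;        // fraction of D classifications in the window (0..1)
  windowCount: number; // classifications currently held in the window
  level: DRateLevel;
}

// Map a D-rate to the thresholds listed in the D-Rate Tracker section.
function levelFor(rate: number): DRateLevel {
  if (rate <= 0.05) return 'TARGET';     // Target: <=5%
  if (rate <= 0.10) return 'ACCEPTABLE'; // Acceptable: <=10%
  if (rate <= 0.15) return 'WARNING';    // assumption: unnamed band
  return 'ALARM';                        // Alarm: >15%
}

class DRateTracker {
  private window: ChangeClass[] = [];
  private readonly windowSize: number;

  // The default of 100 is an assumption; the doc does not give a window size.
  constructor(windowSize: number = 100) {
    this.windowSize = windowSize;
  }

  // Record one classification, evicting the oldest once the window is full.
  record(cls: ChangeClass): DRateStatus {
    this.window.push(cls);
    if (this.window.length > this.windowSize) {
      this.window.shift();
    }
    return this.status();
  }

  // Compute the current D-rate over whatever is in the window.
  status(): DRateStatus {
    const n = this.window.length;
    const d = this.window.filter((c) => c === 'D').length;
    const rate = n === 0 ? 0 : d / n;
    return { rate, windowCount: n, level: levelFor(rate) };
  }
}
```

A bootstrap state machine could consume the returned `level` to decide when the D-rate has settled below the acceptable threshold. How "stabilizes" is measured, for example how many consecutive windows must qualify, is left open here because the document does not define it.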