Reference implementation for the Phoenix Architecture. Work in progress. aicoding.leaflet.pub/
# Phoenix: Version Control for Intent, Not Diffs

**TL;DR:** We built a version control system that operates on *what you mean*, not *what changed in the file*. Change one line in your spec and Phoenix knows exactly which requirements are affected, which code needs regeneration, and which downstream modules need re-validation, without touching anything else.

[GitHub](https://github.com/phoenix-vcs/phoenix) | [Demo](#running-the-demo) | [Docs](docs/MANUAL.md)

---

## The Problem

Every version control system since diff was invented in 1974 operates on the same primitive: **line-level text changes**. Git is brilliant at tracking *what* changed. It has no idea *why* it matters.

When you change "bcrypt" to "argon2id" on line 10 of your auth spec, git sees one modified line. But that one line:

- Changes a **security requirement** (passwords must be hashed with argon2id)
- Affects the **auth module** (which implements that requirement)
- Invalidates the **generated code** (which uses bcrypt)
- Breaks a **boundary policy** (if argon2id isn't in the allowed packages list)
- Requires new **evidence** (unit tests, security review)
- Potentially **cascades** to every module that depends on the auth module

Git knows none of this. Your team discovers it through code review, broken builds, and production incidents.

## The Idea

What if version control understood **intent**?

Not "line 10 changed" but "the password hashing requirement changed, which affects AuthIU, which needs new evidence, and SessionIU depends on it so re-validate that too."

Phoenix is a **causal compiler for intent**. It compiles:

```
Spec Line → Clause → Canonical Requirement → Implementation Unit → Generated Code → Evidence → Policy Decision
```

Every arrow is a traceable provenance edge. Every transformation is content-addressed and deterministic.

## How It Works (5 Minutes)

### 1. You write a Markdown spec

```markdown
## Requirements

- Users must authenticate with email and password
- Sessions expire after 24 hours
- Passwords must be hashed with bcrypt (cost factor 12)

## Security Constraints

- All endpoints must use HTTPS
- Tokens must be signed with RS256
```

### 2. Phoenix parses it into clauses

Each heading + body becomes a **clause**, the atomic unit of your spec. Clauses are normalized (lowercased, list items sorted, formatting stripped) and SHA-256 hashed. This means:

- Reordering bullet points doesn't change the hash
- Adding bold markers doesn't change the hash
- Actual semantic changes always change the hash

### 3. Phoenix extracts canonical requirements

Pattern matching identifies structured requirements:

```
REQUIREMENT: "users must authenticate with email and password"
REQUIREMENT: "passwords must be hashed with bcrypt (cost factor 12)"
CONSTRAINT: "all endpoints must use HTTPS"
```

Nodes are linked by shared terms. "Passwords must be hashed" links to "Password reset tokens expire" through the shared term "password."

### 4. Phoenix maps requirements to code

Canonical nodes are grouped into **Implementation Units**: stable compilation boundaries with contracts, boundary policies, and evidence requirements.

```
RequirementsIU (high risk)
  → 8 canonical nodes
  → output: src/generated/requirementsiu.ts
  → evidence required: typecheck, lint, boundary, unit tests, property tests, static analysis
```

### 5. Phoenix generates code and tracks it

The regeneration engine produces code stubs (or, in production, invokes an LLM). Every generated file is content-hashed into a **manifest**.

Edit a generated file without permission? **Drift detected.** Phoenix blocks acceptance until you label the edit as a promoted requirement, a signed waiver, or a temporary patch.

### 6. Phoenix enforces boundaries

Each IU declares what it's allowed to import and what side channels (databases, APIs, env vars) it may use:

```
import axios from 'axios';        → ERROR: forbidden package
process.env.UNDECLARED_SECRET;    → WARNING: undeclared config
```

### 7. Phoenix propagates failures

When the auth module fails its type check, Phoenix doesn't just flag the auth module. It walks the dependency graph and marks every dependent module for re-validation:

```
AuthIU [FAIL: typecheck] → BLOCK
  └─ SessionIU → RE_VALIDATE
  └─ ApiIU → RE_VALIDATE
```

## The Key Insight: Selective Invalidation

Change "bcrypt" to "argon2id" on one line. Phoenix:

1. Detects the **Requirements clause** was modified
2. Identifies **4 canonical nodes** affected
3. Classifies this as a **C: Contextual Shift** (90% confidence, 9 canon nodes impacted)
4. Marks the **RequirementsIU** for regeneration
5. Checks the **boundary policy** (is argon2id in allowed_packages?)
6. Invalidates **evidence** (tests need re-running)
7. Cascades to **dependent IUs**

Everything not in that subtree? Untouched. The login endpoint clause is UNCHANGED (class A). The logout clause is UNCHANGED. Only the affected subtree is reprocessed.

This is the difference between "rebuild the world" and "rebuild what matters."

## What We Built

Phoenix is implemented in TypeScript with zero runtime dependencies beyond Node.js crypto. The codebase is ~3,000 lines of source across 30 modules, covered by 200+ tests.
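
To make the two mechanisms behind selective invalidation concrete (the normalized clause hash from step 2 and the dependency walk from step 7), here is a minimal sketch using only Node.js `crypto`. It is not taken from the Phoenix source: the `Clause` shape, the function names `normalizeClause`, `hashClause`, and `dependentsOf`, the exact normalization rules, and the example dependency graph are all assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape for a parsed clause; the real Phoenix types may differ.
interface Clause {
  heading: string;
  body: string;
}

// Normalize a clause body: strip bold markers, collapse whitespace,
// lowercase, and sort "- " list items so that ordering does not matter.
function normalizeClause(body: string): string {
  const lines = body
    .split("\n")
    .map((line) =>
      line.replace(/\*\*|__/g, "").replace(/\s+/g, " ").trim().toLowerCase()
    )
    .filter((line) => line.length > 0);
  const prose = lines.filter((line) => !line.startsWith("- "));
  const items = lines.filter((line) => line.startsWith("- ")).sort();
  return [...prose, ...items].join("\n");
}

// SHA-256 over the normalized heading and body, hex-encoded.
function hashClause(clause: Clause): string {
  const canonical =
    clause.heading.trim().toLowerCase() + "\n" + normalizeClause(clause.body);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Breadth-first walk over reverse dependencies. `dependsOn` maps each
// unit to the units it depends on; the result is every unit that
// directly or transitively depends on `failed` (excluding `failed`).
function dependentsOf(
  failed: string,
  dependsOn: Record<string, string[]>
): string[] {
  const seen = new Set<string>([failed]);
  const queue: string[] = [failed];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const [unit, deps] of Object.entries(dependsOn)) {
      if (deps.includes(current) && !seen.has(unit)) {
        seen.add(unit);
        queue.push(unit);
      }
    }
  }
  return Array.from(seen).filter((unit) => unit !== failed);
}

// Example data (illustrative only).
const v1: Clause = {
  heading: "Requirements",
  body: "- Sessions expire after 24 hours\n- Passwords must be hashed with bcrypt (cost factor 12)",
};
const reordered: Clause = {
  heading: "Requirements",
  body: "- **Passwords** must be hashed with bcrypt (cost factor 12)\n- Sessions expire after 24 hours",
};
const v2: Clause = {
  heading: "Requirements",
  body: "- Sessions expire after 24 hours\n- Passwords must be hashed with argon2id",
};

console.log(hashClause(v1) === hashClause(reordered)); // true: formatting and order ignored
console.log(hashClause(v1) === hashClause(v2)); // false: the requirement itself changed

// Hypothetical graph: SessionIU depends on AuthIU, ApiIU on SessionIU.
console.log(
  dependentsOf("AuthIU", { SessionIU: ["AuthIU"], ApiIU: ["SessionIU"], BillingIU: [] })
); // [ 'SessionIU', 'ApiIU' ]
```

Units that never reach the failed one through the graph (`BillingIU` here) are left alone, which is the "rebuild what matters" behavior described above. A real implementation would presumably precompute a reverse index rather than scan every entry per step; the linear scan just keeps the sketch short.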

### Architecture

| Phase | What it does |
|-------|-------------|
| **A** | Clause extraction + semantic hashing |
| **B** | Canonicalization + warm hashing + A/B/C/D classifier |
| **C1** | IU planning + code generation + manifest + drift detection |
| **C2** | Boundary validation (architectural linter) |
| **D** | Evidence + policy engine + cascading failures |
| **E** | Shadow pipeline upgrades + storage compaction |
| **F** | Bot interface (SpecBot, ImplBot, PolicyBot) |

### The Trust Dashboard

Everything feeds into `phoenix status`:

```
phoenix status            STEADY_STATE | spec/auth.md v1 → v2

Classification Summary    A:3 B:1 C:4 D:0 │ D-Rate: 0.0% TARGET

Canonical Graph           8 → 10 nodes │ +6 new -4 removed 4 kept

Implementation Units      1 IU │ 1 generated files

Drift                     1 DRIFTED │ 0 clean 1 drifted

Boundary                  2 errors 2 warnings

Actions Required:
  ERROR drift     requirementsiu.ts  Drifted → label or reconcile
  ERROR boundary  axios              Forbidden package → remove import
  WARN  boundary  STRIPE_API_KEY     Undeclared config → declare or remove
```

If this dashboard is trusted, Phoenix becomes the coordination substrate for your entire development process.

If it's noisy or wrong, the system dies.

**Trust > Cleverness.**

## Running the Demo

```bash
git clone https://github.com/phoenix-vcs/phoenix
cd phoenix
npm install
npx tsx demo.ts
```

The demo is a 23-step walkthrough that shows every file, every data structure, and every transformation.
You'll see:

- Your spec file color-coded by clause
- Raw vs normalized text with proof that formatting doesn't affect hashes
- Full JSON clause objects with content-addressed IDs
- Canonical requirement extraction with provenance chains
- Cold vs warm hash comparison
- Bootstrap state machine transitions
- Side-by-side spec v1 → v2 with clause-level diffs
- A/B/C/D classification with signal breakdowns
- Generated TypeScript code with manifest entries
- Drift detection catching a simulated manual edit
- Boundary validation catching forbidden imports and undeclared side channels
- Evidence evaluation lifecycle (INCOMPLETE → PASS → FAIL)
- Cascading failures through a dependency graph
- Shadow pipeline upgrade classification
- Storage compaction preserving critical data
- Bot command parsing with confirmation model

## What's Next

Phoenix is alpha. The canonicalization engine uses rule-based pattern matching. In production, this would be a versioned LLM pipeline, which is why the shadow pipeline upgrade mechanism exists from day one.

The code generator produces stubs. In production, this would invoke an LLM with structured promptpacks, using the IU contract, canonical requirements, and boundary policy as context.

We're looking for:
- **Early adopters** willing to try Phoenix on greenfield TypeScript projects
- **Contributors** interested in the canonicalization pipeline, boundary validation, and evidence engine
- **Feedback** on the trust model: does `phoenix status` give you the confidence to rely on it?

## FAQ

**Q: Is this "AI that writes code"?**
No. Phoenix is a *causal compiler for intent*. The code generation is one step in a pipeline that starts with structured specs and ends with provenance-tracked, boundary-validated, evidence-certified modules. The AI is a tool; the system is the value.

**Q: Does it work with existing codebases?**
v1 is greenfield-first. Brownfield progressive wrapping is designed (wrap existing module → define boundary → write minimal spec → enforce without full regen) but not the primary path.

**Q: How is this different from Copilot/Cursor/etc?**
Those tools help you write code faster. Phoenix ensures the code you write (or generate) is **traceable to requirements, boundary-validated, evidence-certified, and selectively invalidated when specs change.** They're complementary: you could use Copilot inside Phoenix's regeneration engine.

**Q: What if the classifier is wrong?**
That's what the D-rate is for. If uncertain classifications exceed 15%, Phoenix raises an alarm, increases override friction, and surfaces the issue in status. The system is designed to degrade gracefully, not silently.

**Q: Why TypeScript?**
It's the reference implementation. The architecture is language-agnostic: the spec graph, canonical graph, and provenance graph don't care what language the generated code is in.

---

*Phoenix VCS: Regenerative Version Control*
*[GitHub](https://github.com/phoenix-vcs/phoenix) | [Manual](docs/MANUAL.md) | [Demo: `npx tsx demo.ts`]*