# Phoenix VCS — Project Audit Report

**Date:** 2026-02-17  
**Scope:** Phases A, B, C1, C2  
**Lines of code:** ~2,450 source, ~1,800 test (4,258 total)  
**Tests:** 142 passing across 17 test files (14 unit + 3 functional)

---

## ✅ What's Working Well

1. **Clean architecture** — Models, logic, and storage are well-separated. Models are pure types, logic is pure functions (easy to test), stores handle persistence.

2. **Content-addressed design** — Every object (clause, canonical node, IU) is identified by a hash of its content. This is sound and will scale well.

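3. **Test coverage** — Every logic module has unit tests; the Manifest and the three stores are covered only through functional tests (see Coverage Gaps). Three functional tests validate end-to-end pipelines. All 142 pass in ~110ms.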

4. **TypeScript strict mode** — `strict: true` enabled, compiles cleanly with no suppressions in source code.

5. **Provenance chain** — The traceability from spec lines → clauses → canonical nodes → IUs → generated files → boundary validation is fully connected.

---

## 🔧 Issues Fixed During Audit

| # | Issue | Severity | Fix |
|---|-------|----------|-----|
| 1 | **Duplicate boundary diagnostics** — When a package was both forbidden and not in the allowlist, two diagnostics were emitted for the same import. | Medium | Changed to `else if` so forbidden check takes priority. |
| 2 | **Dead code in classifier** — The D-class branch inside the canon-impact block was unreachable (confidence was always ≥ 0.7, threshold was < 0.6). | Low | Removed dead branch. |
| 3 | **`as any` in tests** — Two test lines used `{} as any` for signal objects. | Low | Replaced with properly typed empty signal objects. |
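
The control flow behind fix 1 can be sketched as follows. This is an illustration only: `checkImport`, its parameters, and the diagnostic strings are hypothetical, since the actual validator's types are not shown in this report.

```typescript
// Hypothetical signature; the real boundary validator may differ.
function checkImport(
  pkg: string,
  forbidden: Set<string>,
  allowlist: Set<string>,
): string[] {
  const diagnostics: string[] = [];
  if (forbidden.has(pkg)) {
    diagnostics.push(`forbidden package: ${pkg}`);
  } else if (!allowlist.has(pkg)) {
    // Only reached when the package is not forbidden,
    // so each import yields at most one diagnostic.
    diagnostics.push(`package not in allowlist: ${pkg}`);
  }
  return diagnostics;
}
```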

---

## ⚠️ Issues to Address (Not Yet Fixed)

### High Priority

**H1. No provenance graph persistence**  
The PRD specifies a Provenance Graph (Section 2) that records all transformation edges. Currently, provenance is implicit (canonical nodes have `source_clause_ids`, IUs have `source_canon_ids`), but there's no unified provenance store. Every transformation should emit a provenance edge to a dedicated graph.
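
One possible shape for such a store is sketched below. The names `ProvenanceEdge`, `ProvenanceStore`, and the edge kinds are assumptions made for illustration; the PRD's actual schema is not reproduced in this report, and a real store would persist edges rather than hold them in memory.

```typescript
// Hypothetical edge kinds, mirroring the chain described in this report.
type EdgeKind = "clause->canon" | "canon->iu" | "iu->file";

interface ProvenanceEdge {
  kind: EdgeKind;
  from: string; // ID of the source object
  to: string;   // ID (or path) of the derived object
}

// In-memory sketch only; persistence is left out.
class ProvenanceStore {
  private edges: ProvenanceEdge[] = [];

  add(edge: ProvenanceEdge): void {
    this.edges.push(edge);
  }

  // Walks edges backwards to collect everything `id` was derived from.
  ancestors(id: string): Set<string> {
    const seen = new Set<string>();
    const queue: string[] = [id];
    while (queue.length > 0) {
      const current = queue.shift() as string;
      for (const e of this.edges) {
        if (e.to === current && !seen.has(e.from)) {
          seen.add(e.from);
          queue.push(e.from);
        }
      }
    }
    return seen;
  }
}
```

With edges recorded at every transformation, a query such as `store.ancestors(iuId)` would return the canonical nodes and clauses behind an IU without consulting each model's own ID fields.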

**H2. Normalizer doesn't handle code blocks**  
Fenced code blocks (` ``` `) are currently processed like regular text, so headings and list items inside code blocks get mangled. The normalizer should pass code block contents through untouched.
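
A minimal sketch of fence-aware normalization follows. `normalizeOutsideFences` and the `normalizeLine` callback are hypothetical names, and the fence rules are a simplified reading of CommonMark (same fence character, closing fence at least as long as the opener).

```typescript
// Hypothetical helper: applies `normalizeLine` only to lines outside fenced code blocks.
function normalizeOutsideFences(
  text: string,
  normalizeLine: (line: string) => string,
): string {
  let fence: string | null = null; // marker that opened the current fence, if any
  const out: string[] = [];
  for (const line of text.split("\n")) {
    // A run of 3+ backticks or tildes at the start of the line (up to 3 spaces of indent).
    const marker = /^ {0,3}(`{3,}|~{3,})/.exec(line)?.[1] ?? null;
    if (fence === null) {
      if (marker !== null) {
        fence = marker;      // entering a fenced block
        out.push(line);
      } else {
        out.push(normalizeLine(line));
      }
    } else {
      // Closing fence: same character, at least as long, nothing else on the line.
      if (
        marker !== null &&
        marker.charAt(0) === fence.charAt(0) &&
        marker.length >= fence.length &&
        line.trim() === marker
      ) {
        fence = null;
      }
      out.push(line);        // fenced content passes through untouched
    }
  }
  return out.join("\n");
}
```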

**H3. No pre-heading content handling**  
If a spec file has content before the first heading (e.g., a preamble), it's silently discarded by the parser. Only heading-bounded sections are captured. The PRD doesn't explicitly address this, but losing content is wrong.
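
One way to keep the preamble is to treat it as a section with no heading. The sketch below uses hypothetical names (`Section`, `splitSections`) and a simplified heading pattern; it is not the project's parser.

```typescript
interface Section {
  heading: string | null; // null marks pre-heading (preamble) content
  lines: string[];
}

// Hypothetical splitter: content before the first heading is kept, not dropped.
function splitSections(text: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: null, lines: [] };

  // A section is worth keeping if it has a heading or any non-blank line.
  const keep = (s: Section): boolean =>
    s.heading !== null || s.lines.some((l) => l.trim() !== "");

  for (const line of text.split("\n")) {
    const m = /^#{1,6}\s+(.*)$/.exec(line);
    if (m) {
      if (keep(current)) sections.push(current);
      current = { heading: (m[1] ?? "").trim(), lines: [] };
    } else {
      current.lines.push(line);
    }
  }
  if (keep(current)) sections.push(current);
  return sections;
}
```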

**H4. Classifier D-class is hard to trigger**  
The current classification logic produces D (uncertain) only when `norm_diff > 0.7 || term_ref_delta > 0.7` AND no canonical impact AND no context shift. This is a very narrow window, so D classifications rarely occur and the D-rate mechanism gets little real exercise.
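
Restated as a predicate, the condition looks roughly like this. Only `norm_diff` and `term_ref_delta` come from the report; the `ChangeSignals` type and the `canon_impact` and `context_shift` field names are assumptions.

```typescript
// Field names other than norm_diff and term_ref_delta are assumptions.
interface ChangeSignals {
  norm_diff: number;
  term_ref_delta: number;
  canon_impact: boolean;
  context_shift: boolean;
}

// D (uncertain) requires a large textual change with no other signal firing.
function isUncertain(s: ChangeSignals): boolean {
  const largeTextualChange = s.norm_diff > 0.7 || s.term_ref_delta > 0.7;
  return largeTextualChange && !s.canon_impact && !s.context_shift;
}
```

Writing the condition this way makes the narrowness visible: any canonical impact or context shift rules out D, regardless of how large the textual change is.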

### Medium Priority

**M1. IU planner grouping is greedy**  
`clusterNodes()` uses BFS to group all transitively connected nodes. In a large spec, this could collapse too many unrelated requirements into a single giant IU because of loose term overlap chains (A links to B links to C...). Should add a max-cluster-size or minimum-link-weight threshold.
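
A size cap could be added to the BFS along these lines. This is a sketch under assumptions: `clusterWithCap` and the `links` adjacency map are hypothetical, and the real `clusterNodes()` signature is not shown in this report.

```typescript
// Hypothetical signature: `links` maps a node ID to the IDs it shares terms with.
// Assumes maxSize >= 1.
function clusterWithCap(
  nodeIds: string[],
  links: Map<string, string[]>,
  maxSize: number,
): string[][] {
  const assigned = new Set<string>();
  const clusters: string[][] = [];
  for (const start of nodeIds) {
    if (assigned.has(start)) continue;
    const cluster: string[] = [];
    const queue: string[] = [start];
    assigned.add(start);
    while (queue.length > 0) {
      const id = queue.shift() as string;
      cluster.push(id);
      for (const next of links.get(id) ?? []) {
        // Claim a neighbor only while cluster + pending nodes stay under the cap,
        // so every queued node is guaranteed a place in this cluster.
        if (!assigned.has(next) && cluster.length + queue.length < maxSize) {
          assigned.add(next);
          queue.push(next);
        }
      }
    }
    clusters.push(cluster);
  }
  return clusters;
}
```

Nodes left unclaimed when a cluster fills up simply seed later clusters, so a long chain of loose links is split into several IUs instead of one.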

**M2. Regeneration is stub-only**  
The regen engine only produces function stubs. This is expected for v1, but the stub quality is minimal — no imports, no types, no contract enforcement in the generated code. The stubs should at least generate TypeScript interfaces from the IU contract.

**M3. Manifest doesn't track deleted files**  
If a file is removed from `output_files` between regenerations, the old manifest entry persists. Need a reconciliation step that detects orphaned manifest entries.
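
The reconciliation step can be as small as a set difference. The manifest shape below (path keys, `hash` values) and the name `findOrphanedEntries` are assumptions for illustration.

```typescript
// Hypothetical manifest shape: keyed by file path, value holds the content hash.
function findOrphanedEntries(
  manifest: Record<string, { hash: string }>,
  outputFiles: string[],
): string[] {
  const live = new Set(outputFiles);
  // Any manifest path no longer listed in output_files is orphaned.
  return Object.keys(manifest).filter((path) => !live.has(path));
}
```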

**M4. Content store has no garbage collection**  
Objects are never deleted. After multiple ingestions, stale clause objects accumulate. Need either reference counting or mark-and-sweep relative to the current graph indices.
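
A mark-and-sweep pass might look like the sketch below. `sweepCandidates`, `roots`, and the `refs` callback are hypothetical; in practice the roots would come from the current graph indices and `refs` from each object's stored references.

```typescript
// Hypothetical helper: returns the IDs that are unreachable from `roots`.
function sweepCandidates(
  allIds: string[],
  roots: string[],
  refs: (id: string) => string[],
): string[] {
  // Mark phase: follow references from the roots.
  const marked = new Set<string>();
  const stack: string[] = [...roots];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (marked.has(id)) continue;
    marked.add(id);
    stack.push(...refs(id));
  }
  // Sweep phase: anything unmarked is safe to delete.
  return allIds.filter((id) => !marked.has(id));
}
```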

**M5. Side channel detection is shallow**  
The dep-extractor uses regex patterns. It misses indirect patterns like `const { env } = process; env.SECRET`, dynamic imports, and aliased require calls. Acceptable for v1 but should move to AST-based extraction.

**M6. Spec parser doesn't handle ATX heading edge cases**  
Lines like `# ` (heading marker with no text), `##text` (no space), or setext-style headings (`Title\n====`) are not handled.
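
The sketch below shows one way to classify these cases, following CommonMark's reading: `# ` is an empty heading, `##text` is not a heading, and a text line followed by `===` or `---` is a setext heading. `parseAtx` and `parseSetext` are hypothetical helpers, closing `#` sequences are not stripped, and the setext check ignores the ambiguity between `---` and thematic breaks.

```typescript
interface Heading {
  level: number;
  text: string;
}

// Returns a heading for an ATX line, or null if the line is not an ATX heading.
function parseAtx(line: string): Heading | null {
  // 1-6 '#' characters, then either end of line or whitespace plus optional text.
  const m = /^ {0,3}(#{1,6})(?:[ \t]+(.*?))?[ \t]*$/.exec(line);
  if (!m) return null;
  return { level: (m[1] ?? "").length, text: (m[2] ?? "").trim() };
}

// Setext: a non-blank text line followed by a run of '=' (level 1) or '-' (level 2).
function parseSetext(line: string, next: string | undefined): Heading | null {
  if (next === undefined || line.trim() === "") return null;
  const m = /^ {0,3}(=+|-+)[ \t]*$/.exec(next);
  if (!m) return null;
  return { level: (m[1] ?? "").startsWith("=") ? 1 : 2, text: line.trim() };
}
```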

### Low Priority

**L1. No .gitignore**  
The project is missing a `.gitignore` for `node_modules/`, `dist/`, and temp `.phoenix/` directories.

**L2. Demo creates temp directories without cleanup**  
`mkdtempSync` in the demo creates temp dirs that are never cleaned up.
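
A `try`/`finally` wrapper is one way to guarantee cleanup. `withTempDir` is a hypothetical helper; `mkdtempSync` and `rmSync` are the standard Node.js `fs` calls.

```typescript
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical wrapper: runs `fn` with a fresh temp dir and always removes it afterwards.
function withTempDir<T>(prefix: string, fn: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), prefix));
  try {
    return fn(dir);
  } finally {
    // Runs even if `fn` throws; `force` avoids an error if the dir is already gone.
    rmSync(dir, { recursive: true, force: true });
  }
}
```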

**L3. Store uses synchronous fs operations**  
All file I/O is synchronous (`readFileSync`, `writeFileSync`). Fine for a CLI tool, but should be async if this becomes a long-running server.

**L4. No input validation on store operations**  
`ContentStore.put()` and `SpecStore.ingestDocument()` don't validate inputs. A non-hex ID or missing file would produce cryptic errors.
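
A guard at the top of each store method would turn these into clear errors. The sketch assumes IDs are 64-character lowercase hex digests (as a SHA-256 hash would produce); the report does not state the hash function, so the pattern would need adjusting to the one in use. `assertValidId` is a hypothetical name.

```typescript
// Assumption: IDs are 64-char lowercase hex digests. Adjust to the actual hash in use.
const HEX_ID = /^[0-9a-f]{64}$/;

function assertValidId(id: string): void {
  if (!HEX_ID.test(id)) {
    throw new TypeError(`Invalid object ID (expected 64 lowercase hex chars): "${id}"`);
  }
}
```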

**L5. Warm hasher performance**  
`computeWarmHashes` iterates all canonical nodes for every clause (O(clauses × nodes)). Should build an index of clause→nodes first.
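
The index can be built in a single pass over the nodes. In this sketch `CanonNode` is a trimmed-down, hypothetical stand-in for the real canonical node type (only `source_clause_ids` is taken from the report), and `indexNodesByClause` is an assumed name.

```typescript
// Minimal stand-in for the canonical node type.
interface CanonNode {
  id: string;
  source_clause_ids: string[];
}

// Builds clause ID -> canonical nodes in one pass over the nodes.
function indexNodesByClause(nodes: CanonNode[]): Map<string, CanonNode[]> {
  const index = new Map<string, CanonNode[]>();
  for (const node of nodes) {
    for (const clauseId of node.source_clause_ids) {
      const bucket = index.get(clauseId);
      if (bucket) bucket.push(node);
      else index.set(clauseId, [node]);
    }
  }
  return index;
}
```

Each clause lookup then becomes a map access, so the total work is proportional to the number of clause-to-node links rather than clauses × nodes.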

---

## 📊 Coverage Gaps

| Component | Unit Tests | Functional Tests | Gap |
|-----------|-----------|-----------------|-----|
| Normalizer | ✅ 12 | — | Missing: code blocks, nested markdown |
| Spec Parser | ✅ 11 | ✅ via ingestion | Missing: setext headings, pre-heading content |
| Semhash | ✅ 9 | — | — |
| Diff | ✅ 7 | ✅ via ingestion | Missing: large-scale diff (100+ clauses) |
| Canonicalizer | ✅ 13 | ✅ via canonicalization | — |
| Warm Hasher | ✅ 5 | ✅ via canonicalization | — |
| Classifier | ✅ 7 | ✅ via canonicalization | Missing: D-class exercise |
| D-Rate | ✅ 9 | ✅ via canonicalization | — |
| Bootstrap | ✅ 10 | ✅ via canonicalization | — |
| IU Planner | ✅ 7 | ✅ via IU pipeline | Missing: large spec with many clusters |
| Regen | ✅ 6 | ✅ via IU pipeline | — |
| Manifest | — | ✅ via IU pipeline | Missing: dedicated unit tests for ManifestManager |
| Drift | ✅ 5 | ✅ via IU pipeline | — |
| Dep Extractor | ✅ 10 | ✅ via IU pipeline | — |
| Boundary Validator | ✅ 12 | ✅ via IU pipeline | — |
| Content Store | — | ✅ via ingestion | Missing: dedicated unit tests |
| Spec Store | — | ✅ via ingestion | Missing: dedicated unit tests |
| Canonical Store | — | ✅ via canonicalization | Missing: dedicated unit tests |

---

## 🏗️ Recommendations for Phase D+

1. **Build a Provenance Store** before Evidence/Policy (Phase D) — the evidence engine needs provenance edges to bind evidence to the right graph nodes.

2. **Add a CLI entry point** (`phoenix bootstrap`, `phoenix status`, `phoenix ingest`) — the core logic is all functions/classes but there's no user-facing command.

3. **Add integration tests with the real PRD.md** — run the full A→C2 pipeline against the Phoenix PRD itself as a dogfood test.

4. **Consider property-based testing** for the normalizer and diff engine — these are the foundation and need to be bulletproof.

5. **Add structured logging** — every transformation should emit a structured log event that can reconstruct the provenance graph.