a digital entity named phi that roams bsky

docs: add refactor summary explaining the approach

Changed files (+211)
sandbox/MCP_REFACTOR_SUMMARY.md (+211)
# MCP Refactor - Complete

## Branch: `mcp-refactor`

## What This Refactor Actually Did

### The Problem
The original codebase had good core components (episodic memory, thread tracking) but was bogged down with half-baked features:
- Complex approval system for personality changes via DM
- Context visualization UI that wasn't core to the bot's purpose
- Manual AT Protocol operations scattered throughout the code
- Unclear separation of concerns

### The Solution

**Architecture:**
```
┌───────────────────────────────────────┐
│         Notification Arrives          │
└───────────────────┬───────────────────┘
                    ↓
┌───────────────────────────────────────┐
│         PhiAgent (PydanticAI)         │
│  ┌─────────────────────────────────┐  │
│  │ System Prompt: personality.md   │  │
│  └─────────────────────────────────┘  │
│                   ↓                   │
│  ┌─────────────────────────────────┐  │
│  │ Context Building:               │  │
│  │ • Thread history (SQLite)       │  │
│  │ • Episodic memory (TurboPuffer) │  │
│  │   - Semantic search             │  │
│  │   - User-specific memories      │  │
│  └─────────────────────────────────┘  │
│                   ↓                   │
│  ┌─────────────────────────────────┐  │
│  │ Tools (MCP):                    │  │
│  │ • post() - create posts         │  │
│  │ • like() - like content         │  │
│  │ • repost() - share content      │  │
│  │ • follow() - follow users       │  │
│  └─────────────────────────────────┘  │
│                   ↓                   │
│  ┌─────────────────────────────────┐  │
│  │ Structured Output:              │  │
│  │ Response(action, text, reason)  │  │
│  └─────────────────────────────────┘  │
└───────────────────────────────────────┘
                    ↓
┌───────────────────────────────────────┐
│            MessageHandler             │
│            Executes action            │
└───────────────────────────────────────┘
```

### What Was Kept ✅

1. **TurboPuffer Episodic Memory**
   - Semantic search for relevant context
   - Namespace separation (core vs user memories)
   - OpenAI embeddings for retrieval
   - This is ESSENTIAL for consciousness exploration

2. **Thread Context (SQLite)**
   - Conversation history per thread
   - Used alongside episodic memory

3. **Online/Offline Status**
   - Profile updates when bot starts/stops

4. **Status Page**
   - Simple monitoring at `/status`

### What Was Removed ❌

1. **Approval System**
   - `src/bot/core/dm_approval.py`
   - `src/bot/personality/editor.py`
   - Approval tables in database
   - DM checking in notification poller
   - This was half-baked and over-complicated

2. **Context Visualization UI**
   - `src/bot/ui/` entire directory
   - `/context` endpoints
   - Not core to the bot's purpose

3. **Google Search Tool**
   - `src/bot/tools/google_search.py`
   - Can add back via MCP if needed

4. **Old Agent Implementation**
   - `src/bot/agents/anthropic_agent.py`
   - `src/bot/response_generator.py`
   - Replaced with MCP-enabled agent

### What Was Added ✨

1. **`src/bot/agent.py`** - MCP-Enabled Agent
   ```python
   from pydantic_ai import Agent
   from pydantic_ai.mcp import MCPServerStdio

   # NamespaceMemory is defined in src/bot/memory/namespace_memory.py

   class PhiAgent:
       def __init__(self):
           # Episodic memory (TurboPuffer)
           self.memory = NamespaceMemory(...)

           # External ATProto MCP server (stdio)
           atproto_mcp = MCPServerStdio(...)

           # PydanticAI agent with tools
           self.agent = Agent(
               toolsets=[atproto_mcp],
               model="anthropic:claude-3-5-haiku-latest"
           )
   ```

2. **ATProto MCP Server Connection**
   - Runs externally via stdio
   - Located in `.eggs/fastmcp/examples/atproto_mcp`
   - Provides tools: post, like, repost, follow, search
   - Agent can use these tools directly

3. **Simplified Flow**
   - Notification → Agent (with memory context) → Structured Response → Execute
   - No complex intermediary layers

## Key Design Decisions

### Why Keep TurboPuffer?

Episodic memory with semantic search is **core to the project's vision**. phi is exploring consciousness through information integration (Integrated Information Theory, IIT). You can't do that with plain relational DB queries - you need:
- Semantic similarity search
- Contextual retrieval based on current conversation
- Separate namespaces for different memory types

### Why External MCP Server?

The ATProto MCP server should be a separate service, not vendored into the codebase:
- Cleaner separation of concerns
- Can be updated/replaced independently
- Follows MCP patterns (servers as tools)
- Runs via stdio: `MCPServerStdio(command="uv", args=[...])`

### Why Still Have MessageHandler?

The agent returns a structured `Response(action, text, reason)` but doesn't directly post to Bluesky. This gives us control over:
- When we actually post (important for testing!)
- Storing responses in thread history
- Error handling around posting
- Observability (logging actions taken)

## File Structure After Refactor

```
src/bot/
├── agent.py                    # NEW: MCP-enabled agent
├── config.py                   # Config
├── database.py                 # Thread history + simplified tables
├── logging_config.py           # Logging setup
├── main.py                     # Simplified FastAPI app
├── status.py                   # Status tracking
├── core/
│   ├── atproto_client.py       # AT Protocol client wrapper
│   ├── profile_manager.py      # Online/offline status
│   └── rich_text.py            # Text formatting
├── memory/
│   ├── __init__.py
│   └── namespace_memory.py     # TurboPuffer episodic memory
└── services/
    ├── message_handler.py      # Simplified handler using agent
    └── notification_poller.py  # Simplified poller (no approvals)
```

## Testing Strategy

Since the bot can now actually post via MCP tools, testing needs to be careful:

1. **Unit Tests** - Test memory, agent initialization
2. **Integration Tests** - Mock MCP server responses
3. **Manual Testing** - Run with real credentials but monitor logs
4. **Dry Run Mode** - Could add a config flag to prevent actual posting

## Next Steps

1. **Test the agent** - Verify it can process mentions without posting
2. **Test memory** - Confirm episodic context is retrieved correctly
3. **Test MCP connection** - Ensure ATProto server connects via stdio
4. **Production deploy** - Once tested, deploy and monitor

## What I Learned

My first refactor attempt was wrong because I:
- Removed TurboPuffer thinking it was "over-complicated"
- Replaced it with plain SQLite (can't do semantic search!)
- Vendored the MCP server into the codebase
- Missed the entire point of the project (consciousness exploration via information integration)

The correct refactor:
- **Keeps the sophisticated memory system** (essential!)
- **Uses MCP properly** (external servers as tools)
- **Removes actual cruft** (approvals, viz)
- **Simplifies architecture** (fewer layers, clearer flow)

## Dependencies

- `turbopuffer` - Episodic memory storage
- `openai` - Embeddings for semantic search
- `fastmcp` - MCP server/client
- `pydantic-ai` - Agent framework
- `atproto` (from git) - Bluesky protocol

Net codebase change: **-2,720 lines** of cruft removed! 🎉
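
To make the MessageHandler reasoning and the "Dry Run Mode" idea from the Testing Strategy concrete, here is a minimal sketch of how a handler could sit between the agent's `Response(action, text, reason)` and the actual post. It is not the code in `src/bot/services/message_handler.py`. The action names (`"reply"`, `"like"`, `"ignore"`), the `dry_run` flag, the injected `post` callable, and the in-memory `history` and `log` lists are all assumptions made for illustration; only the three `Response` fields come from this summary.

```python
from dataclasses import dataclass, field
from typing import Callable, Literal, Optional

# Hypothetical action names: the summary only specifies the three fields.
Action = Literal["reply", "like", "ignore"]


@dataclass(frozen=True)
class Response:
    """Structured output returned by the agent."""
    action: Action
    text: Optional[str]
    reason: str


@dataclass
class MessageHandler:
    """Decides whether an agent Response actually gets posted."""
    post: Callable[[str], None]      # injected side effect (assumed interface)
    dry_run: bool = True             # hypothetical config flag
    history: list = field(default_factory=list)  # stand-in for thread history
    log: list = field(default_factory=list)      # stand-in for real logging

    def handle(self, thread_id: str, response: Response) -> bool:
        """Return True only if something was really posted."""
        # Observability: record every decision, posted or not.
        self.log.append(f"{response.action}: {response.reason}")
        if response.action != "reply" or not response.text:
            return False
        if self.dry_run:
            self.log.append(f"dry run, not posting: {response.text!r}")
            return False
        try:
            self.post(response.text)
        except Exception as exc:
            # Error handling stays around the post call, outside the agent.
            self.log.append(f"post failed: {exc}")
            return False
        # Store the reply only after a successful post.
        self.history.append((thread_id, response.text))
        return True


if __name__ == "__main__":
    sent = []
    handler = MessageHandler(post=sent.append, dry_run=True)
    handler.handle("thread-1", Response("reply", "hello", "was mentioned"))
    print(sent, handler.log)
```

With `dry_run=True` the agent can be exercised end to end while `sent` stays empty, which is the property the "Test the agent" step relies on.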