+211
sandbox/MCP_REFACTOR_SUMMARY.md
# MCP Refactor - Complete

## Branch: `mcp-refactor`

## What This Refactor Actually Did

### The Problem

The original codebase had good core components (episodic memory, thread tracking) but was bogged down with half-baked features:
- Complex approval system for personality changes via DM
- Context visualization UI that wasn't core to the bot's purpose
- Manual AT Protocol operations scattered throughout the code
- Unclear separation of concerns

### The Solution

**Architecture:**
```
┌──────────────────────────────────────┐
│     Notification Arrives             │
└──────────────┬───────────────────────┘
               ↓
┌──────────────────────────────────────┐
│     PhiAgent (PydanticAI)            │
│  ┌────────────────────────────────┐  │
│  │ System Prompt: personality.md  │  │
│  └────────────────────────────────┘  │
│              ↓                       │
│  ┌────────────────────────────────┐  │
│  │ Context Building:              │  │
│  │ • Thread history (SQLite)      │  │
│  │ • Episodic memory (TurboPuffer)│  │
│  │   - Semantic search            │  │
│  │   - User-specific memories     │  │
│  └────────────────────────────────┘  │
│              ↓                       │
│  ┌────────────────────────────────┐  │
│  │ Tools (MCP):                   │  │
│  │ • post() - create posts        │  │
│  │ • like() - like content        │  │
│  │ • repost() - share content     │  │
│  │ • follow() - follow users      │  │
│  └────────────────────────────────┘  │
│              ↓                       │
│  ┌────────────────────────────────┐  │
│  │ Structured Output:             │  │
│  │ Response(action, text, reason) │  │
│  └────────────────────────────────┘  │
└──────────────────────────────────────┘
               ↓
┌──────────────────────────────────────┐
│     MessageHandler                   │
│     Executes action                  │
└──────────────────────────────────────┘
```
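
The "Context Building" step in the diagram could look roughly like the sketch below. It uses only the standard-library `sqlite3` module. The table name `thread_messages`, its columns, and the helpers `open_thread_db`, `add_message`, `thread_history`, and `build_context` are all hypothetical names chosen for illustration; the real `database.py` may be organized differently, and the episodic memories are passed in as plain strings rather than fetched from TurboPuffer.

```python
import sqlite3


def open_thread_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the thread-history database. Schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS thread_messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               thread_uri TEXT NOT NULL,
               author TEXT NOT NULL,
               text TEXT NOT NULL
           )"""
    )
    return conn


def add_message(conn: sqlite3.Connection, thread_uri: str, author: str, text: str) -> None:
    """Append one message to a thread's history."""
    conn.execute(
        "INSERT INTO thread_messages (thread_uri, author, text) VALUES (?, ?, ?)",
        (thread_uri, author, text),
    )
    conn.commit()


def thread_history(conn: sqlite3.Connection, thread_uri: str, limit: int = 20) -> list[str]:
    """Return the most recent `limit` messages of a thread, oldest first."""
    rows = conn.execute(
        "SELECT author, text FROM ("
        "  SELECT id, author, text FROM thread_messages"
        "  WHERE thread_uri = ? ORDER BY id DESC LIMIT ?"
        ") ORDER BY id ASC",
        (thread_uri, limit),
    ).fetchall()
    return [f"@{author}: {text}" for author, text in rows]


def build_context(history: list[str], memories: list[str]) -> str:
    """Combine retrieved memories and thread history into one prompt section."""
    parts = []
    if memories:
        parts.append("Relevant memories:\n" + "\n".join(f"- {m}" for m in memories))
    if history:
        parts.append("Thread so far:\n" + "\n".join(history))
    return "\n\n".join(parts)
```

Keeping thread history in SQLite and memories in a separate store means each can be queried in the way that suits it: ordered lookup by thread for the former, similarity search for the latter.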

### What Was Kept ✅

1. **TurboPuffer Episodic Memory**
   - Semantic search for relevant context
   - Namespace separation (core vs user memories)
   - OpenAI embeddings for retrieval
   - This is ESSENTIAL for consciousness exploration

2. **Thread Context (SQLite)**
   - Conversation history per thread
   - Used alongside episodic memory

3. **Online/Offline Status**
   - Profile updates when bot starts/stops

4. **Status Page**
   - Simple monitoring at `/status`

### What Was Removed ❌

1. **Approval System**
   - `src/bot/core/dm_approval.py`
   - `src/bot/personality/editor.py`
   - Approval tables in database
   - DM checking in notification poller
   - This was half-baked and over-complicated

2. **Context Visualization UI**
   - `src/bot/ui/` entire directory
   - `/context` endpoints
   - Not core to the bot's purpose

3. **Google Search Tool**
   - `src/bot/tools/google_search.py`
   - Can be added back via MCP if needed

4. **Old Agent Implementation**
   - `src/bot/agents/anthropic_agent.py`
   - `src/bot/response_generator.py`
   - Replaced with MCP-enabled agent

### What Was Added ✨

1. **`src/bot/agent.py`** - MCP-Enabled Agent
   ```python
   from pydantic_ai import Agent
   from pydantic_ai.mcp import MCPServerStdio

   # NamespaceMemory lives in src/bot/memory/namespace_memory.py

   class PhiAgent:
       def __init__(self):
           # Episodic memory (TurboPuffer)
           self.memory = NamespaceMemory(...)

           # External ATProto MCP server (stdio)
           atproto_mcp = MCPServerStdio(...)

           # PydanticAI agent with tools
           self.agent = Agent(
               toolsets=[atproto_mcp],
               model="anthropic:claude-3-5-haiku-latest",
           )
   ```

2. **ATProto MCP Server Connection**
   - Runs externally via stdio
   - Located in `.eggs/fastmcp/examples/atproto_mcp`
   - Provides tools: post, like, repost, follow, search
   - Agent can use these tools directly

3. **Simplified Flow**
   - Notification → Agent (with memory context) → Structured Response → Execute
   - No complex intermediary layers

## Key Design Decisions

### Why Keep TurboPuffer?

Episodic memory with semantic search is **core to the project's vision**. phi is exploring consciousness through information integration (IIT, Integrated Information Theory). You can't do that with plain relational DB queries - you need:
- Semantic similarity search
- Contextual retrieval based on current conversation
- Separate namespaces for different memory types
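
To make the three requirements above concrete, here is a toy, in-memory sketch of namespaced similarity search. It is not the TurboPuffer API and it does not call OpenAI: `ToyNamespaceMemory`, `cosine`, and the hand-written two-dimensional vectors are stand-ins assumed purely for illustration. In the real bot the vectors would come from an embedding model and the store would be TurboPuffer.

```python
import math
from collections import defaultdict


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyNamespaceMemory:
    """In-memory stand-in for a vector store with separate namespaces."""

    def __init__(self) -> None:
        # namespace -> list of (vector, text) pairs
        self._store: dict[str, list[tuple[list[float], str]]] = defaultdict(list)

    def store(self, namespace: str, vector: list[float], text: str) -> None:
        self._store[namespace].append((vector, text))

    def search(self, namespace: str, query: list[float], top_k: int = 2) -> list[str]:
        """Return the top_k texts in one namespace, most similar to the query first."""
        items = self._store.get(namespace, [])
        ranked = sorted(items, key=lambda item: cosine(query, item[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]


memory = ToyNamespaceMemory()
memory.store("core", [1.0, 0.0], "phi explores consciousness")
memory.store("user:alice", [0.9, 0.1], "alice asked about IIT")
memory.store("user:alice", [0.0, 1.0], "alice likes cats")

# A query vector close to the "IIT" memory retrieves it ahead of the unrelated one.
closest = memory.search("user:alice", [1.0, 0.0], top_k=1)
```

A relational `WHERE` clause can match on exact values, but ranking by closeness in embedding space is what lets retrieval follow the meaning of the current conversation.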

### Why External MCP Server?

The ATProto MCP server should be a separate service, not vendored into the codebase:
- Cleaner separation of concerns
- Can be updated/replaced independently
- Follows MCP patterns (servers as tools)
- Runs via stdio: `MCPServerStdio(command="uv", args=[...])`

### Why Still Have MessageHandler?

The agent returns a structured `Response(action, text, reason)` but doesn't directly post to Bluesky. This gives us control over:
- When we actually post (important for testing!)
- Storing responses in thread history
- Error handling around posting
- Observability (logging actions taken)
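
A minimal sketch of that split, assuming a plain dataclass for `Response` and a handler that receives its posting function by injection. The action values `"reply"` and `"ignore"`, the `post_reply` callable, and the in-memory `history` list are assumptions made for illustration; the actual `message_handler.py` may use different names and a PydanticAI output model.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Optional

logger = logging.getLogger("phi.handler")


@dataclass
class Response:
    action: str                   # assumed values: "reply" or "ignore"
    text: Optional[str] = None    # post body when action == "reply"
    reason: Optional[str] = None  # the agent's explanation, kept for logs


@dataclass
class MessageHandler:
    # (thread_uri, text) -> None; injected so tests can substitute a fake poster
    post_reply: Callable[[str, str], None]
    history: list = field(default_factory=list)

    def execute(self, thread_uri: str, response: Response) -> bool:
        """Carry out the agent's decision; return True only if a post was made."""
        if response.action != "reply" or not response.text:
            logger.info("no post for %s: %s", thread_uri, response.reason)
            return False
        try:
            self.post_reply(thread_uri, response.text)
        except Exception:
            # Error handling around posting: log and carry on with the next notification.
            logger.exception("posting failed for %s", thread_uri)
            return False
        # Record the reply only after it was actually posted.
        self.history.append((thread_uri, response.text))
        logger.info("replied in %s", thread_uri)
        return True
```

Because the poster is a parameter, a test can pass a function that appends to a list and assert on what would have been posted without touching the network.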

## File Structure After Refactor

```
src/bot/
├── agent.py                    # NEW: MCP-enabled agent
├── config.py                   # Config
├── database.py                 # Thread history + simplified tables
├── logging_config.py           # Logging setup
├── main.py                     # Simplified FastAPI app
├── status.py                   # Status tracking
├── core/
│   ├── atproto_client.py       # AT Protocol client wrapper
│   ├── profile_manager.py      # Online/offline status
│   └── rich_text.py            # Text formatting
├── memory/
│   ├── __init__.py
│   └── namespace_memory.py     # TurboPuffer episodic memory
└── services/
    ├── message_handler.py      # Simplified handler using agent
    └── notification_poller.py  # Simplified poller (no approvals)
```

## Testing Strategy

Because the bot can now actually post via MCP tools, tests must take care not to post by accident:

1. **Unit Tests** - Test memory, agent initialization
2. **Integration Tests** - Mock MCP server responses
3. **Manual Testing** - Run with real credentials but monitor logs
4. **Dry Run Mode** - Could add a config flag to prevent actual posting
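
One possible shape for the dry-run flag from item 4 is sketched below. The environment variable name `PHI_DRY_RUN` and the helpers `is_dry_run` and `guard_posting` are hypothetical; nothing like them exists in the codebase yet. The guard only covers posting that goes through a Python callable, so tool calls the agent makes directly against the MCP server would need their own safeguard.

```python
import os
from typing import Callable, Mapping, Optional


def is_dry_run(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read the (hypothetical) PHI_DRY_RUN flag; unset or unrecognized means live."""
    env = os.environ if env is None else env
    return env.get("PHI_DRY_RUN", "").strip().lower() in {"1", "true", "yes"}


def guard_posting(
    post_fn: Callable[[str, str], None],
    dry_run: bool,
    log: Callable[[str], None] = print,
) -> Callable[[str, str], None]:
    """Wrap a posting function so that dry-run mode logs instead of posting."""

    def guarded(thread_uri: str, text: str) -> None:
        if dry_run:
            log(f"[dry-run] would post to {thread_uri}: {text}")
            return
        post_fn(thread_uri, text)

    return guarded
```

Manual testing with real credentials could then start with the flag set, check the logged output, and only afterwards run with posting enabled.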

## Next Steps

1. **Test the agent** - Verify it can process mentions without posting
2. **Test memory** - Confirm episodic context is retrieved correctly
3. **Test MCP connection** - Ensure ATProto server connects via stdio
4. **Production deploy** - Once tested, deploy and monitor

## What I Learned

My first refactor attempt was wrong because I:
- Removed TurboPuffer, thinking it was "over-complicated"
- Replaced it with plain SQLite (can't do semantic search!)
- Vendored the MCP server into the codebase
- Missed the entire point of the project (consciousness exploration via information integration)

The correct refactor:
- **Keeps the sophisticated memory system** (essential!)
- **Uses MCP properly** (external servers as tools)
- **Removes actual cruft** (approvals, viz)
- **Simplifies architecture** (fewer layers, clearer flow)

## Dependencies

- `turbopuffer` - Episodic memory storage
- `openai` - Embeddings for semantic search
- `fastmcp` - MCP server/client
- `pydantic-ai` - Agent framework
- `atproto` (from git) - Bluesky protocol

Total codebase reduction: **-2,720 lines** of cruft removed! 🎉