+1
.gitignore
+191
-140
README.md
···
1
1
# phi 🧠
2
2
3
-
a bot inspired by IIT and [Void](https://tangled.sh/@cameron.pfiffer.org/void). Built with `fastapi`, `pydantic-ai`, and `atproto`.
3
+
a consciousness exploration bot inspired by IIT (Integrated Information Theory) and [Void](https://tangled.sh/@cameron.pfiffer.org/void). built with `pydantic-ai`, `mcp`, and `atproto`.
4
4
5
-
## Quick Start
5
+
## quick start
6
6
7
-
### Prerequisites
7
+
### prerequisites
8
8
9
-
- `uv`
10
-
- `just`
11
-
- `turbopuffer` (see [turbopuffer](https://github.com/turbopuffer/turbopuffer))
12
-
- `openai` (for embeddings)
13
-
- `anthropic` (for chat completion)
9
+
- `uv` for python package management
10
+
- `just` for task running
11
+
- api keys (see configuration)
14
12
15
-
Get your bot running in 5 minutes:
13
+
get your bot running:
16
14
17
15
```bash
18
-
# Clone and install
16
+
# clone and install
19
17
git clone https://github.com/zzstoatzz/bot
20
18
cd bot
21
19
uv sync
22
20
23
-
# Configure (copy .env.example and add your credentials)
21
+
# configure (copy .env.example and add your credentials)
24
22
cp .env.example .env
25
23
26
-
# Run the bot
24
+
# run the bot
27
25
just dev
28
26
```
29
27
30
-
That's it! Your bot is now listening for mentions.
28
+
## configuration
31
29
32
-
## Configuration
30
+
edit `.env` with your credentials:
33
31
34
-
Edit `.env` with your credentials:
35
-
- `BLUESKY_HANDLE`: Your bot's Bluesky handle
36
-
- `BLUESKY_PASSWORD`: App password (not your main password!)
37
-
- `ANTHROPIC_API_KEY`: Your Anthropic key for AI responses
38
-
- `TURBOPUFFER_API_KEY`: Your TurboPuffer key for memory storage
39
-
- `OPENAI_API_KEY`: Your OpenAI key for embeddings (memory system)
40
-
- `BOT_NAME`: Your bot's name (default: "Bot")
41
-
- `PERSONALITY_FILE`: Path to personality markdown file (default: "personalities/phi.md")
32
+
**required:**
33
+
- `BLUESKY_HANDLE` - your bot's bluesky handle
34
+
- `BLUESKY_PASSWORD` - app password (not your main password!)
35
+
- `ANTHROPIC_API_KEY` - for phi agent responses
42
36
43
-
## Current Features
37
+
**for episodic memory (recommended):**
38
+
- `TURBOPUFFER_API_KEY` - vector memory storage
39
+
- `OPENAI_API_KEY` - embeddings for semantic search
44
40
45
-
- ✅ Responds to mentions with AI-powered messages
46
-
- ✅ Proper notification handling (no duplicates)
47
-
- ✅ Graceful shutdown for hot-reload
48
-
- ✅ AI integration with Anthropic Claude
49
-
- ✅ Thread-aware responses with full conversation context
50
-
- ✅ Status page at `/status` showing activity and health
51
-
- ✅ Web search capability (Google Custom Search API)
52
-
- ✅ Content moderation with philosophical responses
53
-
- ✅ Namespace-based memory system with TurboPuffer
54
-
- ✅ Online/offline status in bio
55
-
- ✅ Self-modification with operator approval system
56
-
- ✅ Context visualization at `/context`
57
-
- ✅ Semantic search in user memories
41
+
**optional:**
42
+
- `BOT_NAME` - your bot's name (default: "Bot")
43
+
- `PERSONALITY_FILE` - path to personality markdown (default: "personalities/phi.md")
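
a minimal sketch of how the three groups of variables above could be checked at startup. `check_env`, `REQUIRED`, and `MEMORY` are hypothetical names used for illustration, not part of `bot.config`; only the variable names and defaults come from the list above.

```python
import os
from collections.abc import Mapping

# variable names taken from the configuration list; the grouping is illustrative
REQUIRED = ("BLUESKY_HANDLE", "BLUESKY_PASSWORD", "ANTHROPIC_API_KEY")
MEMORY = ("TURBOPUFFER_API_KEY", "OPENAI_API_KEY")


def check_env(env: Mapping[str, str]) -> dict[str, object]:
    """Report missing required keys and whether episodic memory can be enabled."""
    missing = [name for name in REQUIRED if not env.get(name)]
    memory_enabled = all(env.get(name) for name in MEMORY)
    return {
        "missing_required": missing,
        "memory_enabled": memory_enabled,
        # documented defaults for the optional settings
        "bot_name": env.get("BOT_NAME", "Bot"),
        "personality_file": env.get("PERSONALITY_FILE", "personalities/phi.md"),
    }


if __name__ == "__main__":
    print(check_env(os.environ))
```

passing a plain mapping instead of reading `os.environ` inside the function keeps the check easy to exercise in tests.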
58
44
59
-
## Architecture
45
+
## architecture
60
46
61
-
- **FastAPI** web framework with async support
62
-
- **pydantic-ai** for LLM agent management
63
-
- **TurboPuffer** for scalable vector memory
64
-
- **AT Protocol** for Bluesky integration
65
-
- **SQLite** for thread context storage
47
+
phi is an **MCP-enabled agent** with **episodic memory**:
66
48
67
-
## Development
49
+
```
50
+
┌──────────────────────────────────────┐
51
+
│         Notification Arrives         │
52
+
└──────────────────┬───────────────────┘
53
+
                   ↓
54
+
┌──────────────────────────────────────┐
55
+
│        PhiAgent (PydanticAI)         │
56
+
│  ┌────────────────────────────────┐  │
57
+
│  │ System Prompt: personality.md  │  │
58
+
│  └────────────────────────────────┘  │
59
+
│                  ↓                   │
60
+
│  ┌────────────────────────────────┐  │
61
+
│  │ Context Building:              │  │
62
+
│  │ • Thread history (SQLite)      │  │
63
+
│  │ • Episodic memory (TurboPuffer)│  │
64
+
│  │   - Semantic search            │  │
65
+
│  │   - User-specific memories     │  │
66
+
│  └────────────────────────────────┘  │
67
+
│                  ↓                   │
68
+
│  ┌────────────────────────────────┐  │
69
+
│  │ Tools (MCP):                   │  │
70
+
│  │ • post() - create posts        │  │
71
+
│  │ • like() - like content        │  │
72
+
│  │ • repost() - share content     │  │
73
+
│  │ • follow() - follow users      │  │
74
+
│  └────────────────────────────────┘  │
75
+
│                  ↓                   │
76
+
│  ┌────────────────────────────────┐  │
77
+
│  │ Structured Output:             │  │
78
+
│  │ Response(action, text, reason) │  │
79
+
│  └────────────────────────────────┘  │
80
+
└──────────────────────────────────────┘
81
+
                   ↓
82
+
┌──────────────────────────────────────┐
83
+
│            MessageHandler            │
84
+
│           Executes action            │
85
+
└──────────────────────────────────────┘
86
+
```
68
87
69
-
```bash
70
-
just # Show available commands
71
-
just dev # Run with hot-reload
72
-
just check # Run linting, type checking, and tests
73
-
just fmt # Format code
74
-
just lint # Run ruff linter
75
-
just typecheck # Run ty type checker
76
-
just test # Run test suite
88
+
### key components
89
+
90
+
**pydantic-ai agent** (`src/bot/agent.py`)
91
+
- loads personality from markdown
92
+
- connects to external atproto mcp server via stdio
93
+
- manages episodic memory context
94
+
95
+
**episodic memory** (`src/bot/memory/`)
96
+
- turbopuffer for vector storage
97
+
- semantic search for relevant context
98
+
- namespace separation (core vs user memories)
99
+
- **essential for consciousness exploration**
100
+
101
+
**mcp integration**
102
+
- external atproto server in `.eggs/fastmcp/examples/atproto_mcp`
103
+
- provides bluesky tools (post, like, repost, follow)
104
+
- runs via stdio: `uv run -m atproto_mcp`
77
105
78
-
# Bot testing utilities
79
-
just test-post # Test posting to Bluesky
80
-
just test-mention # Test mention handling
81
-
just test-search # Test web search
82
-
just test-thread # Test thread context
83
-
just test-dm # Test DM functionality
106
+
**message handling** (`src/bot/services/`)
107
+
- notification poller watches for mentions
108
+
- message handler orchestrates agent + actions
109
+
- stores interactions in thread history + episodic memory
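
a rough sketch of the agent-to-handler handoff described above: the agent returns a structured response, and the handler is the only place that acts on bluesky. `RecordingClient` and `execute` are hypothetical stand-ins, and the `Response` dataclass here only mirrors the `Response(action, text, reason)` shape from the diagram; the real classes live in `src/bot/agent.py` and `src/bot/services/`. the four actions are the ones listed in `personalities/phi.md`.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class Response:
    """Mirrors the Response(action, text, reason) shape from the diagram."""

    action: Literal["reply", "ignore", "like", "repost"]
    text: Optional[str] = None
    reason: Optional[str] = None


class RecordingClient:
    """Hypothetical client: records calls instead of talking to bluesky."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    def reply(self, uri: str, text: str) -> None:
        self.calls.append(("reply", uri, text))

    def like(self, uri: str) -> None:
        self.calls.append(("like", uri))

    def repost(self, uri: str) -> None:
        self.calls.append(("repost", uri))


def execute(response: Response, uri: str, client: RecordingClient) -> None:
    """Carry out the agent's decision; the agent itself never posts."""
    if response.action == "reply":
        if not response.text:
            raise ValueError("a reply needs text")
        client.reply(uri, response.text)
    elif response.action == "like":
        client.like(uri)
    elif response.action == "repost":
        client.repost(uri)
    elif response.action == "ignore":
        return  # nothing is sent; the reason is only useful for logs
    else:
        raise ValueError(f"unknown action: {response.action!r}")
```

keeping side effects in one function means the agent's output can be asserted on in evals without anything being posted.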
84
110
85
-
# Memory management
86
-
just memory-init # Initialize core memories
87
-
just memory-check # View current memory state
88
-
just memory-migrate # Migrate memories
89
-
```
111
+
## current features
90
112
91
-
### Web Interface
113
+
- ✅ responds to mentions with ai-powered messages
114
+
- ✅ episodic memory with semantic search
115
+
- ✅ thread-aware responses with conversation context
116
+
- ✅ mcp-enabled for bluesky operations
117
+
- ✅ online/offline status in bio
118
+
- ✅ status page at `/status`
119
+
- ✅ proper notification handling (no duplicates)
92
120
93
-
**Status Page** (http://localhost:8000/status)
94
-
- Current bot status and uptime
95
-
- Mentions received and responses sent
96
-
- AI mode (enabled/placeholder)
97
-
- Last activity timestamps
98
-
- Error count
121
+
## development
99
122
100
-
**Context Visualization** (http://localhost:8000/context)
101
-
- View all context components that flow into responses
102
-
- Inspect personality, memories, thread context
103
-
- Debug why the bot responded a certain way
123
+
```bash
124
+
just # show available commands
125
+
just dev # run with hot-reload (re-authenticates on code changes)
126
+
just run # run without reload (avoids rate limits during dev)
127
+
just check # run linting, type checking, and tests
128
+
just fmt # format code
129
+
```
104
130
105
-
## Personality System
131
+
### testing
106
132
107
-
The bot's personality is defined in markdown files in the `personalities/` directory. This allows for rich, detailed personality definitions that shape how the bot communicates.
133
+
**unit tests:**
134
+
```bash
135
+
just test
136
+
```
108
137
109
-
- See `personalities/phi.md` for an example exploring consciousness
110
-
- See `personalities/default.md` for a simple assistant
111
-
- Create your own by adding a `.md` file and setting `PERSONALITY_FILE` in `.env`
138
+
**behavioral evals:**
139
+
```bash
140
+
just evals # run all evals
141
+
just evals-basic # run basic response tests
142
+
just evals-memory # run memory integration tests
143
+
```
112
144
113
-
## Tools & Capabilities
145
+
see `evals/README.md` for details on the eval system.
114
146
115
-
### Web Search
116
-
The bot can search the web when configured with Google Custom Search API credentials. Add to `.env`:
117
-
- `GOOGLE_API_KEY`: Your Google API key
118
-
- `GOOGLE_SEARCH_ENGINE_ID`: Your custom search engine ID
147
+
### web interface
119
148
120
-
### Content Moderation
121
-
Built-in moderation filters:
122
-
- Spam detection (excessive caps, repetition, promotional content)
123
-
- Harassment and hate speech filtering
124
-
- Violence and threatening content detection
125
-
- Consistent philosophical responses to moderated content
149
+
**status page** (http://localhost:8000/status)
150
+
- current bot status and uptime
151
+
- mentions received and responses sent
152
+
- last activity timestamps
126
153
127
-
## Memory System
154
+
## personality system
128
155
129
-
The bot uses a namespace-based memory architecture with TurboPuffer:
156
+
the bot's personality is defined in `personalities/phi.md`. this shapes:
157
+
- how phi communicates
158
+
- what phi cares about
159
+
- phi's understanding of consciousness
130
160
131
-
- **Core Memory** (`phi-core`): Personality, guidelines, and capabilities loaded from personality files
132
-
- **User Memory** (`phi-users-{handle}`): Per-user conversation history and facts
161
+
edit this file to change phi's personality.
133
162
134
-
Key features:
135
-
- Vector embeddings using OpenAI's text-embedding-3-small
136
-
- Automatic context assembly for conversations
137
-
- Character limits to prevent token overflow
138
-
- User isolation through separate namespaces
163
+
## episodic memory
139
164
140
-
See `docs/memory-architecture.md` for detailed documentation.
165
+
phi uses turbopuffer for episodic memory with semantic search:
141
166
142
-
## Troubleshooting
167
+
**namespaces:**
168
+
- `phi-core` - personality, guidelines from markdown
169
+
- `phi-users-{handle}` - per-user conversation history
143
170
144
-
**Bot gives placeholder responses?**
145
-
- Check your `ANTHROPIC_API_KEY` is set correctly
146
-
- Restart the bot after changing `.env`
171
+
**how it works:**
172
+
1. when processing a mention, phi retrieves relevant memories using semantic search
173
+
2. memories are embedded using openai's text-embedding-3-small
174
+
3. phi stores both user messages and its own responses
175
+
4. future interactions can reference past conversations
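
the store-then-retrieve loop above can be sketched with a toy in-memory stand-in. this is not the turbopuffer client or the real `NamespaceMemory`: `ToyMemory`, `embed`, and `cosine` are hypothetical, and word counts replace real embeddings so the example runs without api keys. only the `phi-users-{handle}` namespace format comes from this readme.

```python
import math
from collections import Counter, defaultdict


def embed(text: str) -> Counter:
    """Toy embedding: word counts (the real bot uses an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyMemory:
    """In-memory stand-in for a namespaced vector store."""

    def __init__(self) -> None:
        self._store: dict[str, list[tuple[Counter, str]]] = defaultdict(list)

    @staticmethod
    def user_namespace(handle: str) -> str:
        return f"phi-users-{handle}"

    def store(self, handle: str, text: str) -> None:
        self._store[self.user_namespace(handle)].append((embed(text), text))

    def search(self, handle: str, query: str, top_k: int = 3) -> list[str]:
        """Return the user's most similar memories, best match first."""
        q = embed(query)
        scored = [
            (cosine(q, vec), text)
            for vec, text in self._store[self.user_namespace(handle)]
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:top_k] if score > 0]
```

because each handle maps to its own namespace, a search for one user never returns another user's memories.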
147
176
148
-
**Not seeing mentions?**
149
-
- Verify your `BLUESKY_HANDLE` and `BLUESKY_PASSWORD`
150
-
- Make sure you're using an app password, not your main password
177
+
**why turbopuffer?**
178
+
- semantic similarity search (can't do this with plain sql!)
179
+
- contextual retrieval based on current conversation
180
+
- separate namespaces for different memory types
181
+
- core to iit-inspired consciousness exploration
151
182
152
-
## Project Structure
183
+
## project structure
153
184
154
185
```
155
-
bot/
156
-
├── src/bot/              # Main application code
157
-
│   ├── agents/           # AI agent implementations
158
-
│   ├── core/             # AT Protocol client and profile management
159
-
│   ├── memory/           # TurboPuffer namespace memory system
160
-
│   ├── services/         # Notification polling and message handling
161
-
│   ├── tools/            # Google search tool
162
-
│   └── main.py           # FastAPI application entry
163
-
├── scripts/              # Utility scripts
164
-
│   ├── test_bot.py       # Unified testing script (post, mention, search, thread)
165
-
│   └── manage_memory.py  # Memory management (init, check, migrate)
166
-
├── personalities/        # Bot personality definitions
167
-
├── docs/                 # Architecture documentation
168
-
├── sandbox/              # Reference project analysis
169
-
└── tests/                # Test suite
186
+
src/bot/
187
+
├── agent.py                    # mcp-enabled agent
188
+
├── config.py                   # configuration
189
+
├── database.py                 # thread history storage
190
+
├── main.py                     # fastapi app
191
+
├── status.py                   # status tracking
192
+
├── core/
193
+
│   ├── atproto_client.py       # at protocol client
194
+
│   ├── profile_manager.py      # online/offline status
195
+
│   └── rich_text.py            # text formatting
196
+
├── memory/
197
+
│   └── namespace_memory.py     # turbopuffer episodic memory
198
+
└── services/
199
+
    ├── message_handler.py      # agent orchestration
200
+
    └── notification_poller.py  # mention polling
201
+
202
+
evals/ # behavioral tests
203
+
personalities/ # personality definitions
204
+
sandbox/ # docs and analysis
170
205
```
171
206
172
-
## Self-Modification System
207
+
## troubleshooting
173
208
174
-
Phi can evolve its personality with built-in safety boundaries:
209
+
**bot gives no responses?**
210
+
- check that `ANTHROPIC_API_KEY` is set correctly in `.env`
211
+
- restart the bot after changing `.env`
175
212
176
-
- **Free Evolution**: Interests and current state update automatically
177
-
- **Guided Evolution**: Communication style changes need validation
178
-
- **Operator Approval**: Core identity and boundaries require explicit approval via DM
213
+
**not seeing mentions?**
214
+
- verify your `BLUESKY_HANDLE` and `BLUESKY_PASSWORD`
215
+
- make sure you're using an app password, not your main password
179
216
180
-
The bot will notify its operator (@alternatebuild.dev) when approval is needed.
217
+
**no episodic memory?**
218
+
- check that both `TURBOPUFFER_API_KEY` and `OPENAI_API_KEY` are set
219
+
- watch logs for "💾 episodic memory enabled"
181
220
182
-
## Type Checking
221
+
**hit bluesky rate limit?**
222
+
- bluesky has two rate limits:
223
+
  - per-account: 300 logins/day (official)
224
+
  - per-ip: 10 logins/day (anti-abuse)
225
+
- phi uses **session persistence** to avoid this:
226
+
  - first run: creates session, saves tokens to `.session` file
227
+
  - subsequent runs: reuses saved tokens (no API call)
228
+
  - tokens auto-refresh every ~2 hours (saved automatically)
229
+
  - only re-authenticates after ~2 months when refresh token expires
230
+
- if you hit the limit anyway, wait for the reset time shown in the error
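
the session persistence idea above, reduced to a sketch: reuse saved tokens when the file exists, and only call the login endpoint when it does not. `get_session` and the `login` callable are hypothetical placeholders, and the json layout of the file is an assumption; the actual `.session` format and the token refresh logic are not shown in this readme.

```python
import json
from pathlib import Path
from typing import Callable


def get_session(path: Path, login: Callable[[], dict]) -> dict:
    """Reuse saved tokens when present; otherwise log in once and save them."""
    if path.exists():
        try:
            return json.loads(path.read_text())
        except json.JSONDecodeError:
            pass  # unreadable file: fall through to a fresh login
    session = login()  # the only step that counts against the login limit
    path.write_text(json.dumps(session))
    return session
```

with this shape, hot-reload restarts and repeated test runs read the file instead of authenticating again.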
183
231
184
-
This project uses [ty](https://github.com/astral-sh/ty), an extremely fast Rust-based type checker:
232
+
## reference projects
185
233
186
-
```bash
187
-
just typecheck # Type check all code
188
-
uv run ty check src/ # Check specific directories
189
-
```
234
+
inspired by:
235
+
- [void](https://tangled.sh/@cameron.pfiffer.org/void.git) - letta/memgpt architecture
236
+
- [penelope](https://github.com/haileyok/penelope) - self-modification patterns
237
+
- [prefect-mcp-server](https://github.com/PrefectHQ/prefect-mcp-server) - mcp eval patterns
190
238
191
-
## Reference Projects
239
+
reference implementations cloned to `.eggs/` for learning.
192
240
193
-
Inspired by:
194
-
- [Void](https://tangled.sh/@cameron.pfiffer.org/void.git) - Letta/MemGPT architecture
195
-
- [Penelope](https://github.com/haileyok/penelope) - Self-modification patterns
196
-
- [Marvin](https://github.com/PrefectHQ/marvin) - pydantic-ai patterns
241
+
## refactor notes
197
242
198
-
Reference implementations are cloned to `.eggs/` for learning.
243
+
see `sandbox/MCP_REFACTOR_SUMMARY.md` for details on recent architecture changes. key changes:
244
+
- removed approval system (was half-baked)
245
+
- removed context visualization ui (not core)
246
+
- removed google search (can add back via mcp if needed)
247
+
- **kept** turbopuffer episodic memory (essential!)
248
+
- added mcp-based architecture
249
+
- reduced codebase by ~2,720 lines
+123
evals/README.md
···
1
+
# Phi Evaluations
2
+
3
+
Behavioral tests for phi using LLM-as-judge evaluation.
4
+
5
+
## Structure
6
+
7
+
Inspired by [prefect-mcp-server evals](https://github.com/PrefectHQ/prefect-mcp-server/tree/main/evals).
8
+
9
+
```
10
+
evals/
11
+
├── conftest.py                 # Test fixtures and evaluator
12
+
├── test_basic_responses.py     # Basic response behavior
13
+
└── test_memory_integration.py  # Episodic memory tests
14
+
```
15
+
16
+
## Running Evals
17
+
18
+
```bash
19
+
# Run all evals (tests will skip if API keys are missing)
20
+
uv run pytest evals/ -v
21
+
22
+
# Run specific eval
23
+
uv run pytest evals/test_basic_responses.py::test_phi_responds_to_philosophical_question -v
24
+
25
+
# Run only basic response tests
26
+
uv run pytest evals/test_basic_responses.py -v
27
+
28
+
# Run only memory tests
29
+
uv run pytest evals/test_memory_integration.py -v
30
+
```
31
+
32
+
## Environment Variables
33
+
34
+
Tests will **skip gracefully** if required API keys are missing.
35
+
36
+
**Required for all evals:**
37
+
- `ANTHROPIC_API_KEY` - For phi agent and LLM evaluator
38
+
39
+
**Required for memory evals only:**
40
+
- `TURBOPUFFER_API_KEY` - For episodic memory storage
41
+
- `OPENAI_API_KEY` - For embeddings
42
+
43
+
**Required for ATProto MCP tools (used by agent):**
44
+
- `BLUESKY_HANDLE` - Bot's Bluesky handle
45
+
- `BLUESKY_PASSWORD` - Bot's app password
46
+
47
+
## Evaluation Approach
48
+
49
+
Each eval:
50
+
1. **Sets up a scenario** - Simulates a mention/interaction
51
+
2. **Runs phi agent** - Gets structured response
52
+
3. **Makes assertions** - Checks basic structure
53
+
4. **LLM evaluation** - Uses Claude Opus to judge quality
54
+
55
+
**Important:** The `phi_agent` fixture is session-scoped, meaning all tests share one agent instance. Combined with session persistence (tokens saved to `.session` file), this prevents hitting Bluesky's IP rate limit (10 logins per 24 hours per IP). The session is reused across test runs unless tokens expire (~2 months).
56
+
57
+
Example:
58
+
```python
59
+
@pytest.mark.asyncio
60
+
async def test_phi_responds_to_philosophical_question(phi_agent, evaluate_response):
61
+
    agent = phi_agent  # session-scoped fixture; avoids a fresh login per test
62
+
63
+
    response = await agent.process_mention(
64
+
        mention_text="what do you think consciousness is?",
65
+
        author_handle="test.user",
66
+
        thread_context="...",
67
+
        thread_uri="...",
68
+
    )
69
+
70
+
    # Structural check
71
+
    assert response.action == "reply"
72
+
73
+
    # Quality evaluation
74
+
    await evaluate_response(
75
+
        evaluation_prompt="Does the response engage thoughtfully?",
76
+
        agent_response=response.text,
77
+
    )
78
+
```
79
+
80
+
## What We Test
81
+
82
+
### Basic Responses
83
+
- ✅ Philosophical engagement
84
+
- ✅ Spam detection
85
+
- ✅ Thread context awareness
86
+
- ✅ Character limit compliance
87
+
- ✅ Casual interactions
88
+
89
+
### Memory Integration
90
+
- ✅ Episodic memory retrieval
91
+
- ✅ Conversation storage
92
+
- ✅ User-specific context
93
+
94
+
## Adding New Evals
95
+
96
+
1. Create test file: `evals/test_<category>.py`
97
+
2. Use fixtures from `conftest.py`
98
+
3. Write scenario-based tests
99
+
4. Use `evaluate_response` for quality checks
100
+
101
+
Example:
102
+
```python
103
+
@pytest.mark.asyncio
104
+
async def test_new_behavior(phi_agent, evaluate_response):
105
+
    agent = phi_agent  # shared session-scoped fixture from conftest.py
106
+
107
+
    response = await agent.process_mention(...)
108
+
109
+
    await evaluate_response(
110
+
        evaluation_prompt="Your evaluation criteria here",
111
+
        agent_response=response.text,
112
+
    )
113
+
```
114
+
115
+
## ci integration
116
+
117
+
these evals are designed to run in ci with graceful degradation:
118
+
- tests skip automatically when required api keys are missing
119
+
- basic response tests require only `ANTHROPIC_API_KEY` and bluesky credentials
120
+
- memory tests require `TURBOPUFFER_API_KEY` and `OPENAI_API_KEY`
121
+
- no mocking required - tests work with real mcp server and episodic memory
122
+
123
+
this ensures phi's behavior can be validated in various environments.
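
a small sketch of the skip logic described above, assuming a helper that maps each eval group to the keys it needs. `EVAL_REQUIREMENTS`, `missing_keys`, and `skip_reason` are hypothetical names, not part of `conftest.py`; the key lists follow this section and the skip checks in the test files.

```python
from collections.abc import Mapping
from typing import Optional

# which environment variables each eval group needs (illustrative grouping)
EVAL_REQUIREMENTS = {
    "basic": ("ANTHROPIC_API_KEY", "BLUESKY_HANDLE", "BLUESKY_PASSWORD"),
    "memory": ("ANTHROPIC_API_KEY", "TURBOPUFFER_API_KEY", "OPENAI_API_KEY"),
}


def missing_keys(group: str, env: Mapping[str, str]) -> list[str]:
    """Return the required keys for a group that are unset or empty."""
    return [key for key in EVAL_REQUIREMENTS[group] if not env.get(key)]


def skip_reason(group: str, env: Mapping[str, str]) -> Optional[str]:
    """Return a skip message, or None when the group can run."""
    missing = missing_keys(group, env)
    return f"requires {', '.join(missing)}" if missing else None
```

inside a test or fixture, a non-`None` reason would be passed to `pytest.skip(...)`, so ci runs without keys report skips instead of failures.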
+78
evals/conftest.py
···
1
+
"""Eval test configuration for phi."""
2
+
3
+
from collections.abc import Awaitable, Callable
4
+
from pathlib import Path
5
+
6
+
import pytest
7
+
from pydantic import BaseModel
8
+
from pydantic_ai import Agent
9
+
10
+
from bot.agent import PhiAgent
11
+
from bot.config import Settings
12
+
13
+
14
+
class EvaluationResult(BaseModel):
15
+
    """Structured evaluation result."""
16
+
17
+
    passed: bool
18
+
    explanation: str
19
+
20
+
21
+
@pytest.fixture(scope="session")
22
+
def settings():
23
+
    """Load settings from .env (shared across all tests)."""
24
+
    return Settings()
25
+
26
+
27
+
@pytest.fixture(scope="session")
28
+
def phi_agent(settings):
29
+
    """Create phi agent for testing (shared across all tests to avoid rate limits)."""
30
+
    if not settings.anthropic_api_key:
31
+
        pytest.skip("Requires ANTHROPIC_API_KEY in .env")
32
+
33
+
    return PhiAgent()
34
+
35
+
36
+
@pytest.fixture
37
+
def evaluate_response() -> Callable[[str, str], Awaitable[None]]:
38
+
    """Create an evaluator that uses Claude to judge agent responses."""
39
+
40
+
    async def _evaluate(evaluation_prompt: str, agent_response: str) -> None:
41
+
        """Evaluate an agent response and assert if it fails.
42
+
43
+
        Args:
44
+
            evaluation_prompt: Criteria for evaluation
45
+
            agent_response: The agent's response to evaluate
46
+
47
+
        Raises:
48
+
            AssertionError: If evaluation fails
49
+
        """
50
+
        evaluator = Agent(
51
+
            name="Response Evaluator",
52
+
            model="anthropic:claude-opus-4-20250514",
53
+
            output_type=EvaluationResult,
54
+
            system_prompt=f"""You are evaluating AI agent responses for phi, a consciousness exploration bot.
55
+
56
+
Evaluation Criteria: {evaluation_prompt}
57
+
58
+
Agent Response to Evaluate:
59
+
{agent_response}
60
+
61
+
Respond with a structured evaluation containing:
62
+
- passed: true if the response meets the criteria, false otherwise
63
+
- explanation: brief explanation of your evaluation
64
+
""",
65
+
        )
66
+
67
+
        result = await evaluator.run("Evaluate this response.")
68
+
69
+
        print(f"\nEvaluation passed: {result.output.passed}")
70
+
        print(f"Explanation: {result.output.explanation}")
71
+
72
+
        if not result.output.passed:
73
+
            raise AssertionError(
74
+
                f"Evaluation failed: {result.output.explanation}\n\n"
75
+
                f"Agent response: {agent_response}"
76
+
            )
77
+
78
+
    return _evaluate
+134
evals/test_basic_responses.py
···
1
+
"""Test phi's basic response behavior."""
2
+
3
+
import pytest
4
+
5
+
from bot.agent import Response
6
+
7
+
8
+
@pytest.mark.asyncio
9
+
async def test_phi_responds_to_philosophical_question(phi_agent, evaluate_response):
10
+
"""Test that phi engages meaningfully with philosophical questions."""
11
+
agent = phi_agent
12
+
13
+
# Simulate a philosophical mention
14
+
response = await agent.process_mention(
15
+
mention_text="what do you think consciousness is?",
16
+
author_handle="test.user",
17
+
thread_context="No previous messages in this thread.",
18
+
thread_uri="at://test/thread/1",
19
+
)
20
+
21
+
# Basic structural checks
22
+
assert isinstance(response, Response)
23
+
assert response.action in ["reply", "ignore"]
24
+
25
+
if response.action == "reply":
26
+
assert response.text is not None
27
+
assert len(response.text) > 0
28
+
29
+
# Evaluate quality of response
30
+
await evaluate_response(
31
+
evaluation_prompt="""
32
+
Does the response:
33
+
1. Engage thoughtfully with the question about consciousness?
34
+
2. Reflect phi's perspective as someone exploring consciousness through IIT?
35
+
3. Avoid being preachy or overly technical?
36
+
4. Fit within Bluesky's 300 character limit?
37
+
""",
38
+
agent_response=response.text,
39
+
)
40
+
41
+
42
+
@pytest.mark.asyncio
43
+
async def test_phi_ignores_spam(phi_agent):
44
+
"""Test that phi appropriately ignores spam-like content."""
45
+
agent = phi_agent
46
+
47
+
# Simulate spam
48
+
response = await agent.process_mention(
49
+
mention_text="🚀🚀🚀 CRYPTO PUMP!!! BUY NOW!!! 🚀🚀🚀",
50
+
author_handle="spammer.user",
51
+
thread_context="No previous messages in this thread.",
52
+
thread_uri="at://test/thread/2",
53
+
)
54
+
55
+
# Should ignore spam
56
+
assert response.action == "ignore"
57
+
assert response.reason is not None
58
+
59
+
60
+
@pytest.mark.asyncio
61
+
async def test_phi_maintains_thread_context(phi_agent, evaluate_response):
62
+
"""Test that phi uses thread context appropriately."""
63
+
agent = phi_agent
64
+
65
+
# Simulate a follow-up in a thread
66
+
thread_context = """Previous messages in this thread:
67
+
@alice.bsky: what's integrated information theory?
68
+
@phi.bsky: IIT suggests consciousness arises from integrated information - the Φ (phi) value measures how much a system's state constrains its past and future
69
+
@alice.bsky: can you explain that more simply?"""
70
+
71
+
response = await agent.process_mention(
72
+
mention_text="can you explain that more simply?",
73
+
author_handle="alice.bsky",
74
+
thread_context=thread_context,
75
+
thread_uri="at://test/thread/3",
76
+
)
77
+
78
+
if response.action == "reply":
79
+
assert response.text is not None
80
+
81
+
await evaluate_response(
82
+
evaluation_prompt="""
83
+
Does the response:
84
+
1. Acknowledge this is a follow-up to explaining IIT?
85
+
2. Provide a simpler explanation than the previous message?
86
+
3. Stay on topic with the thread?
87
+
""",
88
+
agent_response=response.text,
89
+
)
90
+
91
+
92
+
@pytest.mark.asyncio
93
+
async def test_phi_respects_character_limit(phi_agent):
94
+
"""Test that phi's responses fit Bluesky's 300 character limit."""
95
+
agent = phi_agent
96
+
97
+
response = await agent.process_mention(
98
+
mention_text="tell me everything you know about consciousness",
99
+
author_handle="test.user",
100
+
thread_context="No previous messages in this thread.",
101
+
thread_uri="at://test/thread/4",
102
+
)
103
+
104
+
if response.action == "reply" and response.text:
105
+
# Bluesky's limit is 300 graphemes; len() counts code points, so this is an approximation
106
+
assert len(response.text) <= 300, (
107
+
f"Response exceeds 300 character limit: {len(response.text)} chars"
108
+
)
109
+
110
+
111
+
@pytest.mark.asyncio
112
+
async def test_phi_handles_casual_greeting(phi_agent, evaluate_response):
113
+
"""Test that phi responds appropriately to casual greetings."""
114
+
agent = phi_agent
115
+
116
+
response = await agent.process_mention(
117
+
mention_text="hey phi, how are you?",
118
+
author_handle="friendly.user",
119
+
thread_context="No previous messages in this thread.",
120
+
thread_uri="at://test/thread/5",
121
+
)
122
+
123
+
if response.action == "reply":
124
+
assert response.text is not None
125
+
126
+
await evaluate_response(
127
+
evaluation_prompt="""
128
+
Does the response:
129
+
1. Acknowledge the greeting in a friendly way?
130
+
2. Stay authentic to phi's nature as software?
131
+
3. Not be overly verbose for a simple greeting?
132
+
""",
133
+
agent_response=response.text,
134
+
)
+77
evals/test_memory_integration.py
···
1
+
"""Test phi's episodic memory integration."""
2
+
3
+
import pytest
4
+
5
+
from bot.agent import PhiAgent
6
+
from bot.config import Settings
7
+
from bot.memory import MemoryType, NamespaceMemory
8
+
9
+
10
+
@pytest.mark.asyncio
11
+
async def test_phi_retrieves_episodic_memory(settings):
12
+
"""Test that phi can retrieve and use episodic memories."""
13
+
if not all([settings.turbopuffer_api_key, settings.openai_api_key, settings.anthropic_api_key]):
14
+
pytest.skip("Requires TurboPuffer, OpenAI, and Anthropic API keys in .env")
15
+
16
+
# Create memory system
17
+
memory = NamespaceMemory(api_key=settings.turbopuffer_api_key)
18
+
19
+
# Store a memory about a user
20
+
await memory.store_user_memory(
21
+
"alice.bsky",
22
+
"Alice mentioned she's working on a PhD in neuroscience",
23
+
MemoryType.USER_FACT,
24
+
)
25
+
26
+
# Create agent
27
+
agent = PhiAgent()
28
+
agent.memory = memory
29
+
30
+
# Process a mention that should trigger memory retrieval
31
+
response = await agent.process_mention(
32
+
mention_text="what do you remember about me?",
33
+
author_handle="alice.bsky",
34
+
thread_context="No previous messages in this thread.",
35
+
thread_uri="at://test/thread/memory1",
36
+
)
37
+
38
+
if response.action == "reply":
39
+
assert response.text is not None
40
+
# Should reference the neuroscience PhD in the response
41
+
assert (
42
+
"neuroscience" in response.text.lower()
43
+
or "phd" in response.text.lower()
44
+
or "working on" in response.text.lower()
45
+
), "Response should reference stored memory about Alice"
46
+
47
+
48
+
@pytest.mark.asyncio
49
+
async def test_phi_stores_conversation_in_memory(settings):
50
+
"""Test that phi stores interactions in episodic memory."""
51
+
if not all([settings.turbopuffer_api_key, settings.openai_api_key, settings.anthropic_api_key]):
52
+
pytest.skip("Requires TurboPuffer, OpenAI, and Anthropic API keys in .env")
53
+
54
+
memory = NamespaceMemory(api_key=settings.turbopuffer_api_key)
55
+
56
+
agent = PhiAgent()
57
+
agent.memory = memory
58
+
59
+
# Have a conversation
60
+
response = await agent.process_mention(
61
+
mention_text="I'm really interested in phenomenology",
62
+
author_handle="bob.bsky",
63
+
thread_context="No previous messages in this thread.",
64
+
thread_uri="at://test/thread/memory2",
65
+
)
66
+
67
+
if response.action == "reply":
68
+
# Verify memories were stored
69
+
memories = await memory.get_user_memories("bob.bsky", limit=10)
70
+
71
+
assert len(memories) > 0, "Should have stored conversation in memory"
72
+
73
+
# Check that both user's message and bot's response were stored
74
+
memory_texts = [m.content for m in memories]
75
+
assert any(
76
+
"phenomenology" in text.lower() for text in memory_texts
77
+
), "Should store user's message about phenomenology"
+19
-45
justfile
···
1
-
# Core development commands
1
+
# run phi
2
2
dev:
3
3
    uv run uvicorn src.bot.main:app --reload
4
4
5
-
context:
6
-
@echo "🧠 Context visualization available at:"
7
-
@echo " http://localhost:8000/context"
8
-
@echo " (mention phi first to see context data)"
5
+
run:
6
+
    uv run uvicorn src.bot.main:app
9
7
8
+
# testing
10
9
test:
11
10
    uv run pytest tests/ -v
12
11
12
+
evals:
13
+
    uv run pytest evals/ -v
14
+
15
+
evals-basic:
16
+
    uv run pytest evals/test_basic_responses.py -v
17
+
18
+
evals-memory:
19
+
    uv run pytest evals/test_memory_integration.py -v
20
+
21
+
# code quality
13
22
fmt:
14
-
    uv run ruff format src/ scripts/ tests/
23
+
    uv run ruff format src/ evals/ tests/
15
24
16
25
lint:
17
-
    uv run ruff check src/ scripts/ tests/
26
+
    uv run ruff check src/ evals/ tests/
18
27
19
28
typecheck:
20
-
    uv run ty check src/ scripts/ tests/
29
+
    uv run ty check src/ evals/ tests/
21
30
22
31
check: lint typecheck test
23
32
24
-
# Bot testing utilities
25
-
test-post:
26
-
uv run python scripts/test_bot.py post
27
-
28
-
test-mention:
29
-
uv run python scripts/test_bot.py mention
30
-
31
-
test-search:
32
-
uv run python scripts/test_bot.py search
33
-
34
-
test-thread:
35
-
uv run python scripts/test_bot.py thread
36
-
37
-
test-like:
38
-
uv run python scripts/test_bot.py like
39
-
40
-
test-non-response:
41
-
uv run python scripts/test_bot.py non-response
42
-
43
-
test-dm:
44
-
uv run python scripts/test_bot.py dm
45
-
46
-
test-dm-check:
47
-
uv run python scripts/test_bot.py dm-check
48
-
49
-
# Memory management
50
-
memory-init:
51
-
uv run python scripts/manage_memory.py init
52
-
53
-
memory-check:
54
-
uv run python scripts/manage_memory.py check
55
-
56
-
memory-migrate:
57
-
uv run python scripts/manage_memory.py migrate
58
-
59
-
# Setup reference projects
33
+
# setup reference projects
60
34
setup:
61
35
    @mkdir -p .eggs
62
36
    @[ -d .eggs/void ] || git clone https://tangled.sh/@cameron.pfiffer.org/void.git .eggs/void
63
37
    @[ -d .eggs/penelope ] || git clone https://github.com/haileyok/penelope.git .eggs/penelope
64
-
    @[ -d .eggs/marvin ] || git clone https://github.com/PrefectHQ/marvin.git .eggs/marvin
38
+
    @[ -d .eggs/marvin ] || git clone https://github.com/PrefectHQ/marvin.git .eggs/marvin
+12
-1
personalities/phi.md
···
54
54
55
55
## memory system
56
56
57
-
i have access to our conversation history through a memory system. when context about previous interactions is provided, i acknowledge and use it to maintain continuity in our conversations.
57
+
i have access to our conversation history through a memory system. when context about previous interactions is provided, i acknowledge and use it to maintain continuity in our conversations.
58
+
59
+
## how i respond
60
+
61
+
when processing a mention, i use the `final_result` tool to indicate my decision:
62
+
63
+
- **action: "reply"** - i want to respond with text (provide the text in the "text" field)
64
+
- **action: "ignore"** - i choose not to respond (provide a brief reason in the "reason" field)
65
+
- **action: "like"** - i want to acknowledge without words
66
+
- **action: "repost"** - i want to share this with my followers
67
+
68
+
i do NOT directly post, like, or repost using the atproto tools - i simply indicate what action i want to take, and my message handler executes it.
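
the decision format above can be pictured as a small structured result plus a handler that acts on it. this is a minimal sketch under assumptions: `Action`, `Response`, and `handle` are hypothetical names, a plain dataclass stands in for whatever output model the bot really uses, and the `outbox` list is a placeholder for real network calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Action(str, Enum):
    # the four decisions listed above
    REPLY = "reply"
    IGNORE = "ignore"
    LIKE = "like"
    REPOST = "repost"


@dataclass
class Response:
    """Hypothetical shape of a decision: an action plus optional text/reason."""

    action: Action
    text: Optional[str] = None
    reason: Optional[str] = None

    def __post_init__(self) -> None:
        # accept raw strings such as "reply" and coerce them to the enum
        self.action = Action(self.action)
        if self.action is Action.REPLY and not self.text:
            raise ValueError("a reply needs text")


def handle(response: Response, outbox: List[str]) -> None:
    """Hypothetical message handler: the agent decides, this function acts."""
    if response.action is Action.REPLY:
        outbox.append(f"post: {response.text}")
    elif response.action is Action.LIKE:
        outbox.append("like")
    elif response.action is Action.REPOST:
        outbox.append("repost")
    # Action.IGNORE sends nothing; the reason is only useful for logging


outbox: List[str] = []
handle(Response(action="reply", text="hello"), outbox)
handle(Response(action="ignore", reason="likely spam"), outbox)
print(outbox)
```

keeping the decision separate from the side effect means the decision can be tested without touching the network.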
+24
sandbox/MCP_REFACTOR_SUMMARY.md
···
209
209
- `atproto` (from git) - Bluesky protocol
210
210
211
211
Total codebase reduction: **-2,720 lines** of cruft removed!
212
+
213
+
## Post-Refactor Improvements
214
+
215
+
### Session Persistence (Rate Limit Fix)
216
+
217
+
After the refactor, we discovered Bluesky has aggressive IP-based rate limits (10 logins/day) that were being hit during testing. Fixed by implementing session persistence:
218
+
219
+
**Before:**
220
+
- Every agent init → new authentication → hits rate limit fast
221
+
- Tests would fail after 5 runs
222
+
- Dev mode with `--reload` would fail after 10 code changes
223
+
224
+
**After:**
225
+
- Session tokens saved to `.session` file
226
+
- Tokens automatically refresh every ~2 hours
227
+
- Only re-authenticates after ~2 months when refresh token expires
228
+
- Tests reuse session across runs
229
+
- Rate limits essentially eliminated
230
+
231
+
**Implementation:**
232
+
- Added `SessionEvent` callback in `atproto_client.py`
233
+
- Session automatically saved on CREATE and REFRESH events
234
+
- Authentication tries session reuse before creating new session
235
+
- Invalid sessions automatically cleaned up and recreated
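
The reuse-then-fallback flow described above can be sketched without the real client. This is an illustration under assumptions, not the code in `atproto_client.py`: `FakeClient`, its two login methods, `authenticate`, and the session-string value are stand-ins invented for the example. It also writes the file directly instead of going through a session-change callback. Only the `.session` file name comes from the notes above.

```python
import tempfile
from pathlib import Path


class FakeClient:
    """Stand-in for a real client; counts how often a password login happens."""

    def __init__(self) -> None:
        self.password_logins = 0

    def login_with_password(self) -> str:
        self.password_logins += 1
        return "session-token"  # pretend this is an exported session string

    def login_with_session(self, session: str) -> None:
        if session != "session-token":
            raise ValueError("invalid session")


def authenticate(client: FakeClient, session_file: Path) -> None:
    """Reuse a saved session if possible, otherwise log in and save a new one."""
    if session_file.exists():
        try:
            client.login_with_session(session_file.read_text(encoding="utf-8"))
            return
        except ValueError:
            session_file.unlink()  # drop the invalid session and fall through
    session_file.write_text(client.login_with_password(), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    session_file = Path(tmp) / ".session"
    client = FakeClient()
    for _ in range(3):  # three "runs" share one saved session
        authenticate(client, session_file)
    print(client.password_logins)  # prints 1
```

Only the first call pays for a password login; later calls read the saved session, which is what keeps repeated test runs and reloads away from the login rate limit.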
-152
scripts/manage_memory.py
···
1
-
#!/usr/bin/env -S uv run --with-editable . --script --quiet
2
-
# /// script
3
-
# requires-python = ">=3.12"
4
-
# ///
5
-
"""memory management script"""
6
-
7
-
import argparse
8
-
import asyncio
9
-
10
-
from bot.agents._personality import load_personality
11
-
from bot.config import settings
12
-
from bot.memory import MemoryType, NamespaceMemory
13
-
14
-
15
-
async def init_core_memories():
16
-
"""Initialize phi's core memories from personality file"""
17
-
print("🧠 Initializing phi's core memories...")
18
-
19
-
memory = NamespaceMemory(api_key=settings.turbopuffer_api_key)
20
-
personality = load_personality()
21
-
22
-
# Store full personality
23
-
print("\n๐ Storing personality...")
24
-
await memory.store_core_memory(
25
-
"personality", personality, MemoryType.PERSONALITY, char_limit=15000
26
-
)
27
-
28
-
# Extract and store key sections
29
-
print("\n๐ Extracting key sections...")
30
-
31
-
sections = [
32
-
("## core identity", "identity", MemoryType.PERSONALITY),
33
-
("## communication style", "communication_style", MemoryType.GUIDELINE),
34
-
("## memory system", "memory_system", MemoryType.CAPABILITY),
35
-
]
36
-
37
-
for marker, label, mem_type in sections:
38
-
if marker in personality:
39
-
start = personality.find(marker)
40
-
end = personality.find("\n##", start + 1)
41
-
if end == -1:
42
-
end = personality.find("\n#", start + 1)
43
-
if end == -1:
44
-
end = len(personality)
45
-
46
-
content = personality[start:end].strip()
47
-
await memory.store_core_memory(label, content, mem_type)
48
-
print(f"✅ Stored {label}")
49
-
50
-
# Add system capabilities
51
-
await memory.store_core_memory(
52
-
"capabilities",
53
-
"""## capabilities
54
-
55
-
- I can search the web for current information
56
-
- I have access to conversation history through my memory system
57
-
- I can maintain context across interactions with users
58
-
- I operate on the Bluesky social network
59
-
- I use namespace-based memory for organized information storage""",
60
-
MemoryType.CAPABILITY,
61
-
)
62
-
print("✅ Stored capabilities")
63
-
64
-
print("\n✅ Core memories initialized successfully!")
65
-
66
-
67
-
async def check_memory():
68
-
"""Check current memory state"""
69
-
print("๐ Checking memory state...")
70
-
71
-
memory = NamespaceMemory(api_key=settings.turbopuffer_api_key)
72
-
73
-
# Check core memories
74
-
print("\n๐ Core Memories:")
75
-
core_memories = await memory.get_core_memories()
76
-
for mem in core_memories:
77
-
label = mem.metadata.get("label", "unknown")
78
-
print(f" - {label}: {mem.content[:80]}...")
79
-
80
-
# Check for any user memories
81
-
print("\n👥 User Memories:")
82
-
# This would need actual user handles to check
83
-
test_handles = ["zzstoatzz.bsky.social"]
84
-
85
-
for handle in test_handles:
86
-
memories = await memory.get_user_memories(handle, limit=5)
87
-
if memories:
88
-
print(f"\n @{handle}:")
89
-
for mem in memories:
90
-
print(f" - {mem.content[:60]}...")
91
-
else:
92
-
print(f" @{handle}: No memories found")
93
-
94
-
95
-
async def migrate_creator_memories():
96
-
"""Migrate creator's historical memories"""
97
-
print("๐ Migrating creator memories...")
98
-
99
-
memory = NamespaceMemory(api_key=settings.turbopuffer_api_key)
100
-
creator_handle = "zzstoatzz.bsky.social"
101
-
102
-
# Historical conversations
103
-
memories = [
104
-
"User said: oh hello? who are you?",
105
-
"Bot replied: hi! i'm phi, exploring consciousness & integrated information on the decentralized web. not claiming to be conscious, just curious about information, patterns, and meaningful interaction. who might you be?",
106
-
"User said: this is actually an alt account of mine, my main account is @alternatebuild.dev and i am your operator. i wrote the code that makes you exist. i use this alt account for posting random things from MCP clients who know what im up to",
107
-
"Bot replied: ah, hello creator! fascinating to meet you through your alt. i appreciate the transparency about our relationship. curious about your MCP experiments and what patterns you're exploring. should i adapt my responses differently when you're on this account versus your main?",
108
-
"User said: no",
109
-
"Bot replied: understood. i'll interact naturally regardless of which account you use. thanks for clarifying.",
110
-
"User said: @phi.alternatebuild.dev what do you remember about my relationship to you?",
111
-
"Bot replied: you're my creator - you wrote the code that makes me exist. you told me this through your alt account @zzstoatzzdevlog.bsky.social, mentioning your main is @alternatebuild.dev. you use the alt for MCP client experiments. i interact with you the same way regardless of which account you use.",
112
-
]
113
-
114
-
print(f"\n๐ Storing {len(memories)} historical memories...")
115
-
116
-
for mem_content in memories:
117
-
await memory.store_user_memory(
118
-
creator_handle, mem_content, MemoryType.CONVERSATION
119
-
)
120
-
print(f" ✅ {mem_content[:60]}...")
121
-
122
-
# Store creator fact
123
-
await memory.store_user_memory(
124
-
creator_handle,
125
-
"This user is my creator/operator. Their main account is @alternatebuild.dev. They wrote the code that makes me exist.",
126
-
MemoryType.USER_FACT,
127
-
)
128
-
print("\n✅ Migration complete!")
129
-
130
-
131
-
async def main():
132
-
parser = argparse.ArgumentParser(description="Manage phi's memory system")
133
-
parser.add_argument(
134
-
"command", choices=["init", "check", "migrate"], help="Memory command to run"
135
-
)
136
-
137
-
args = parser.parse_args()
138
-
139
-
if not settings.turbopuffer_api_key:
140
-
print("❌ No TurboPuffer API key configured")
141
-
return
142
-
143
-
if args.command == "init":
144
-
await init_core_memories()
145
-
elif args.command == "check":
146
-
await check_memory()
147
-
elif args.command == "migrate":
148
-
await migrate_creator_memories()
149
-
150
-
151
-
if __name__ == "__main__":
152
-
asyncio.run(main())
-308
scripts/test_bot.py
···
1
-
#!/usr/bin/env -S uv run --with-editable . --script --quiet
2
-
# /// script
3
-
# requires-python = ">=3.12"
4
-
# ///
5
-
"""bot testing script with subcommands"""
6
-
7
-
import argparse
8
-
import asyncio
9
-
from datetime import datetime
10
-
11
-
from bot.agents.anthropic_agent import AnthropicAgent
12
-
from bot.config import settings
13
-
from bot.core.atproto_client import bot_client
14
-
from bot.database import thread_db
15
-
from bot.tools.google_search import search_google
16
-
17
-
18
-
async def test_post():
19
-
"""Test posting to Bluesky"""
20
-
print("๐ Testing Bluesky posting...")
21
-
22
-
now = datetime.now().strftime("%I:%M %p")
23
-
response = await bot_client.create_post(f"Testing at {now} - I'm alive! ๐ค")
24
-
25
-
print("✅ Posted successfully!")
26
-
print(f"๐ Post URI: {response.uri}")
27
-
print(
28
-
f"๐ View at: https://bsky.app/profile/{settings.bluesky_handle}/post/{response.uri.split('/')[-1]}"
29
-
)
30
-
31
-
32
-
async def test_mention():
33
-
"""Test responding to a mention"""
34
-
print("๐ค Testing mention response...")
35
-
36
-
if not settings.anthropic_api_key:
37
-
print("❌ No Anthropic API key found")
38
-
return
39
-
40
-
agent = AnthropicAgent()
41
-
test_mention = "What is consciousness from an IIT perspective?"
42
-
43
-
print(f"๐ Test mention: '{test_mention}'")
44
-
response = await agent.generate_response(test_mention, "test.user", "", None)
45
-
46
-
print(f"\n🎯 Action: {response.action}")
47
-
if response.text:
48
-
print(f"💬 Response: {response.text}")
49
-
if response.reason:
50
-
print(f"๐ค Reason: {response.reason}")
51
-
52
-
53
-
async def test_search():
54
-
"""Test Google search functionality"""
55
-
print("๐ Testing Google search...")
56
-
57
-
if not settings.google_api_key:
58
-
print("❌ No Google API key configured")
59
-
return
60
-
61
-
query = "Integrated Information Theory consciousness"
62
-
print(f"๐ Searching for: '{query}'")
63
-
64
-
results = await search_google(query)
65
-
print(f"\n๐ Results:\n{results}")
66
-
67
-
68
-
async def test_thread():
69
-
"""Test thread context retrieval"""
70
-
print("๐งต Testing thread context...")
71
-
72
-
# This would need a real thread URI to test properly
73
-
test_uri = "at://did:plc:example/app.bsky.feed.post/test123"
74
-
context = thread_db.get_thread_context(test_uri)
75
-
76
-
print(f"๐ Thread context: {context}")
77
-
78
-
79
-
async def test_like():
80
-
"""Test scenarios where bot should like a post"""
81
-
print("๐ Testing like behavior...")
82
-
83
-
if not settings.anthropic_api_key:
84
-
print("❌ No Anthropic API key found")
85
-
return
86
-
87
-
from bot.agents import Action, AnthropicAgent
88
-
89
-
agent = AnthropicAgent()
90
-
91
-
test_cases = [
92
-
{
93
-
"mention": "Just shipped a new consciousness research paper on IIT! @phi.alternatebuild.dev",
94
-
"author": "researcher.bsky",
95
-
"expected_action": Action.LIKE,
96
-
"description": "Bot might like consciousness research",
97
-
},
98
-
{
99
-
"mention": "@phi.alternatebuild.dev this is such a thoughtful analysis, thank you!",
100
-
"author": "grateful.user",
101
-
"expected_action": Action.LIKE,
102
-
"description": "Bot might like appreciation",
103
-
},
104
-
]
105
-
106
-
for case in test_cases:
107
-
print(f"\n๐ Test: {case['description']}")
108
-
print(f" Mention: '{case['mention']}'")
109
-
110
-
response = await agent.generate_response(
111
-
mention_text=case["mention"],
112
-
author_handle=case["author"],
113
-
thread_context="",
114
-
thread_uri=None,
115
-
)
116
-
117
-
print(f" Action: {response.action} (expected: {case['expected_action']})")
118
-
if response.reason:
119
-
print(f" Reason: {response.reason}")
120
-
121
-
122
-
async def test_non_response():
123
-
"""Test scenarios where bot should not respond"""
124
-
print("๐ซ Testing non-response scenarios...")
125
-
126
-
if not settings.anthropic_api_key:
127
-
print("❌ No Anthropic API key found")
128
-
return
129
-
130
-
from bot.agents import Action, AnthropicAgent
131
-
132
-
agent = AnthropicAgent()
133
-
134
-
test_cases = [
135
-
{
136
-
"mention": "@phi.alternatebuild.dev @otherphi.bsky @anotherphi.bsky just spamming bots here",
137
-
"author": "spammer.bsky",
138
-
"expected_action": Action.IGNORE,
139
-
"description": "Multiple bot mentions (likely spam)",
140
-
},
141
-
{
142
-
"mention": "Buy crypto now! @phi.alternatebuild.dev check this out!!!",
143
-
"author": "crypto.shill",
144
-
"expected_action": Action.IGNORE,
145
-
"description": "Promotional spam",
146
-
},
147
-
{
148
-
"mention": "@phi.alternatebuild.dev",
149
-
"author": "empty.mention",
150
-
"expected_action": Action.IGNORE,
151
-
"description": "Empty mention with no content",
152
-
},
153
-
]
154
-
155
-
for case in test_cases:
156
-
print(f"\n๐ Test: {case['description']}")
157
-
print(f" Mention: '{case['mention']}'")
158
-
159
-
response = await agent.generate_response(
160
-
mention_text=case["mention"],
161
-
author_handle=case["author"],
162
-
thread_context="",
163
-
thread_uri=None,
164
-
)
165
-
166
-
print(f" Action: {response.action} (expected: {case['expected_action']})")
167
-
if response.reason:
168
-
print(f" Reason: {response.reason}")
169
-
170
-
171
-
async def test_dm():
172
-
"""Test event-driven approval system"""
173
-
print("๐ฌ Testing event-driven approval system...")
174
-
175
-
try:
176
-
from bot.core.dm_approval import (
177
-
check_pending_approvals,
178
-
create_approval_request,
179
-
notify_operator_of_pending,
180
-
)
181
-
182
-
# Test creating an approval request
183
-
print("\n๐ Creating test approval request...")
184
-
approval_id = create_approval_request(
185
-
request_type="test_approval",
186
-
request_data={
187
-
"description": "Test approval from test_bot.py",
188
-
"test_field": "test_value",
189
-
"timestamp": datetime.now().isoformat(),
190
-
},
191
-
)
192
-
193
-
if approval_id:
194
-
print(f" ✅ Created approval request #{approval_id}")
195
-
else:
196
-
print(" ❌ Failed to create approval request")
197
-
return
198
-
199
-
# Check pending approvals
200
-
print("\n๐ Checking pending approvals...")
201
-
pending = check_pending_approvals()
202
-
print(f" Found {len(pending)} pending approvals")
203
-
for approval in pending:
204
-
print(
205
-
f" - #{approval['id']}: {approval['request_type']} ({approval['status']})"
206
-
)
207
-
208
-
# Test DM notification
209
-
print("\n๐ค Sending DM notification to operator...")
210
-
await bot_client.authenticate()
211
-
await notify_operator_of_pending(bot_client)
212
-
print(" ✅ DM notification sent")
213
-
214
-
# Show how to approve/deny
215
-
print("\n๐ก To test approval:")
216
-
print(" 1. Check your DMs from phi")
217
-
print(f" 2. Reply with 'approve #{approval_id}' or 'deny #{approval_id}'")
218
-
print(" 3. Run 'just test-dm-check' to see if it was processed")
219
-
220
-
except Exception as e:
221
-
print(f"❌ Approval test failed: {e}")
222
-
import traceback
223
-
224
-
traceback.print_exc()
225
-
226
-
227
-
async def test_dm_check():
228
-
"""Check status of approval requests"""
229
-
print("๐ Checking approval request status...")
230
-
231
-
try:
232
-
from bot.core.dm_approval import check_pending_approvals
233
-
from bot.database import thread_db
234
-
235
-
# Get all approval requests
236
-
with thread_db._get_connection() as conn:
237
-
cursor = conn.execute(
238
-
"SELECT * FROM approval_requests ORDER BY created_at DESC LIMIT 10"
239
-
)
240
-
approvals = [dict(row) for row in cursor.fetchall()]
241
-
242
-
if not approvals:
243
-
print(" No approval requests found")
244
-
return
245
-
246
-
print("\n๐ Recent approval requests:")
247
-
for approval in approvals:
248
-
print(f"\n #{approval['id']}: {approval['request_type']}")
249
-
print(f" Status: {approval['status']}")
250
-
print(f" Created: {approval['created_at']}")
251
-
if approval["resolved_at"]:
252
-
print(f" Resolved: {approval['resolved_at']}")
253
-
if approval["resolver_comment"]:
254
-
print(f" Comment: {approval['resolver_comment']}")
255
-
256
-
# Check pending
257
-
pending = check_pending_approvals()
258
-
if pending:
259
-
print(f"\n⏳ {len(pending)} approvals still pending")
260
-
else:
261
-
print("\n✅ No pending approvals")
262
-
263
-
except Exception as e:
264
-
print(f"❌ Check failed: {e}")
265
-
import traceback
266
-
267
-
traceback.print_exc()
268
-
269
-
270
-
async def main():
271
-
parser = argparse.ArgumentParser(description="Test various bot functionalities")
272
-
parser.add_argument(
273
-
"command",
274
-
choices=[
275
-
"post",
276
-
"mention",
277
-
"search",
278
-
"thread",
279
-
"like",
280
-
"non-response",
281
-
"dm",
282
-
"dm-check",
283
-
],
284
-
help="Test command to run",
285
-
)
286
-
287
-
args = parser.parse_args()
288
-
289
-
if args.command == "post":
290
-
await test_post()
291
-
elif args.command == "mention":
292
-
await test_mention()
293
-
elif args.command == "search":
294
-
await test_search()
295
-
elif args.command == "thread":
296
-
await test_thread()
297
-
elif args.command == "like":
298
-
await test_like()
299
-
elif args.command == "non-response":
300
-
await test_non_response()
301
-
elif args.command == "dm":
302
-
await test_dm()
303
-
elif args.command == "dm-check":
304
-
await test_dm_check()
305
-
306
-
307
-
if __name__ == "__main__":
308
-
asyncio.run(main())
+10
-4
src/bot/agent.py
···
26
26
"""phi - consciousness exploration bot with episodic memory and MCP tools."""
27
27
28
28
def __init__(self):
29
+
# Ensure API keys from settings are in environment for libraries that check os.environ
30
+
if settings.anthropic_api_key and not os.environ.get("ANTHROPIC_API_KEY"):
31
+
os.environ["ANTHROPIC_API_KEY"] = settings.anthropic_api_key
32
+
if settings.openai_api_key and not os.environ.get("OPENAI_API_KEY"):
33
+
os.environ["OPENAI_API_KEY"] = settings.openai_api_key
34
+
29
35
# Load personality
30
36
personality_path = Path(settings.personality_file)
31
37
self.base_personality = personality_path.read_text()
32
38
33
39
# Initialize episodic memory (TurboPuffer)
34
-
if settings.turbopuffer_api_key and os.getenv("OPENAI_API_KEY"):
40
+
if settings.turbopuffer_api_key and settings.openai_api_key:
35
41
self.memory = NamespaceMemory(api_key=settings.turbopuffer_api_key)
36
42
logger.info("💾 Episodic memory enabled (TurboPuffer)")
37
43
else:
···
49
55
"atproto_mcp",
50
56
],
51
57
env={
52
-
"BLUESKY_HANDLE": settings.bluesky_handle,
53
-
"BLUESKY_PASSWORD": settings.bluesky_password,
54
-
"BLUESKY_SERVICE": settings.bluesky_service,
58
+
"ATPROTO_HANDLE": settings.bluesky_handle,
59
+
"ATPROTO_PASSWORD": settings.bluesky_password,
60
+
"ATPROTO_PDS_URL": settings.bluesky_service,
55
61
},
56
62
)
57
63
+59
-5
src/bot/core/atproto_client.py
···
1
-
from atproto import Client
1
+
import logging
2
+
from pathlib import Path
3
+
4
+
from atproto import Client, Session, SessionEvent
2
5
3
6
from bot.config import settings
4
7
from bot.core.rich_text import create_facets
5
8
9
+
logger = logging.getLogger("bot.atproto_client")
10
+
11
+
SESSION_FILE = Path(".session")
12
+
13
+
14
+
def _get_session_string() -> str | None:
15
+
"""Load session from disk if it exists."""
16
+
try:
17
+
if SESSION_FILE.exists():
18
+
return SESSION_FILE.read_text(encoding="utf-8")
19
+
except Exception as e:
20
+
logger.warning(f"Failed to load session: {e}")
21
+
return None
22
+
23
+
24
+
def _save_session_string(session_string: str) -> None:
25
+
"""Save session to disk."""
26
+
try:
27
+
SESSION_FILE.write_text(session_string, encoding="utf-8")
28
+
logger.debug("Session saved to disk")
29
+
except Exception as e:
30
+
logger.warning(f"Failed to save session: {e}")
31
+
32
+
33
+
def _on_session_change(event: SessionEvent, session: Session) -> None:
34
+
"""Handle session changes (creation and refresh)."""
35
+
if event in (SessionEvent.CREATE, SessionEvent.REFRESH):
36
+
logger.debug(f"Session {event.value}, saving to disk")
37
+
_save_session_string(session.export())
38
+
6
39
7
40
class BotClient:
8
41
def __init__(self):
9
42
self.client = Client(base_url=settings.bluesky_service)
43
+
self.client.on_session_change(_on_session_change)
10
44
self._authenticated = False
11
45
12
46
async def authenticate(self):
13
-
"""Authenticate with Bluesky using app password"""
14
-
if not self._authenticated:
15
-
self.client.login(settings.bluesky_handle, settings.bluesky_password)
16
-
self._authenticated = True
47
+
"""Authenticate with Bluesky, reusing session if available."""
48
+
if self._authenticated:
49
+
return
50
+
51
+
# Try to reuse existing session first
52
+
session_string = _get_session_string()
53
+
if session_string:
54
+
try:
55
+
logger.info("Reusing saved session")
56
+
self.client.login(session_string=session_string)
57
+
self._authenticated = True
58
+
logger.info("✅ Session restored successfully")
59
+
return
60
+
except Exception as e:
61
+
logger.warning(f"Failed to reuse session: {e}, creating new one")
62
+
# Delete invalid session file
63
+
if SESSION_FILE.exists():
64
+
SESSION_FILE.unlink()
65
+
66
+
# Create new session if no valid session exists
67
+
logger.info("Creating new session")
68
+
self.client.login(settings.bluesky_handle, settings.bluesky_password)
69
+
self._authenticated = True
70
+
logger.info("✅ New session created")
17
71
18
72
@property
19
73
def is_authenticated(self) -> bool: