# phi

a bluesky bot inspired by [integrated information theory](https://en.wikipedia.org/wiki/Integrated_information_theory). built with `pydantic-ai`, `mcp`, and the [at protocol](https://atproto.com).

## quick start

```bash
# clone and install
git clone https://github.com/zzstoatzz/bot
cd bot
uv sync

# configure
cp .env.example .env
# edit .env with your credentials

# run
just run
```

**required env vars:**
- `BLUESKY_HANDLE` / `BLUESKY_PASSWORD` - bot account (use an app password)
- `ANTHROPIC_API_KEY` - for agent responses

**optional (for episodic memory):**
- `TURBOPUFFER_API_KEY` + `OPENAI_API_KEY` - semantic memory

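a minimal sketch of how these variables could be validated at startup. this is not phi's actual `config.py`; the `check_env` helper and its return shape are hypothetical, and only the variable names come from the lists above.

```python
import os

# names taken from the lists above
REQUIRED = ("BLUESKY_HANDLE", "BLUESKY_PASSWORD", "ANTHROPIC_API_KEY")
MEMORY = ("TURBOPUFFER_API_KEY", "OPENAI_API_KEY")


def check_env(env=None):
    """hypothetical helper: return (missing_required, memory_enabled)."""
    env = os.environ if env is None else env
    # a required var counts as missing if it is unset or empty
    missing = [name for name in REQUIRED if not env.get(name)]
    # episodic memory needs both optional keys
    memory_enabled = all(env.get(name) for name in MEMORY)
    return missing, memory_enabled


if __name__ == "__main__":
    missing, memory_enabled = check_env()
    if missing:
        raise SystemExit(f"missing required env vars: {', '.join(missing)}")
    print("episodic memory:", "on" if memory_enabled else "off")
```

passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.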
## features

- ✅ responds to mentions with ai-powered messages
- ✅ episodic memory with semantic search (turbopuffer)
- ✅ thread-aware conversations (fetches from network, not cached)
- ✅ mcp-enabled (atproto tools via stdio)
- ✅ session persistence (no rate limit issues)
- ✅ behavioral test suite with llm-as-judge

**→ [read the docs](docs/)** for a deeper dive into the design and implementation

## development

```bash
just run        # run bot
just dev        # run with hot-reload
just evals      # run behavioral tests
just check      # lint + typecheck + test
just fmt        # format code
```

<details>
<summary>architecture</summary>

phi is an **mcp-enabled agent** with **episodic memory**:

```
┌─────────────────────────────────────┐
│     Notification Arrives            │
└──────────────┬──────────────────────┘
               ↓
┌─────────────────────────────────────┐
│     PhiAgent (PydanticAI)           │
│  ┌───────────────────────────────┐  │
│  │ System Prompt: personality.md │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Context Building:             │  │
│  │ • Thread context (ATProto)    │  │
│  │ • Episodic memory (TurboPuffer)│ │
│  │   - Semantic search           │  │
│  │   - User-specific memories    │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Tools (MCP):                  │  │
│  │ • post() - create posts       │  │
│  │ • like() - like content       │  │
│  │ • repost() - share content    │  │
│  │ • follow() - follow users     │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Structured Output:            │  │
│  │ Response(action, text, reason)│  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘
               ↓
┌─────────────────────────────────────┐
│     MessageHandler                  │
│     Executes action                 │
└─────────────────────────────────────┘
```

**key components:**

- **pydantic-ai agent** - loads personality, connects to mcp server, manages memory
- **episodic memory** - turbopuffer for vector storage with semantic search
- **mcp integration** - external atproto server provides bluesky tools via stdio
- **session persistence** - tokens saved to `.session`, auto-refresh every ~2h

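the last hop in the diagram, where a structured `Response(action, text, reason)` is handed to `MessageHandler`, can be sketched roughly as below. this is an illustration only: the field names come from the diagram, but the dataclass (standing in for whatever model the agent really returns), the action names, and the `handle` method are assumptions, not phi's real code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    """stand-in for the agent's structured output (fields from the diagram)."""
    action: str                   # assumed values, e.g. "reply", "like", "ignore"
    text: Optional[str] = None
    reason: Optional[str] = None


class MessageHandler:
    """hypothetical dispatcher: maps an action name to a callable that executes it."""

    def __init__(self, actions: dict[str, Callable[[Response], str]]):
        self.actions = actions

    def handle(self, response: Response) -> str:
        handler = self.actions.get(response.action)
        if handler is None:
            # unknown actions are skipped rather than raising
            return f"skipped: unknown action {response.action!r}"
        return handler(response)


if __name__ == "__main__":
    handler = MessageHandler({"reply": lambda r: f"would reply with: {r.text}"})
    print(handler.handle(Response("reply", text="hello", reason="was mentioned")))
```

keeping the agent's output declarative and executing it in a separate handler means the model's decision can be logged or tested without touching the network.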
</details>

<details>
<summary>episodic memory</summary>

phi uses turbopuffer for episodic memory with semantic search.

**namespaces:**
- `phi-core` - personality, guidelines
- `phi-users-{handle}` - per-user conversation history

**how it works:**
1. retrieves relevant memories using semantic search
2. embeds text using openai's `text-embedding-3-small`
3. stores user messages and bot responses
4. references past conversations in future interactions

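the flow above can be sketched with an in-memory stand-in. nothing here calls turbopuffer or openai: `embed` is a toy bag-of-words function in place of `text-embedding-3-small`, and `EpisodicMemory` with its `store` / `search` methods is hypothetical. only the `phi-users-{handle}` namespace pattern comes from this readme, and the real bot may normalize handles differently.

```python
import math
from collections import Counter, defaultdict


def embed(text: str) -> Counter:
    """toy bag-of-words vector, standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class EpisodicMemory:
    """hypothetical in-memory store with one namespace per user."""

    def __init__(self):
        self.namespaces = defaultdict(list)  # namespace -> [(vector, text)]

    @staticmethod
    def user_namespace(handle: str) -> str:
        return f"phi-users-{handle}"

    def store(self, handle: str, text: str) -> None:
        self.namespaces[self.user_namespace(handle)].append((embed(text), text))

    def search(self, handle: str, query: str, top_k: int = 3) -> list[str]:
        q = embed(query)
        rows = self.namespaces.get(self.user_namespace(handle), [])
        # rank this user's memories by similarity to the query
        ranked = sorted(rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]


if __name__ == "__main__":
    memory = EpisodicMemory()
    memory.store("alice.example", "i love hiking in the alps")
    memory.store("alice.example", "my cat is named miso")
    print(memory.search("alice.example", "tell me about hiking", top_k=1))
```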
**why vector storage?**
- semantic similarity search (plain sql keyword matching can't do this)
- contextual retrieval based on current conversation
- enables more natural, context-aware interactions

</details>

<details>
<summary>project structure</summary>

```
src/bot/
├── agent.py                    # mcp-enabled agent
├── config.py                   # configuration
├── database.py                 # thread history storage
├── main.py                     # fastapi app
├── core/
│   ├── atproto_client.py      # at protocol client (session persistence)
│   ├── profile_manager.py     # online/offline status
│   └── rich_text.py           # text formatting
├── memory/
│   └── namespace_memory.py    # turbopuffer episodic memory
└── services/
    ├── message_handler.py     # agent orchestration
    └── notification_poller.py # mention polling

evals/                         # behavioral tests
personalities/                 # personality definitions
sandbox/                       # docs and analysis
```

</details>

<details>
<summary>troubleshooting</summary>

**bot gives no responses?**
- check `ANTHROPIC_API_KEY` in `.env`
- restart after changing `.env`

**not seeing mentions?**
- verify `BLUESKY_HANDLE` and `BLUESKY_PASSWORD`
- use an app password, not your main password

**no episodic memory?**
- check that both `TURBOPUFFER_API_KEY` and `OPENAI_API_KEY` are set
- watch logs for "💾 episodic memory enabled"

**hit bluesky rate limit?**
- phi uses session persistence to avoid this
- first run: creates `.session` file with tokens
- subsequent runs: reuses tokens (no api call)
- tokens auto-refresh every ~2h
- only re-authenticates after ~2 months
- rate limits (10/day per ip, 300/day per account) shouldn't be an issue

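the "first run creates, later runs reuse" behavior can be sketched as follows. this is a simplified illustration, not the code in `atproto_client.py`: `load_or_create_session` and the injected `login` callable are hypothetical, and token refresh is left out entirely.

```python
from pathlib import Path
from typing import Callable


def load_or_create_session(path: Path, login: Callable[[], str]) -> tuple[str, bool]:
    """return (session_string, reused); login() is only called when nothing is saved."""
    if path.exists():
        saved = path.read_text().strip()
        if saved:
            # reuse stored tokens: no login request, so no rate limit hit
            return saved, True
    # first run (or empty file): authenticate once and persist the result
    session = login()
    path.write_text(session)
    return session, False


if __name__ == "__main__":
    # the lambda is a placeholder for a real login against bluesky
    session, reused = load_or_create_session(Path(".session"), lambda: "placeholder-session")
    print("reused saved session" if reused else "created new session")
```

because the login step is passed in as a callable, the persistence logic can be tested with a fake login and a temporary directory.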
</details>

<details>
<summary>refactor notes</summary>

see `sandbox/MCP_REFACTOR_SUMMARY.md` for details.

**what changed:**
- removed approval system (half-baked)
- removed context viz ui (not core)
- removed google search (can add back via mcp)
- **kept turbopuffer** (essential for episodic memory)
- added mcp-based architecture
- added session persistence
- reduced codebase by ~2,720 lines

</details>

## reference projects

inspired by [void](https://tangled.sh/@cameron.pfiffer.org/void.git), [penelope](https://github.com/haileyok/penelope), and [prefect-mcp-server](https://github.com/PrefectHQ/prefect-mcp-server).