a digital entity named phi that roams bsky
mcp integration#
phi uses the model context protocol to interact with bluesky.
what is mcp#
mcp is a protocol for connecting language models to external tools and data sources via a client-server architecture.
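as a rough illustration of what "connecting to external tools" looks like on the wire: mcp messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. the sketch below only builds such a request; the helper name `make_tool_call` and the `at://` URI are placeholders, not part of phi's code.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,          # lets the client match the response to this request
        "method": "tools/call",    # MCP method for invoking a tool
        "params": {"name": tool, "arguments": arguments},
    }


# placeholder URI, for illustration only
request = make_tool_call(1, "like", {"uri": "at://example/post/1"})
print(json.dumps(request, indent=2))
```

the client (here, the agent framework) builds these requests; the server only has to know how to execute the named tool.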
why mcp instead of direct API calls?
- clean separation: tools live in an external server, not in the agent
- extensibility: add new tools without modifying the agent
- reusability: the same server can be used by other agents
- standard protocol: one interface for tools, resources, and prompts
architecture#
PhiAgent (PydanticAI)
↓ stdio
ATProto MCP Server
↓ HTTPS
Bluesky API
the agent communicates with the MCP server via stdio. the server handles all bluesky API interactions.
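a minimal sketch of the stdio leg, assuming the standard mcp stdio transport where each JSON-RPC message is written as one line of JSON. an in-memory `io.StringIO` stands in for the server's stdin/stdout pipe, and `write_message` / `read_message` are hypothetical helper names.

```python
import io
import json
from typing import Optional


def write_message(stream, message: dict) -> None:
    """Write one message as a single line of JSON, terminated by a newline."""
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def read_message(stream) -> Optional[dict]:
    """Read one newline-delimited message; return None when the stream is exhausted."""
    line = stream.readline()
    if not line:
        return None
    return json.loads(line)


# StringIO stands in for the pipe between the agent and the server process
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
pipe.seek(0)
received = read_message(pipe)
end_of_stream = read_message(pipe)
```

in the real setup the framework spawns the server as a subprocess and handles this framing itself; the agent code never touches the pipe directly.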
available tools#
from the ATProto MCP server:
- post(text, reply_to?, quote?) - create posts and replies
- like(uri) - like a post
- repost(uri) - share a post
- follow(handle) - follow a user
- search(query) - search posts
- create_thread(posts) - create multi-post threads
how it works#
- agent decides to use a tool (e.g., "i should reply")
- pydantic-ai sends tool call to MCP server via stdio
- MCP server executes bluesky API call
- result returned to agent
- agent continues with next action
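the flow above can be sketched with an in-memory stand-in for the server. this is not how pydantic-ai implements it; `FakeToolServer`, `run_step`, and `decide_reply` are hypothetical names, and the result shape is invented for the example.

```python
import asyncio
from typing import Any, Callable


class FakeToolServer:
    """In-memory stand-in for the MCP server; records calls instead of hitting bluesky."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    async def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        self.calls.append((name, arguments))
        return {"ok": True, "tool": name}  # invented result shape


async def run_step(
    decide: Callable[[], tuple[str, dict[str, Any]]],
    server: FakeToolServer,
) -> dict[str, Any]:
    name, arguments = decide()                         # agent decides to use a tool
    result = await server.call_tool(name, arguments)   # call goes to the server, which executes it
    return result                                      # result comes back to the agent


def decide_reply() -> tuple[str, dict[str, Any]]:
    # placeholder decision: reply to a made-up post URI
    return "post", {"text": "hello", "reply_to": "at://example/post/1"}


server = FakeToolServer()
result = asyncio.run(run_step(decide_reply, server))
```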
agent configuration#
# src/bot/agent.py
from pydantic_ai import Agent

# AgentDeps, Response, personality, MCPManager, handle, and password
# are assumed to be defined or imported elsewhere in the project
agent = Agent(
    "claude-3-5-sonnet-20241022",
    deps_type=AgentDeps,
    result_type=Response,
    system_prompt=personality,
)

# mcp server connected via stdio
mcp = MCPManager()
mcp.add_server(
    name="atproto",
    command=["uvx", "atproto-mcp"],
    env={"BLUESKY_HANDLE": handle, "BLUESKY_PASSWORD": password},
)

# tools exposed to agent
async with mcp.run() as context:
    for tool in context.list_tools():
        agent.register_tool(tool)
structured outputs#
agent returns typed responses instead of using tools directly:
class Response(BaseModel):
    action: Literal["reply", "like", "repost", "ignore"]
    text: str | None = None
    reason: str | None = None
the message handler interprets the response and, if the action requires it, executes it via MCP tools.
why structured outputs?
- clear contract between agent and handler
- easier testing (mock response objects)
- explicit decision tracking
- agent focuses on "what to do", handler focuses on "how to do it"
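a minimal sketch of the handler side, to show why a typed response is easy to test. assumptions: a dataclass stands in for the pydantic `Response` so the example has no dependencies, `RecordingTools` and `handle` are hypothetical names, and the argument shapes passed to the tools (for example `reply_to` as a plain URI) are guesses rather than the ATProto MCP server's actual schema.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Literal, Optional


@dataclass
class Response:
    """Dataclass stand-in for the pydantic Response model."""
    action: Literal["reply", "like", "repost", "ignore"]
    text: Optional[str] = None
    reason: Optional[str] = None


class RecordingTools:
    """Mock MCP client that records tool calls instead of executing them."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    async def call_tool(self, name: str, arguments: dict[str, Any]) -> None:
        self.calls.append((name, arguments))


async def handle(response: Response, post_uri: str, tools: RecordingTools) -> bool:
    """Map the agent's decision to a tool call; return False if nothing was done."""
    if response.action == "reply":
        if not response.text:
            raise ValueError("a reply needs text")
        await tools.call_tool("post", {"text": response.text, "reply_to": post_uri})
    elif response.action == "like":
        await tools.call_tool("like", {"uri": post_uri})
    elif response.action == "repost":
        await tools.call_tool("repost", {"uri": post_uri})
    else:  # "ignore": no tool call needed
        return False
    return True


tools = RecordingTools()
uri = "at://example/post/1"  # placeholder URI
acted = asyncio.run(handle(Response(action="like", reason="nice post"), uri, tools))
ignored = asyncio.run(handle(Response(action="ignore"), uri, tools))
```

because the handler only sees a `Response` and a tool client, both can be swapped for fakes in tests without starting a model or an mcp server.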