languages/python/README.md

# python

assumes 3.12+.

## language

- [typing](./language/typing.md)
- [async](./language/async.md)
- [patterns](./language/patterns.md)

## ecosystem

- [uv](./ecosystem/uv.md)
- [project setup](./ecosystem/project-setup.md)
- [tooling](./ecosystem/tooling.md)
- [pydantic](./ecosystem/pydantic.md)
- [mcp](./ecosystem/mcp.md)

## sources

| project | notes |
|---------|-------|
| [pdsx](https://github.com/zzstoatzz/pdsx) | async patterns, MCP, CLI |
| [plyr-python-client](https://github.com/zzstoatzz/plyr-python-client) | multi-package workspace |
| [fastmcp](https://github.com/jlowin/fastmcp) | decorator patterns, generics |
| [docket](https://github.com/chrisguidry/docket) | dependency injection, async lifecycle |
languages/python/ecosystem/mcp.md

# mcp

MCP (Model Context Protocol) lets you build tools that LLMs can use. fastmcp makes this straightforward.

## what MCP is

MCP servers expose:
- **tools** - functions LLMs can call (actions, side effects)
- **resources** - read-only data (like GET endpoints)
- **prompts** - reusable message templates

clients (like Claude) discover and call these over stdio or HTTP.

## basic server

```python
from fastmcp import FastMCP

mcp = FastMCP("my-server")

@mcp.tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@mcp.resource("config://version")
def get_version() -> str:
    return "1.0.0"

if __name__ == "__main__":
    mcp.run()
```

fastmcp generates JSON schemas from type hints and docstrings automatically.

## running

```bash
# stdio (default, for local tools)
python server.py

# http (for deployment)
fastmcp run server.py --transport http --port 8000
```

## tools vs resources

**tools** do things:
```python
@mcp.tool
async def create_post(text: str) -> dict:
    """Create a new post."""
    return await api.create(text)
```

**resources** read things:
```python
@mcp.resource("posts://{post_id}")
async def get_post(post_id: str) -> dict:
    """Get a post by ID."""
    return await api.get(post_id)
```

## context

access MCP capabilities within tools:

```python
from fastmcp import FastMCP, Context

mcp = FastMCP("server")

@mcp.tool
async def process(uri: str, ctx: Context) -> str:
    await ctx.info(f"Processing {uri}...")
    data = await ctx.read_resource(uri)
    await ctx.report_progress(50, 100)
    return data
```

## middleware

add authentication or other cross-cutting concerns:

```python
from fastmcp import FastMCP
from fastmcp.server.middleware import Middleware

class AuthMiddleware(Middleware):
    async def on_call_tool(self, context, call_next):
        # extract auth from headers, set context state
        return await call_next(context)

mcp = FastMCP("server")
mcp.add_middleware(AuthMiddleware())
```

## decorator patterns

add parameters dynamically (from pdsx):

```python
import inspect
from functools import wraps

def filterable(fn):
    """Add a _filter parameter for JMESPath filtering."""
    @wraps(fn)
    async def wrapper(*args, _filter: str | None = None, **kwargs):
        result = await fn(*args, **kwargs)
        if _filter:
            import jmespath
            return jmespath.search(_filter, result)
        return result

    # modify signature to include new param
    sig = inspect.signature(fn)
    params = list(sig.parameters.values())
    params.append(inspect.Parameter(
        "_filter",
        inspect.Parameter.KEYWORD_ONLY,
        default=None,
        annotation=str | None,
    ))
    wrapper.__signature__ = sig.replace(parameters=params)
    return wrapper

@mcp.tool
@filterable
async def list_records(collection: str) -> list[dict]:
    ...
```

## response size protection

LLMs have context limits. protect against flooding:

```python
MAX_RESPONSE_CHARS = 30000

def truncate_response(records: list) -> list:
    import json
    serialized = json.dumps(records)
    if len(serialized) <= MAX_RESPONSE_CHARS:
        return records
    # truncate and add message about using _filter
    ...
```
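
one way the elided branch could go - a sketch, not pdsx's actual logic: keep whole records until the character budget is spent, then append a marker that points the model at `_filter`:

```python
def truncate_response(records: list) -> list:
    import json

    if len(json.dumps(records)) <= MAX_RESPONSE_CHARS:
        return records
    kept: list = []
    used = 2  # account for the enclosing brackets
    for record in records:
        size = len(json.dumps(record)) + 2  # plus separator
        if used + size > MAX_RESPONSE_CHARS:
            break
        kept.append(record)
        used += size
    kept.append({"_truncated": True, "hint": "pass _filter to narrow the result"})
    return kept
```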

## claude code plugins

structure for Claude Code integration:

```
.claude-plugin/
├── plugin.json         # plugin definition
└── marketplace.json    # marketplace metadata

skills/
└── domain/
    └── SKILL.md        # contextual guidance
```

**plugin.json**:
```json
{
  "name": "myserver",
  "description": "what it does",
  "mcpServers": "./.mcp.json"
}
```

skills are markdown files loaded as context when relevant to the task.

## entry points

expose both CLI and MCP server:

```toml
[project.scripts]
mytool = "mytool.cli:main"
mytool-mcp = "mytool.mcp:main"
```
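
both entry points just need zero-argument callables. a minimal sketch of the MCP side (module name and contents hypothetical):

```python
# mytool/mcp.py
from fastmcp import FastMCP

mcp = FastMCP("mytool")

def main() -> None:
    # console scripts call this with no arguments
    mcp.run()
```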

sources:
- [fastmcp](https://github.com/jlowin/fastmcp)
- [pdsx](https://github.com/zzstoatzz/pdsx)
- [prefect-mcp-server-demo](https://github.com/zzstoatzz/prefect-mcp-server-demo)
- [gofastmcp.com](https://gofastmcp.com)
languages/python/ecosystem/project-setup.md

# project setup

consistent structure across projects: src/ layout, pyproject.toml as single source of truth, justfile for commands.

## directory structure

```
myproject/
├── src/myproject/
│   ├── __init__.py
│   ├── cli.py
│   ├── settings.py
│   └── _internal/        # private implementation
├── tests/
├── pyproject.toml
├── justfile
└── .pre-commit-config.yaml
```

the `src/` layout prevents accidental imports from the working directory. your package is only importable when properly installed.

## pyproject.toml

```toml
[project]
name = "myproject"
description = "what it does"
readme = "README.md"
requires-python = ">=3.10"
dynamic = ["version"]
dependencies = [
    "httpx>=0.27",
    "pydantic>=2.0",
]

[project.scripts]
myproject = "myproject.cli:main"

[build-system]
requires = ["hatchling", "uv-dynamic-versioning"]
build-backend = "hatchling.build"

[tool.hatch.version]
source = "uv-dynamic-versioning"

[tool.uv-dynamic-versioning]
vcs = "git"
style = "pep440"
bump = true
fallback-version = "0.0.0"

[dependency-groups]
dev = [
    "pytest>=8.0",
    "ruff>=0.8",
    "ty>=0.0.1a6",
]
```

key patterns:
- `dynamic = ["version"]` - version comes from git tags, not manual editing
- `[project.scripts]` - CLI entry points

## dependency groups vs optional dependencies

these look similar but serve different purposes.

**dependency groups** (PEP 735) are local-only. they never appear in published package metadata. users who `pip install` your package won't see them:

```toml
[dependency-groups]
dev = ["pytest", "ruff"]
docs = ["mkdocs", "mkdocs-material"]
```

install with `uv sync --group dev`. CI can install only what it needs - see the sketch below.

**optional dependencies** are published in package metadata. users can install them:

```toml
[project.optional-dependencies]
aws = ["prefect-aws"]
mcp = ["fastmcp>=2.0"]
```

install with `pip install 'mypackage[aws]'` or `uv add 'mypackage[mcp]'`.

use groups for dev/test/CI. use optional deps for features consumers might want.

from [switching a big python library from setup.py to pyproject.toml](https://blog.zzstoatzz.io/switching-a-big-python-library-from-setuppy-to-pyprojecttoml/)
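
a hypothetical docs job in CI, syncing only what that step needs:

```bash
# install runtime deps plus just the docs group
uv sync --group docs
uv run mkdocs build
```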

## versioning from git tags

with `uv-dynamic-versioning`, your version is derived from git:

```bash
git tag v0.1.0
git push --tags
```

no more editing `__version__` or `pyproject.toml` for releases.

## justfile

```makefile
check-uv:
    #!/usr/bin/env sh
    if ! command -v uv >/dev/null 2>&1; then
        echo "uv is not installed. Install: curl -LsSf https://astral.sh/uv/install.sh | sh"
        exit 1
    fi

install: check-uv
    uv sync

test:
    uv run pytest tests/ -xvs

lint:
    uv run ruff format src/ tests/
    uv run ruff check src/ tests/ --fix

check:
    uv run ty check
```

run with `just test`, `just lint`, etc.

## multiple entry points

for projects with both CLI and MCP server:

```toml
[project.scripts]
myproject = "myproject.cli:main"
myproject-mcp = "myproject.mcp:main"
```

## uv workspaces

for multi-package repos (like plyr-python-client):

```
myproject/
├── packages/
│   ├── core/
│   │   ├── src/core/
│   │   └── pyproject.toml
│   └── mcp/
│       ├── src/mcp/
│       └── pyproject.toml
├── pyproject.toml        # root workspace config
└── uv.lock
```

root pyproject.toml:

```toml
[tool.uv.workspace]
members = ["packages/*"]

[tool.uv.sources]
core = { workspace = true }
mcp = { workspace = true }
```

packages can depend on each other. one lockfile for the whole workspace.
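
a member declares its sibling as an ordinary dependency; the root `[tool.uv.sources]` entry redirects resolution to the workspace checkout instead of PyPI. a sketch of `packages/mcp/pyproject.toml`:

```toml
[project]
name = "mcp"
version = "0.1.0"
dependencies = ["core"]  # resolved from packages/core via the workspace
```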

## simpler build backend

for projects that don't need dynamic versioning, `uv_build` is lighter:

```toml
[build-system]
requires = ["uv_build>=0.9.2,<0.10.0"]
build-backend = "uv_build"
```

sources:
- [pdsx/pyproject.toml](https://github.com/zzstoatzz/pdsx/blob/main/pyproject.toml)
- [plyr-python-client](https://github.com/zzstoatzz/plyr-python-client)
languages/python/ecosystem/pydantic.md

# pydantic

pydantic is a library, not the language. but it's become foundational enough that it's worth understanding.

## the core idea

python's type hints don't do anything at runtime by default. `def foo(x: int)` accepts strings, floats, whatever - the annotation is just documentation.

pydantic makes them real. define a model with type hints, and pydantic validates and coerces data to match:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

user = User(name="alice", age="25")            # age coerced to int
user = User(name="alice", age="not a number")  # raises ValidationError
```

this is why pydantic shows up everywhere in python - it bridges the gap between python's dynamic runtime and the desire for validated, typed data.

## settings from environment

the most common use: replacing `os.getenv()` calls with validated configuration.

```python
from pathlib import Path

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """settings for atproto cli."""

    model_config = SettingsConfigDict(
        env_file=str(Path.cwd() / ".env"),
        env_file_encoding="utf-8",
        extra="ignore",
        case_sensitive=False,
    )

    atproto_pds_url: str = Field(
        default="https://bsky.social",
        description="PDS URL",
    )
    atproto_handle: str = Field(default="", description="Your atproto handle")
    atproto_password: str = Field(default="", description="Your atproto app password")

settings = Settings()
```

`model_config` controls where settings come from (environment, .env files) and how to handle unknowns. required fields without defaults fail as soon as `Settings()` is instantiated - here, at import - not later when you try to use the value.

from [pdsx/_internal/config.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/config.py)

## annotated types for reusable validation

when multiple schemas share the same validation logic, bind it to the type itself instead of repeating `@field_validator` on each schema:

```python
from datetime import timedelta
from typing import Annotated

from pydantic import AfterValidator, BaseModel

def _validate_non_negative_timedelta(v: timedelta) -> timedelta:
    if v < timedelta(seconds=0):
        raise ValueError("timedelta must be non-negative")
    return v

NonNegativeTimedelta = Annotated[
    timedelta,
    AfterValidator(_validate_non_negative_timedelta),
]

class RunDeployment(BaseModel):
    schedule_after: NonNegativeTimedelta
```

benefits:
- write validation once
- field types become swappable interfaces
- types are self-documenting

from [coping with python's type system](https://blog.zzstoatzz.io/coping-with-python-type-system/)

## model_validator for side effects

run setup code when settings load:

```python
from typing import Self

from pydantic import model_validator
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    debug: bool = False

    @model_validator(mode="after")
    def configure_logging(self) -> Self:
        setup_logging(debug=self.debug)  # defined elsewhere in the app
        return self

settings = Settings()  # logging configured on import
```

the validator runs after all fields are set. use it for side effects that depend on configuration values.

from [bot/config.py](https://github.com/zzstoatzz/bot)

## when to use what

pydantic models are heavier than they look - they do a lot of work on instantiation. for internal data you control, python's `dataclasses` are simpler:

```python
from dataclasses import dataclass

@dataclass
class BatchResult:
    """result of a batch operation."""

    successful: list[str]
    failed: list[tuple[str, Exception]]

    @property
    def total(self) -> int:
        return len(self.successful) + len(self.failed)
```

no validation, no coercion, just a class with fields. use pydantic at boundaries (API input, config files, external data) where you need validation. use dataclasses for internal data structures.

from [pdsx/_internal/batch.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/batch.py)

sources:
- [how to use pydantic-settings](https://blog.zzstoatzz.io/how-to-use-pydantic-settings/)
- [coping with python's type system](https://blog.zzstoatzz.io/coping-with-python-type-system/)
languages/python/ecosystem/tooling.md

# tooling

ruff for linting and formatting. ty for type checking. pre-commit to enforce both.

## ruff

replaces black, isort, flake8, and dozens of plugins. one tool, fast.

```bash
uv run ruff format src/ tests/   # format
uv run ruff check src/ tests/    # lint
uv run ruff check --fix          # lint and auto-fix
```

### pyproject.toml config

```toml
[tool.ruff]
line-length = 88

[tool.ruff.lint]
fixable = ["ALL"]
extend-select = [
    "I",    # isort
    "B",    # flake8-bugbear
    "C4",   # flake8-comprehensions
    "UP",   # pyupgrade
    "SIM",  # flake8-simplify
    "RUF",  # ruff-specific
]
ignore = [
    "COM812",  # conflicts with formatter
]

[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["F401", "I001"]  # unused imports ok in __init__
"tests/**/*.py" = ["S101"]        # assert ok in tests
```

## ty

astral's new type checker. still early but fast and improving.

```bash
uv run ty check
```

### pyproject.toml config

```toml
[tool.ty.src]
include = ["src", "tests"]
exclude = ["**/node_modules", "**/__pycache__", ".venv"]

[tool.ty.environment]
python-version = "3.10"

[tool.ty.rules]
# start permissive, tighten over time
unknown-argument = "ignore"
no-matching-overload = "ignore"
```

## pre-commit

enforce standards before commits reach the repo.

### .pre-commit-config.yaml

```yaml
repos:
  - repo: https://github.com/abravalheri/validate-pyproject
    rev: v0.24.1
    hooks:
      - id: validate-pyproject

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.8.0
    hooks:
      - id: ruff-check
        args: [--fix, --exit-non-zero-on-fix]
      - id: ruff-format

  - repo: local
    hooks:
      - id: type-check
        name: type check
        entry: uv run ty check
        language: system
        types: [python]
        pass_filenames: false

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: no-commit-to-branch
        args: [--branch, main]
```

install with:

```bash
uv run pre-commit install
```

never use `--no-verify` to skip hooks. fix the issue instead.

## pytest

```toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
asyncio_default_fixture_loop_scope = "function"
testpaths = ["tests"]
```

`asyncio_mode = "auto"` means async tests just work - no `@pytest.mark.asyncio` needed.
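
for instance, this collects and passes with no marker (a minimal sketch):

```python
# tests/test_async.py - runs as-is under asyncio_mode = "auto"
import asyncio

async def test_sleep_returns_none():
    assert await asyncio.sleep(0) is None
```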

## alternatives

- [typsht](https://github.com/zzstoatzz/typsht) - parallel type checking across multiple checkers
- [prek](https://github.com/zzstoatzz/prek) - pre-commit reimplemented in rust

sources:
- [pdsx/.pre-commit-config.yaml](https://github.com/zzstoatzz/pdsx/blob/main/.pre-commit-config.yaml)
- [pdsx/pyproject.toml](https://github.com/zzstoatzz/pdsx/blob/main/pyproject.toml)
languages/python/ecosystem/uv.md

# uv

uv isn't "faster pip." it's cargo for python - a unified toolchain that changes what's practical to do.

## install

```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

## the commands you actually use

```bash
uv sync          # install deps from pyproject.toml
uv run pytest    # run in project environment
uv add httpx     # add a dependency
uvx ruff check   # run a tool without installing it
```

never use `uv pip`. that's the escape hatch, not the workflow.

## zero-setup environments

run tools without installing anything:

```bash
uvx flask --help
uvx ruff check .
uvx pytest
```

this creates an ephemeral environment, runs the tool, done. no virtualenv activation, no pip install.

## the repro pattern

testing specific versions without polluting your environment:

```bash
# test against a specific version
uv run --with 'pydantic==2.11.4' repro.py

# test a git branch before it's released
uv run --with pydantic@git+https://github.com/pydantic/pydantic.git@fix-branch repro.py

# combine: released package + unreleased fix
uv run --with prefect==3.1.3 --with pydantic@git+https://github.com/pydantic/pydantic.git@fix repro.py
```

for monorepos with subdirectories:

```bash
uv run --with git+https://github.com/prefecthq/prefect.git@branch#subdirectory=src/integrations/prefect-redis repro.py
```

## inline script metadata

PEP 723 lets you embed dependencies directly in a script:

```python
# /// script
# dependencies = ["httpx", "rich"]
# requires-python = ">=3.12"
# ///

import httpx
from rich import print

print(httpx.get("https://httpbin.org/get").json())
```

run with `uv run script.py` - uv reads the metadata and creates an environment with those dependencies. no pyproject.toml, no requirements.txt, just a self-contained script.

this is how you share reproducible examples. put the dependencies in the file itself, and anyone with uv can run it.

## shareable one-liners

no file needed:

```bash
uv run --with 'httpx==0.27.0' python -c 'import httpx; print(httpx.get("https://httpbin.org/get").json())'
```

share in github issues, slack, anywhere. anyone with uv can run it.

## stdin execution

pipe code directly:

```bash
echo 'import sys; print(sys.version)' | uv run -
pbpaste | uv run --with pandas -
```

## project workflow

```bash
uv init myproject       # create new project
cd myproject
uv add httpx pydantic   # add deps
uv sync                 # install everything
uv run python main.py   # run in environment
```

`uv sync` reads `pyproject.toml` and `uv.lock`, installs exactly what's specified.

## why this matters

the old way:
1. install python (which version?)
2. create virtualenv
3. activate it (did you remember?)
4. pip install (hope versions resolve)
5. run your code

the uv way:
1. `uv run your_code.py`

uv handles python versions, environments, and dependencies implicitly. you stop thinking about environment management.

sources:
- [but really, what's so good about uv???](https://blog.zzstoatzz.io/but-really-whats-so-good-about-uv/)
- [running list of repros via uv](https://blog.zzstoatzz.io/running-list-of-repros-via-uv/)
languages/python/language/async.md

# async

python's `async`/`await` syntax is straightforward. the interesting part is how you structure code around it.

## async with

the core insight from async python codebases: `async with` is how you manage resources. not try/finally, not callbacks - the context manager protocol.

when you open a connection, start a session, or acquire any resource that needs cleanup, you wrap it in an async context manager:

```python
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

from atproto import AsyncClient  # assumed import for this excerpt

@asynccontextmanager
async def get_atproto_client(
    require_auth: bool = False,
    operation: str = "this operation",
    target_repo: str | None = None,
) -> AsyncIterator[AsyncClient]:
    """get an atproto client using credentials from context or environment."""
    # pds_url, handle, password are resolved from settings (elided here)
    client = AsyncClient(pds_url)
    if require_auth and handle and password:
        await client.login(handle, password)
    try:
        yield client
    finally:
        pass  # AsyncClient doesn't need explicit cleanup
```

the caller writes `async with get_atproto_client() as client:` and cleanup happens automatically. this pattern appears constantly - database connections, HTTP sessions, file handles, locks.

from [pdsx/mcp/client.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/client.py)

the alternative - manual try/finally blocks scattered through the code, or worse, forgetting cleanup entirely - is why this pattern dominates. you encode the lifecycle once in the context manager, and every use site gets it right by default.

## ContextVar

python added `contextvars` to solve a specific problem: how do you have request-scoped state in async code without passing it through every function?

in sync code, you might use thread-locals. but async tasks can interleave on the same thread, so thread-locals don't work. `ContextVar` gives each task its own copy:

```python
import weakref
from contextvars import ContextVar

# Docket, Worker, FastMCP are each project's own types
_current_docket: ContextVar[Docket | None] = ContextVar("docket", default=None)
_current_worker: ContextVar[Worker | None] = ContextVar("worker", default=None)
_current_server: ContextVar[weakref.ref[FastMCP] | None] = ContextVar("server", default=None)
```

set it at the start of handling a request, and any code called from that task can access it. this is how frameworks like fastapi and fastmcp pass request context without threading it through every function signature.

the pattern: set at the boundary (request handler, task entry), read anywhere inside. reset when you're done.

from [fastmcp/server/dependencies.py](https://github.com/jlowin/fastmcp/blob/main/src/fastmcp/server/dependencies.py)
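
a minimal self-contained sketch of that set/read/reset cycle (names hypothetical):

```python
import asyncio
from contextvars import ContextVar

_current_user: ContextVar[str | None] = ContextVar("user", default=None)

async def do_work() -> None:
    assert _current_user.get() == "alice"  # read anywhere inside the task

async def handle_request(user: str) -> None:
    token = _current_user.set(user)  # set at the boundary
    try:
        await do_work()
    finally:
        _current_user.reset(token)   # restore the previous value

asyncio.run(handle_request("alice"))
```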

## concurrency control

`asyncio.gather()` runs tasks concurrently, but sometimes you need to limit how many run at once - rate limits, connection pools, memory constraints.

`asyncio.Semaphore` is the primitive for this. acquire before work, release after. the `async with` syntax makes it clean:

```python
semaphore = asyncio.Semaphore(concurrency)

async def delete_one(uri: str) -> None:
    """delete a single record with concurrency control."""
    async with semaphore:
        try:
            await delete_record(client, uri)
            successful.append(uri)
        except Exception as e:
            failed.append((uri, e))
            if fail_fast:
                raise

await asyncio.gather(*[delete_one(uri) for uri in uris])
```

at most `concurrency` delete operations run at once. the rest wait.

from [pdsx/_internal/batch.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/batch.py)

## connection pools

module-level singleton pool, lazily initialized:

```python
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager

import asyncpg

_pool: asyncpg.Pool | None = None

async def get_pool() -> asyncpg.Pool:
    global _pool
    if _pool is None:
        _pool = await asyncpg.create_pool(db_url, min_size=2, max_size=10)
    return _pool

@asynccontextmanager
async def get_conn() -> AsyncGenerator[asyncpg.Connection, None]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        yield conn
```

callers use `async with get_conn() as conn:` - the pool handles connection lifecycle.

## batch writes with unnest

postgres `unnest()` turns arrays into rows. one round trip for thousands of inserts:

```python
async def batch_upsert_follows(follows: list[tuple[str, str, str]]) -> None:
    follower_ids = [f[0] for f in follows]
    rkeys = [f[1] for f in follows]
    subject_ids = [f[2] for f in follows]

    async with get_conn() as conn:
        await conn.execute(
            """
            INSERT INTO follows (follower_id, rkey, subject_id)
            SELECT * FROM unnest($1::bigint[], $2::text[], $3::bigint[])
            ON CONFLICT (follower_id, rkey) DO UPDATE
            SET subject_id = EXCLUDED.subject_id
            """,
            follower_ids, rkeys, subject_ids,
        )
```

from [follower-weight/db.py](https://github.com/zzstoatzz/follower-weight)
languages/python/language/patterns.md

# patterns

class design, decorators, error handling, and other structural patterns.

## private internals

keep implementation details in `_internal/`:

```
src/mypackage/
├── __init__.py      # public API
├── cli.py           # entry point
└── _internal/       # implementation details
    ├── config.py
    ├── operations.py
    └── types.py
```

`__init__.py` re-exports the public interface. users import from the package, not from internals.
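
a sketch of that re-export, borrowing names from later sections (layout hypothetical):

```python
# src/mypackage/__init__.py
from mypackage._internal.operations import batch_create
from mypackage._internal.types import BatchResult

__all__ = ["BatchResult", "batch_create"]
```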

## dataclasses for DTOs

simple data containers that don't need validation:

```python
from dataclasses import dataclass

@dataclass
class BatchResult:
    successful: list[str]
    failed: list[tuple[str, Exception]]

    @property
    def total(self) -> int:
        return len(self.successful) + len(self.failed)
```

lighter than pydantic when you control the data source.

from [pdsx/_internal/batch.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/batch.py)

## base classes with parallel implementations

when you need both sync and async:

```python
import httpx

class _BaseClient:
    def __init__(self, *, token: str | None = None):
        self._token = token or get_settings().token  # get_settings from your config module
        self._headers = {"Authorization": f"Bearer {self._token}"}

class Client(_BaseClient):
    def get(self, url: str) -> dict:
        return httpx.get(url, headers=self._headers).json()

class AsyncClient(_BaseClient):
    async def get(self, url: str) -> dict:
        async with httpx.AsyncClient() as client:
            return (await client.get(url, headers=self._headers)).json()
```

shared logic in the base, divergent implementations in the subclasses.

## fluent interfaces

chainable methods that return `self`:

```python
class Track:
    def __init__(self, source: str):
        self._source = source
        self._effects: list[str] = []

    def volume(self, level: float) -> "Track":
        self._effects.append(f"volume={level}")
        return self

    def lowpass(self, freq: float) -> "Track":
        self._effects.append(f"lowpass=f={freq}")
        return self

    def fade_in(self, seconds: float) -> "Track":
        self._effects.append(f"afade=t=in:d={seconds}")
        return self

# usage
track = Track("song.wav")
track.volume(0.8).lowpass(600).fade_in(0.5)
```

## factory classmethods

alternative constructors:

```python
from __future__ import annotations

from dataclasses import dataclass

@dataclass
class URIParts:
    """parsed components of an AT-URI."""

    repo: str
    collection: str
    rkey: str

    @classmethod
    def from_uri(cls, uri: str, client_did: str | None = None) -> URIParts:
        """parse an AT-URI into its components."""
        uri_without_prefix = uri.replace("at://", "")
        parts = uri_without_prefix.split("/")

        # shorthand format: collection/rkey
        if len(parts) == 2:
            if not client_did:
                raise ValueError("shorthand URI requires authentication")
            return cls(repo=client_did, collection=parts[0], rkey=parts[1])

        # full format: did/collection/rkey
        if len(parts) == 3:
            return cls(repo=parts[0], collection=parts[1], rkey=parts[2])

        raise ValueError(f"invalid URI format: {uri}")
```

from [pdsx/_internal/resolution.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/resolution.py)
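
usage, with hypothetical identifiers:

```python
URIParts.from_uri("at://did:plc:abc/app.bsky.feed.post/3k2a")           # full form
URIParts.from_uri("app.bsky.feed.post/3k2a", client_did="did:plc:abc")  # shorthand
```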

## keyword-only arguments

force callers to name arguments for clarity:

```python
def batch_create(
    client: Client,
    collection: str,
    records: list[dict],
    *,  # everything after is keyword-only
    concurrency: int = 10,
    fail_fast: bool = False,
) -> BatchResult:
    ...

# must use: batch_create(client, "posts", items, concurrency=5)
# not:      batch_create(client, "posts", items, 5)
```

## custom exceptions

with helpful context:

```python
class AuthenticationRequired(Exception):
    def __init__(self, operation: str = "this operation"):
        super().__init__(
            f"{operation} requires authentication.\n\n"
            "Set ATPROTO_HANDLE and ATPROTO_PASSWORD environment variables."
        )
```

the message tells you what to do, not just what went wrong.

from [pdsx/mcp/client.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/client.py)
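
what the caller sees - a sketch:

```python
try:
    raise AuthenticationRequired("deleting records")
except AuthenticationRequired as e:
    print(e)
    # deleting records requires authentication.
    #
    # Set ATPROTO_HANDLE and ATPROTO_PASSWORD environment variables.
```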

## exception hierarchies

for structured error handling:

```python
class AppError(Exception):
    """base for all app errors."""

class ValidationError(AppError):
    """input validation failed."""

class ResourceError(AppError):
    """resource operation failed."""

# callers can catch AppError for all, or specific types
```

## argparse for CLIs

argparse is stdlib. subparsers with aliases give you unix-style commands:

```python
import argparse

parser = argparse.ArgumentParser(
    description="my tool",
    formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument("-r", "--repo", help="target repo")

subparsers = parser.add_subparsers(dest="command")

list_parser = subparsers.add_parser("list", aliases=["ls"])
list_parser.add_argument("collection")
list_parser.add_argument("--limit", type=int, default=50)
```

dispatch with tuple membership to handle aliases:

```python
# inside async_main() below
args = parser.parse_args()

if not args.command:
    parser.print_help()
    return 1

if args.command in ("list", "ls"):
    return await cmd_list(args.collection, args.limit)
```

async entry point pattern:

```python
import asyncio
import sys
from typing import NoReturn

async def async_main() -> int:
    # argparse and dispatch here
    return 0

def main() -> NoReturn:
    sys.exit(asyncio.run(async_main()))
```

source: [pdsx/src/pdsx/cli.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/cli.py)

## module-level singletons

instantiate once, import everywhere:

```python
# config.py
from functools import lru_cache

@lru_cache
def get_settings() -> Settings:
    return Settings()

settings = get_settings()

# console.py
from rich.console import Console

console = Console()
```

sources:
- [pdsx](https://github.com/zzstoatzz/pdsx)
- [plyr-python-client](https://github.com/zzstoatzz/plyr-python-client)
- [docket](https://github.com/chrisguidry/docket)
languages/python/language/typing.md

# typing

notes on type hints as actually used in projects like pdsx and fastmcp.

## unions

use `|` for unions, not `Optional`:

```python
from typing import Any

RecordValue = str | int | float | bool | None | dict[str, Any] | list[Any]
```

from [pdsx/_internal/types.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/types.py)

## TypedDict

for structured dictionaries where you know the shape:

```python
from typing import TypedDict

class RecordResponse(TypedDict):
    """a record returned from list or get operations."""

    uri: str
    cid: str | None
    value: dict

class CredentialsContext(TypedDict):
    """credentials extracted from context or headers."""

    handle: str | None
    password: str | None
    pds_url: str | None
    repo: str | None
```

better than `dict[str, Any]` because the structure is documented and checked.

from [pdsx/mcp/_types.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/_types.py)
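
a quick illustration of what the checker buys you (values hypothetical):

```python
record: RecordResponse = {"uri": "at://...", "cid": None, "value": {}}
record["cide"] = "x"  # type checker: unknown key for RecordResponse
```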

## Annotated

attach metadata to types. useful for documentation and schema generation:

```python
from typing import Annotated

from pydantic import Field

FilterParam = Annotated[
    str | None,
    Field(
        description=(
            "jmespath expression to filter/project the result. "
            "examples: '[*].{uri: uri, text: value.text}' (select fields), "
            "'[?value.text != null]' (filter items), "
            "'[*].uri' (extract values)"
        ),
    ),
]
```

the metadata travels with the type. MCP tools use this for parameter descriptions.

from [pdsx/mcp/filterable.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/filterable.py)

## Protocol

define what methods something needs, not what class it is:

```python
from typing import Protocol

class ContextSamplingFallbackProtocol(Protocol):
    async def __call__(
        self,
        messages: str | list[str | SamplingMessage],
        system_prompt: str | None = None,
        temperature: float | None = None,
        max_tokens: int | None = None,
    ) -> ContentBlock: ...
```

any callable matching this signature satisfies the protocol. no inheritance required.

from [fastmcp/utilities/types.py](https://github.com/jlowin/fastmcp/blob/main/src/fastmcp/utilities/types.py)

## generics

TypeVar for generic functions and classes:

```python
from collections.abc import Awaitable, Callable
from typing import Any, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")

def filterable(
    fn: Callable[P, R] | Callable[P, Awaitable[R]],
) -> Callable[P, Any] | Callable[P, Awaitable[Any]]:
    ...
```

`ParamSpec` captures the full signature (args and kwargs) for decorator typing.

from [pdsx/mcp/filterable.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/filterable.py)

## ParamSpec

for decorators that preserve function signatures:

```python
# excerpt - fn, P, and apply_filter come from the enclosing decorator
@wraps(fn)
async def async_wrapper(
    *args: P.args, _filter: str | None = None, **kwargs: P.kwargs
) -> Any:
    result = await fn(*args, **kwargs)
    return apply_filter(result, _filter)
```

`P.args` and `P.kwargs` carry the original function's parameter types into the wrapper. type checkers see the wrapper with the same signature as the wrapped function (plus any new parameters).

from [pdsx/mcp/filterable.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/filterable.py)

## overload

when return type depends on input type:

```python
from typing import overload

@overload
def filterable(
    fn: Callable[P, Awaitable[R]],
) -> Callable[P, Awaitable[Any]]: ...

@overload
def filterable(
    fn: Callable[P, R],
) -> Callable[P, Any]: ...

def filterable(
    fn: Callable[P, R] | Callable[P, Awaitable[R]],
) -> Callable[P, Any] | Callable[P, Awaitable[Any]]:
    ...
```

type checkers know async functions get async wrappers, sync functions get sync wrappers.

from [pdsx/mcp/filterable.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/mcp/filterable.py)

## TYPE_CHECKING

avoid runtime import costs for types only needed for hints:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from docket import Docket
    from docket.execution import Execution
    from fastmcp.tools.tool_transform import ArgTransform, TransformedTool
```

the import doesn't happen at runtime, only when type checkers analyze the code. with `from __future__ import annotations`, you don't need string quotes.

from [fastmcp/tools/tool.py](https://github.com/jlowin/fastmcp/blob/main/src/fastmcp/tools/tool.py)

## import organization

```python
"""module docstring."""

from __future__ import annotations

import stdlib_module
from typing import TYPE_CHECKING

from third_party import thing

from local_package import helper

if TYPE_CHECKING:
    from expensive import Type
```

sources:
- [pdsx/src/pdsx/_internal/types.py](https://github.com/zzstoatzz/pdsx/blob/main/src/pdsx/_internal/types.py)
- [fastmcp/src/fastmcp/tools/tool.py](https://github.com/jlowin/fastmcp/blob/main/src/fastmcp/tools/tool.py)

+11 languages/ziglang/0.15/README.md
···

[release notes](https://ziglang.org/download/0.15.1/release-notes.html)

## issue tracking

zig moved from github to codeberg. old issues (≤25xxx) are on github, new issues (30xxx+) are on codeberg.org/ziglang/zig.

query via API:
```bash
curl -s "https://codeberg.org/api/v1/repos/ziglang/zig/issues?state=all&q=flate"
```

## notes

- [arraylist](./arraylist.md) - ownership patterns (toOwnedSlice vs deinit)
···
- [build](./build.md) - createModule + imports, hash trick
- [comptime](./comptime.md) - type generation, tuple synthesis, validation
- [concurrency](./concurrency.md) - atomics vs mutex, callback pattern
- [crypto](./crypto.md) - ecdsa paths, signature verification
- [testing](./testing.md) - zig test vs build test, arena for leaky apis

+44 languages/ziglang/0.15/crypto.md
···

# crypto

zig 0.15 crypto paths and patterns.

## ecdsa

the ecdsa types are under `std.crypto.sign.ecdsa`, not `std.crypto.ecdsa`:

```zig
const crypto = std.crypto;

// correct (0.15)
const Secp256k1 = crypto.sign.ecdsa.EcdsaSecp256k1Sha256;
const P256 = crypto.sign.ecdsa.EcdsaP256Sha256;

// wrong - will error "has no member named 'ecdsa'"
// const Secp256k1 = crypto.ecdsa.EcdsaSecp256k1Sha256;
```

## verifying signatures

```zig
const Scheme = std.crypto.sign.ecdsa.EcdsaSecp256k1Sha256;

// signature is r || s (64 bytes for 256-bit curves)
const sig = Scheme.Signature.fromBytes(sig_bytes[0..64].*);

// public key from SEC1 compressed format (33 bytes)
const public_key = try Scheme.PublicKey.fromSec1(key_bytes);

// verify - returns error.SignatureVerificationFailed on mismatch
try sig.verify(message, public_key);
```

## key sizes

| curve | compressed pubkey | uncompressed pubkey | signature (r\|\|s) |
|-------|-------------------|---------------------|--------------------|
| P-256 | 33 bytes | 65 bytes | 64 bytes |
| secp256k1 | 33 bytes | 65 bytes | 64 bytes |

compressed keys start with 0x02 or 0x03. uncompressed start with 0x04.

see: [zat/jwt.zig](https://tangled.sh/@zzstoatzz.io/zat/tree/main/src/internal/jwt.zig)

+15 languages/ziglang/0.15/io.md
···

## when you don't need to flush

the high-level apis handle this for you. `http.Server`'s `request.respond()` flushes internally. `http.Client` flushes when the request completes. you only need manual flushes when working with raw streams or tls directly.

## gzip decompression bug (0.15.x only)

http.Client panics when decompressing certain gzip responses on x86_64-linux. the deflate decompressor sets up a Writer with `unreachableRebase` but can hit a code path that calls `rebase` when the buffer fills.

**workaround:** request an uncompressed response (here `aw` is an allocating writer collecting the body):
```zig
_ = try client.fetch(.{
    .location = .{ .url = url },
    .response_writer = &aw.writer,
    .headers = .{ .accept_encoding = .{ .override = "identity" } },
});
```

fixed in 0.16. see: [zat/xrpc.zig](https://tangled.sh/zzstoatzz.io/zat/tree/main/src/internal/xrpc.zig#L88)

+16 languages/ziglang/0.15/testing.md
···

# testing

`zig build test` compiles tests but may not run them if your build.zig doesn't wire it up correctly. to actually run tests:

```bash
zig test src/foo.zig                          # runs tests in foo.zig and its imports
zig test src/foo.zig --test-filter "parse"    # only tests matching "parse"
```

the testing allocator catches leaks. if you use `parseFromValueLeaky` or similar "leaky" apis, wrap in an arena:

```zig
var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
defer arena.deinit();
const result = try leakyFunction(arena.allocator(), input);
```

+2 -1 languages/ziglang/README.md
···

## topics

- [0.15](./0.15/) - version-specific patterns (i/o, arraylist, crypto, testing)
- [build](./build/) - build system patterns from large projects

## sources
···
| [find-bufo](https://tangled.sh/@zzstoatzz.io/find-bufo) | bluesky bot |
| [leaflet-search](https://tangled.sh/@zzstoatzz.io/leaflet-search) | fts search backend |
| [zql](https://tangled.sh/@zzstoatzz.io/zql) | comptime sql parsing |
| [zat](https://tangled.sh/@zzstoatzz.io/zat) | atproto primitives (jwt, crypto) |
| [ghostty](https://github.com/ghostty-org/ghostty) | terminal emulator (build system) |
| [bun](https://github.com/oven-sh/bun) | javascript runtime (build system) |

+10 languages/ziglang/build/dependencies.md
···

fails gracefully when pkg-config isn't available.

## tangled packages

tangled.sh hosts zig packages. fetch without `.tar.gz` extension:

```bash
zig fetch --save https://tangled.sh/zzstoatzz.io/zat/archive/main
```

`zig fetch` checks Content-Type headers to determine archive format - tangled handles this server-side.

sources:
- [ghostty/build.zig.zon](https://github.com/ghostty-org/ghostty/blob/main/build.zig.zon)
- [ghostty/pkg/](https://github.com/ghostty-org/ghostty/tree/main/pkg)

+81 protocols/MCP/README.md
···

# mcp

the model context protocol. an open standard for connecting ai models (hosts) to external systems (servers) via structured tools, resources, and prompts. it acts as a "usb-c port for ai."

## architecture

mcp defines a client-server relationship:

- **host**: the ai application (e.g., claude code, vscode) that coordinates and manages mcp clients.
- **client**: maintains a dedicated connection to an mcp server and obtains context from it for the host. a host can have multiple clients.
- **server**: a program that provides context (tools, resources, prompts) to mcp clients. servers can run locally (stdio) or remotely (http/sse).

```
┌─────────────┐       ┌──────────────┐
│  MCP Host   │       │  MCP Server  │
│ (LLM Client)│───────│ (Tools, Data)│
└──────┬──────┘       └──────────────┘
       │                     ▲
       │ request/response    │
       │                     │
       │ context, actions    │
       ▼                     │
┌─────────────┐       ┌──────────────┐
│ MCP Client  │───────│   External   │
│ (Per Server)│       │    System    │
└─────────────┘       └──────────────┘
```

## primitives

mcp servers expose three core primitives:

### tools
executable functions that the host (via the llm) can invoke.
- define actions an ai can take.
- typically correspond to python functions with type hints and docstrings.
- examples: `add_event_to_calendar(title: str, date: str)`, `search_docs(query: str)`.

### resources
read-only data sources exposed to the host.
- content is addressed by a uri (e.g., `config://app/settings.json`, `github://repo/readme.md`).
- can be structured (json) or unstructured (text, binary).
- examples: application configuration, documentation, database entries.

### prompts
reusable templates for interaction.
- define common interactions or workflows.
- can guide the llm in complex tasks.
- examples: `summarize_document(document: str)`, `generate_report(data: dict)`.

## transport

mcp supports flexible transport mechanisms:
- **stdio**: standard input/output. efficient for local, co-located processes.
- **streamable http**: for remote servers. uses http post for client messages and server-sent events (sse) for streaming responses. supports standard http auth.

## applications & patterns

### plyr.fm mcp server
an mcp server that exposes a music library (plyr.fm) to llm clients.
- **purpose**: allows llms to query track information, search the library, and get user-specific data (e.g., liked tracks).
- **design**: primarily **read-only** tools (e.g., `list_tracks`, `get_track`, `search`). mutations are handled by a separate cli.
- **source**: [zzstoatzz/plyr-python-client](https://github.com/zzstoatzz/plyr-python-client/tree/main/packages/plyrfm-mcp)

### prefect mcp server
an mcp server for interacting with prefect, a workflow orchestration system.
- **purpose**: enables llms to monitor and manage prefect workflows.
- **design**: exposes monitoring tools (read-only) and provides guidance for **mutations** via the prefect cli.
- **pattern**: emphasizes "agent-friendly usage" of the prefect cli, including `--no-prompt` and `prefect api` for json output, to facilitate programmatic interaction by llms.
- **source**: [prefecthq/prefect-mcp-server](https://github.com/PrefectHQ/prefect-mcp-server)

## ecosystem

- [fastmcp](./fastmcp.md) - pythonic server framework
- [pdsx](https://github.com/zzstoatzz/pdsx) - mcp server for atproto
- [inspector](https://github.com/modelcontextprotocol/inspector) - web-based debugger for mcp servers

## sources

- [modelcontextprotocol.io](https://modelcontextprotocol.io) - official documentation
- [jlowin/fastmcp](https://github.com/jlowin/fastmcp) - the fastmcp python library

+79 protocols/MCP/fastmcp.md
···

# fastmcp

a high-level python framework for building mcp servers. it aims to simplify development by abstracting protocol details, emphasizing developer experience and type safety.

## philosophy

fastmcp is designed to be a fast, simple, and complete solution for mcp development. unlike a low-level sdk, it provides a "pythonic" and declarative way to define mcp servers, handling protocol intricacies automatically. it leverages python's type hints and docstrings to generate mcp schemas for tools, resources, and prompts.

## basic usage

define your mcp server and expose functionality using decorators:

```python
from fastmcp import FastMCP, Context

# Initialize the MCP server
mcp = FastMCP("my-awesome-server")

@mcp.tool
def add_numbers(a: int, b: int) -> int:
    """Adds two numbers together.

    :param a: The first number.
    :param b: The second number.
    :returns: The sum of the two numbers.
    """
    return a + b

@mcp.resource("config://app/settings")
def get_app_settings() -> dict:
    """Retrieves the current application settings."""
    return {"debug_mode": True, "log_level": "INFO"}

@mcp.prompt
def create_summary_prompt(text: str) -> str:
    """Generates a prompt to summarize the given text."""
    return f"please summarize the following document:\n\n{text}"

@mcp.tool
async def fetch_external_data(ctx: Context, url: str) -> str:
    """Fetches data from an external URL.

    :param ctx: The MCP context for logging.
    :param url: The URL to fetch.
    :returns: The content of the URL as a string.
    """
    await ctx.info(f"fetching data from {url}")
    # In a real scenario, this would use an async http client
    return f"content from {url}"

# Run the server
if __name__ == "__main__":
    mcp.run()  # stdio by default; pass a transport argument to serve over http
```

## key features & patterns

### decorator-based definitions
tools, resources, and prompts are defined using plain python functions decorated with `@mcp.tool`, `@mcp.resource`, and `@mcp.prompt`. fastmcp infers schemas from type hints and docstrings.

### context injection
the `Context` object can be injected into tools, resources, or prompts, providing access to mcp session capabilities like logging (`await ctx.info(...)`), progress updates (`await ctx.report_progress(...)`), and llm sampling requests (`await ctx.sample(...)`).

### flexible transports
fastmcp abstracts the underlying transport. servers can run over stdio (for local integration) or streamable http (for remote deployments) with minimal code changes.

### server composition & proxying
fastmcp supports advanced patterns like combining multiple mcp servers into a single endpoint or proxying requests to other mcp servers.

### authentication
built-in support for various authentication mechanisms, including oauth providers and custom api keys.

### client library
provides a `fastmcp.Client` for programmatic interaction with any mcp server, supporting diverse transports and server-initiated llm sampling.
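
a hedged sketch of the client side - `Client` infers the transport from its target (a server script path here); `list_tools`/`call_tool` follow fastmcp's documented client api, but treat exact result shapes as assumptions:

```python
import asyncio

from fastmcp import Client


async def main() -> None:
    # spawns server.py as a subprocess and talks to it over stdio
    async with Client("server.py") as client:
        tools = await client.list_tools()
        print([tool.name for tool in tools])

        result = await client.call_tool("add_numbers", {"a": 1, "b": 2})
        print(result)


asyncio.run(main())
```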

## sources

- [github.com/jlowin/fastmcp](https://github.com/jlowin/fastmcp) - fastmcp source code and documentation
- [pypi.org/project/fastmcp](https://pypi.org/project/fastmcp/) - fastmcp package on pypi

+8 protocols/README.md

+78 protocols/atproto/README.md
···

# atproto

the AT Protocol. an open protocol for decentralized social applications.

## philosophy

**atmospheric computing**: a paradigm of "connected clouds." if traditional servers are "the cloud" (centralized, closed), the AT Protocol creates an "atmosphere" where millions of personal clouds float and interoperate.

- **sovereign**: users run their own "personal cloud" (PDS) and own their identity.
- **connected**: services aggregate data from these personal clouds to build shared experiences.
- **open**: applications compete on service quality, not by locking away data.

## architecture

```
┌─────────┐     ┌─────────┐     ┌─────────┐
│   PDS   │     │   PDS   │     │   PDS   │   user repos
└────┬────┘     └────┬────┘     └────┬────┘
     │               │               │
     └───────────────┼───────────────┘
                     │ firehose
                     ▼
               ┌───────────┐
               │   Relay   │   aggregates events
               └─────┬─────┘
                     │ firehose
       ┌─────────────┼─────────────┐
       ▼             ▼             ▼
 ┌─────────┐   ┌───────────┐   ┌─────────┐
 │ AppView │   │   Feed    │   │ Labeler │   consume & process
 │         │   │ Generator │   │         │
 └────┬────┘   └───────────┘   └─────────┘
      │
      │ read/write
      ▼
 ┌─────────┐
 │   PDS   │   back to user repos
 └─────────┘
```

### components

**PDS (Personal Data Server)** is your "personal cloud". it hosts your account, stores your data repo (a signed merkle tree), handles auth, and emits changes. users can migrate their PDS without losing identity or data.

**Relay** aggregates firehose streams from many PDSes into one. an optimization - downstream services subscribe to one relay instead of thousands of PDSes. multiple relays can exist; anyone can run one.

**AppView** indexes firehose data into queryable databases. serves the application UI - timelines, search, notifications. also proxies writes back to user PDSes. this is what most people think of as "the app."

**Feed Generator** subscribes to the firehose, applies custom selection logic, and returns post URIs on request. enables algorithmic choice - the "For You" feed on bluesky runs on someone's gaming PC.

**Labeler** produces signed metadata about content (spam, nsfw, etc). appviews and clients subscribe to labelers they trust. enables moderation without centralized control.

### data flow

1. user creates a record (post, like, follow) via client
2. client sends write to appview, which forwards to user's PDS
3. PDS commits record to repo, emits event to firehose
4. relay aggregates, downstream services consume
5. appviews update their indices, labelers apply labels
6. next time someone requests that content, appview serves from index

## contents

- [identity](./identity.md) - DIDs, handles, resolution
- [data](./data.md) - repos, records, collections, references
- [lexicons](./lexicons.md) - schema language, namespaces
- [firehose](./firehose.md) - event streaming, jetstream
- [auth](./auth.md) - OAuth, scopes, permission sets
- [labels](./labels.md) - moderation, signed assertions

## sources

- [atproto.com](https://atproto.com) - official documentation
- [atmospheric computing](https://www.pfrazee.com/blog/atmospheric-computing) - paul frazee on the "connected clouds" paradigm
- [introduction to atproto](https://mackuba.eu/2025/08/20/introduction-to-atproto/) - mackuba
- [federation architecture](https://bsky.social/about/blog/5-5-2023-federation-architecture) - bluesky
- [plyr.fm](https://github.com/zzstoatzz/plyr.fm) - music streaming on atproto
- [pdsx](https://github.com/zzstoatzz/pdsx) - atproto CLI/MCP

+109 protocols/atproto/auth.md
···

# auth

atproto uses OAuth 2.0 for application authorization.

## the flow

1. user visits application
2. application redirects to user's PDS for authorization
3. user approves requested scopes
4. PDS redirects back with authorization code
5. application exchanges code for tokens
6. application uses tokens to act on user's behalf

standard OAuth, but the authorization server is the user's PDS, not a central service.

## scopes

scopes define what an application can do:

```
atproto               # full access (legacy)
repo:fm.plyr.track    # read/write fm.plyr.track collection
repo:fm.plyr.like     # read/write fm.plyr.like collection
repo:read             # read-only access to repo
```

granular scopes let users grant minimal permissions. an app that only needs to read your profile shouldn't have write access to your posts.

## permission sets

listing individual scopes is noisy. permission sets bundle them under human-readable names:

```
include:fm.plyr.authFullApp   # "plyr.fm Music Library"
```

instead of seeing `fm.plyr.track, fm.plyr.like, fm.plyr.comment, ...`, users see a single permission with a description.

permission sets are lexicons published to `com.atproto.lexicon.schema` on your authority repo.

from [plyr.fm permission sets](https://github.com/zzstoatzz/plyr.fm/blob/main/docs/lexicons/overview.md#permission-sets)

## session management

tokens expire. applications need refresh logic:

```python
class SessionManager:
    def __init__(self, session_path: Path):
        self.session_path = session_path
        self._client: AsyncClient | None = None

    async def get_client(self) -> AsyncClient:
        if self._client:
            return self._client

        # try loading saved session
        if self.session_path.exists():
            session_str = self.session_path.read_text()
            self._client = AsyncClient()
            await self._client.login(session_string=session_str)
            self._client.on_session_change(self._save_session)
            return self._client

        # fall back to fresh login (handle/password come from config)
        self._client = AsyncClient()
        await self._client.login(handle, password)
        self._save_session(None, None)
        return self._client

    def _save_session(self, event, session):
        self.session_path.write_text(self._client.export_session_string())
```

from [bot](https://github.com/zzstoatzz/bot) - persists sessions to disk, refreshes automatically.

## per-request credentials

for multi-tenant applications (one backend serving many users), credentials come per-request:

```
# middleware extracts from headers
x-atproto-handle: user.handle
x-atproto-password: app-password

# or from OAuth session
authorization: Bearer <token>
```

from [pdsx MCP server](https://github.com/zzstoatzz/pdsx) - accepts credentials via HTTP headers for multi-tenant deployment.

## app passwords

for bots and automated tools, app passwords are simpler than full OAuth:

1. user creates app password in their PDS settings
2. bot uses handle + app password to authenticate
3. no redirect flow needed

app passwords have full account access. use OAuth with scopes when you need granular permissions.
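
a minimal bot login sketch, assuming the `atproto` sdk package and its `AsyncClient.login(handle, app_password)` call shape:

```python
import asyncio

from atproto import AsyncClient  # assumed dependency


async def main() -> None:
    client = AsyncClient()
    # handle + app password created in the PDS settings ui
    profile = await client.login("bot.example.com", "xxxx-xxxx-xxxx-xxxx")
    print(profile.handle)


asyncio.run(main())
```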

## why this matters

OAuth at the protocol level means:

- users authorize apps, not the other way around
- applications can't lock in users by controlling auth
- the same identity works across all atmospheric applications
- granular scopes enable minimal-permission applications

+111 protocols/atproto/data.md
···

# data

atproto's data model: each user is a signed database.

## repos

a repository is a user's data store. it contains all their records - posts, likes, follows, whatever the applications define.

repos are merkle trees. every commit is signed by the user's key and can be verified by anyone. this is what enables authenticated data gossip - you don't need to trust the messenger, you verify the signature.

## records

records are JSON documents stored in collections:

```
at://did:plc:xyz/app.bsky.feed.post/3jui7akfj2k2a
     └─ DID ───┘ └── collection ──┘ └── rkey ───┘
```

- **DID**: whose repo
- **collection**: the record type (lexicon NSID)
- **rkey**: record key within the collection

record keys are typically TIDs (timestamp-based IDs) for records where users have many (posts, likes). for singletons like profiles, the literal `self` is used.

## AT-URIs

the `at://` URI scheme identifies records:

```
at://did:plc:xyz/fm.plyr.track/3jui7akfj2k2a
at://zzstoatzz.io/app.bsky.feed.post/3jui7akfj2k2a   # handle also works
```

these are stable references. the URI uniquely identifies a record across the network.
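
the structure is regular enough to split mechanically. a hypothetical helper (not from the projects above):

```python
def parse_at_uri(uri: str) -> tuple[str, str, str]:
    """split an at:// URI into (authority, collection, rkey).

    the authority is a DID or handle; collection is a lexicon NSID.
    """
    if not uri.startswith("at://"):
        raise ValueError(f"not an at:// URI: {uri}")
    authority, collection, rkey = uri.removeprefix("at://").split("/")
    return authority, collection, rkey


assert parse_at_uri("at://did:plc:xyz/fm.plyr.track/3jui7akfj2k2a") == (
    "did:plc:xyz",
    "fm.plyr.track",
    "3jui7akfj2k2a",
)
```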

## CIDs

a CID (Content Identifier) is a hash of a specific version of a record:

```
bafyreig2fjxi3qbp5jvyqx2i4djxfkp...
```

URIs identify *what*, CIDs identify *which version*. when you reference another record and care about the exact content, you include both.

## strongRef

the standard pattern for cross-record references:

```json
{
  "subject": {
    "uri": "at://did:plc:xyz/fm.plyr.track/abc123",
    "cid": "bafyreig..."
  }
}
```

used in likes (referencing tracks), comments (referencing tracks), lists (referencing any records). the CID proves you're referencing a specific version.

from [plyr.fm lexicons](https://github.com/zzstoatzz/plyr.fm/tree/main/lexicons) - likes, comments, and lists all use strongRef.

## collections

records are grouped into collections by type:

```
repo/
├── app.bsky.feed.post/
│   ├── 3jui7akfj2k2a
│   └── 3jui8bklg3l3b
├── app.bsky.feed.like/
│   └── ...
└── fm.plyr.track/
    └── ...
```

each collection corresponds to a lexicon. applications read and write to collections they understand.

## local indexing

querying across PDSes is slow. applications maintain local indexes:

```sql
-- plyr.fm indexes fm.plyr.track records
CREATE TABLE tracks (
    id SERIAL PRIMARY KEY,
    did TEXT NOT NULL,
    rkey TEXT NOT NULL,
    uri TEXT NOT NULL,
    cid TEXT,
    title TEXT NOT NULL,
    artist TEXT NOT NULL,
    -- ... application-specific fields
    UNIQUE(did, rkey)
);
```

when users log in, sync their records from PDS to local database. background jobs keep indexes fresh.

from [plyr.fm](https://github.com/zzstoatzz/plyr.fm) - indexes tracks, likes, comments, playlists locally.

## why this matters

the "each user is one database" model is the foundation of **atmospheric computing**:

- **portability**: your "personal cloud" is yours. if a host fails, you move your data elsewhere.
- **verification**: trust is cryptographic. you verify the data signature, not the provider.
- **aggregation**: applications weave together data from millions of personal clouds into a cohesive "atmosphere."
- **interop**: apps share schemas, so my music player can read your social graph.

+163 protocols/atproto/firehose.md
···

# firehose

the firehose is atproto's event stream. applications subscribe to it to build aggregated views of network activity.

## the CDC model

each PDS produces a CDC (Change Data Capture) log of commits. when a user creates, updates, or deletes a record, the PDS emits a signed event. applications consume these events and update their local databases.

```
User commits record → PDS emits event → Applications consume → Local DB updated
```

this is the "sharded by user, aggregated by app" model. users have strict serial ordering within their repos. applications see causal ordering across users.

## firehose vs jetstream

two ways to consume network events:

### firehose (raw)

the protocol-level stream. binary format (CBOR), includes full cryptographic proofs.

```
com.atproto.sync.subscribeRepos
```

- full merkle proofs for verification
- CAR file blocks for data
- higher bandwidth, more complex parsing
- required for archival, moderation decisions, anything needing authenticity guarantees

### jetstream

a simplified relay. JSON format, filtering support.

```
wss://jetstream2.us-east.bsky.network/subscribe
```

- JSON encoding (easier to parse)
- filter by collection or DID
- compressed, lower bandwidth
- no cryptographic proofs - data isn't self-authenticating

use jetstream for:
- prototypes and experiments
- bots and interactive tools
- applications where you trust the relay

use firehose for:
- archival systems
- moderation services
- anything requiring proof of authenticity

from [music-atmosphere-feed](https://github.com/zzstoatzz/music-atmosphere-feed) - uses jetstream to filter posts containing music links.
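
a minimal jetstream consumer sketch. the endpoint and `wantedCollections` filter are real; the event field names (`kind`, `commit`, `did`) reflect jetstream's JSON shape but should be verified before relying on them. assumes the `websockets` package:

```python
import asyncio
import json

import websockets  # assumed dependency

URL = (
    "wss://jetstream2.us-east.bsky.network/subscribe"
    "?wantedCollections=app.bsky.feed.post"
)


async def main() -> None:
    async with websockets.connect(URL) as ws:
        async for raw in ws:
            event = json.loads(raw)
            # commit events carry the record inline - no CAR parsing needed
            if event.get("kind") == "commit":
                commit = event["commit"]
                print(event["did"], commit["collection"], commit["operation"])


asyncio.run(main())
```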

## event structure

firehose events contain:

```
repo: did:plc:xyz             # whose repo
rev: 3jui7akfj2k2a            # commit revision
seq: 12345                    # sequence number
time: 2024-01-01T00:00:00Z    # timestamp
ops: [                        # operations in this commit
  {
    action: "create",         # create, update, delete
    path: "app.bsky.feed.post/abc123",
    cid: bafyrei...
  }
]
blocks: <CAR data>            # actual record content
```

jetstream simplifies this to JSON with the record content inline.

## consuming events

pattern from plyr.fm and follower-weight (`firehose.subscribe()` stands in for whichever client library you use):

```python
async def consume_firehose():
    async for event in firehose.subscribe():
        for op in event.ops:
            if op.collection == "fm.plyr.track":
                if op.action == "create":
                    await index_track(event.repo, op.rkey, op.record)
                elif op.action == "delete":
                    await remove_track(event.repo, op.rkey)
```

## batch processing

for high-volume consumption, batch by operation type and flush on size OR time:

```python
BATCH_SIZE = 1000
FLUSH_INTERVAL = 2.0

follow_batch: list[tuple[str, str, str]] = []  # (follower_did, rkey, subject_did)
unfollow_batch: list[tuple[str, str]] = []     # (follower_did, rkey)
pending_acks: set[int] = set()
last_flush = time.monotonic()

async def flush(ws):
    if follow_batch:
        await db.batch_upsert_follows(follow_batch)
        follow_batch.clear()
    if unfollow_batch:
        await db.batch_delete_follows(unfollow_batch)
        unfollow_batch.clear()
    # ack AFTER persist
    for ack_id in sorted(pending_acks):
        await ws.send(json.dumps({"type": "ack", "id": ack_id}))
    pending_acks.clear()

while True:
    try:
        msg = await asyncio.wait_for(ws.recv(), timeout=0.1)
    except asyncio.TimeoutError:
        # time-based flush for low-volume periods
        if time.monotonic() - last_flush > FLUSH_INTERVAL:
            await flush(ws)
            last_flush = time.monotonic()
        continue

    event = json.loads(msg)
    # ... parse and append to appropriate batch ...
    pending_acks.add(event["id"])

    if len(follow_batch) >= BATCH_SIZE:
        await flush(ws)
        last_flush = time.monotonic()
```

critical: ack cursor only after successful persistence. if you crash, you replay from last ack.

from [follower-weight/tap_consumer.py](https://github.com/zzstoatzz/follower-weight)

## cursor management

firehose supports resumption via cursor (sequence number):

```python
# resume from where we left off
cursor = await db.get_last_cursor()
async for event in firehose.subscribe(cursor=cursor):
    # process...
    await db.save_cursor(event.seq)
```

store cursor persistently. on restart, resume from stored position.

## why this matters

the firehose enables "cooperative computing":

- third parties can build first-party experiences (feeds, search, analytics)
- no API rate limits or access restrictions on public data
- applications compete on what they build, not what data they have access to

the For You algorithm on bluesky runs on someone's gaming PC, consuming the same firehose as bluesky itself.

+79 protocols/atproto/identity.md
···

# identity

identity in atproto separates "who you are" from "where you're hosted."

## DIDs

a DID (Decentralized Identifier) is your permanent identity. it looks like:

```
did:plc:xbtmt2zjwlrfegqvch7fboei
```

the DID never changes, even if you move to a different PDS. this is what makes account migration possible - your identity isn't tied to your host.

atproto primarily uses `did:plc`, where the PLC Directory (`plc.directory`) maintains a mapping from DIDs to their current metadata: signing keys, PDS location, and associated handles.

`did:web` is also supported, using DNS as the resolution mechanism. this gives you full control but requires maintaining infrastructure.

## handles

a handle is the human-readable name:

```
zzstoatzz.io
pfrazee.com
```

handles are DNS-based. you prove ownership by either:
- adding a DNS TXT record at `_atproto.yourdomain.com`
- serving a file at `/.well-known/atproto-did`

handles can change. they're aliases to DIDs, not identities themselves. if you lose a domain, you lose the handle but keep your DID and all your data.

## resolution

to find a user:

1. resolve handle → DID (via DNS or well-known)
2. resolve DID → DID document (via PLC directory)
3. DID document contains PDS endpoint
4. query PDS for data

```python
# simplified resolution flow (resolve_handle / resolve_did are placeholders)
handle = "zzstoatzz.io"
did = resolve_handle(handle)   # → did:plc:...
doc = resolve_did(did)         # → {service: [...], alsoKnownAs: [...]}
pds_url = doc["service"][0]["serviceEndpoint"]
```
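
a concrete version of the same flow, assuming `httpx` and using only the well-known method (the DNS TXT path is omitted). the `#atproto_pds` service id is how PLC DID documents tag the PDS entry:

```python
import asyncio

import httpx


async def resolve_pds(handle: str) -> str:
    async with httpx.AsyncClient() as http:
        # handle → DID via the well-known endpoint on the handle's domain
        r = await http.get(f"https://{handle}/.well-known/atproto-did")
        r.raise_for_status()
        did = r.text.strip()  # e.g. did:plc:...

        # DID → DID document via the PLC directory
        doc = (await http.get(f"https://plc.directory/{did}")).json()
        for svc in doc["service"]:
            if svc["id"] == "#atproto_pds":
                return svc["serviceEndpoint"]
    raise LookupError(f"no PDS endpoint in DID document for {handle}")


print(asyncio.run(resolve_pds("zzstoatzz.io")))
```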

## caching

DID resolution is expensive (HTTP calls to PLC directory). cache aggressively:

```python
_did_cache: dict[str, tuple[str, float]] = {}
DID_CACHE_TTL = 3600  # 1 hour

async def get_did(handle: str) -> str:
    if handle in _did_cache:
        did, ts = _did_cache[handle]
        if time.time() - ts < DID_CACHE_TTL:
            return did
    did = await resolve_handle(handle)
    _did_cache[handle] = (did, time.time())
    return did
```

from [at-me](https://github.com/zzstoatzz/at-me) - caches DID resolutions with 1-hour TTL.

## why this matters

the separation of identity (DID) from location (PDS) and presentation (handle) is what enables the "connected clouds" model. you can:

- switch PDS providers without losing followers
- use your own domain as your identity
- maintain identity even if banned from specific applications

your identity is yours. hosting is a service you can change.

+128 protocols/atproto/labels.md
···

# labels

labels are signed assertions about content. they're how applications do moderation without affecting the underlying data.

## the key distinction

remember the two roles:

- **PDS**: hosts accounts, affects all applications
- **Application**: consumes data, affects only itself

if a PDS takes down your account, you're gone from all applications until you migrate. this is the nuclear option - reserved for illegal content and network abuse.

labels are the application-level mechanism. when bluesky labels your content, it affects bluesky. leaflet can ignore those labels entirely.

from [update on protocol moderation](https://leaflet.pub/pfrazee.com/3lgy73zy4bc2a) - paul frazee

## what labels are

labels are metadata objects, not repository records. they don't live in anyone's repo. a labeler service produces them and serves them via XRPC.

```json
{
  "ver": 1,
  "src": "did:plc:labeler-did",
  "uri": "at://did:plc:xyz/fm.plyr.track/abc123",
  "cid": "bafyreig...",
  "val": "copyright-violation",
  "neg": false,
  "cts": "2025-11-30T12:00:00.000Z",
  "sig": "<base64-signature>"
}
```

- **src**: who made this assertion (labeler DID)
- **uri**: what content it's about
- **val**: the label value
- **neg**: true if this negates a previous label
- **sig**: cryptographic signature

## signed assertions

labels are signed using DAG-CBOR + secp256k1 (same as repo commits). anyone can verify a label came from the claimed labeler by checking the signature against the labeler's public key in their DID document.

this enables trust decisions: you can choose which labelers you trust and how to interpret their labels.

## labeler services

a labeler is a service that:

1. analyzes content (automated or manual review)
2. produces signed labels
3. serves labels via `com.atproto.label.queryLabels`

```
POST /emit-label
{
  "uri": "at://did:plc:xyz/fm.plyr.track/abc123",
  "val": "copyright-violation",
  "cid": "bafyreig..."
}
```

from [plyr.fm moderation service](https://github.com/zzstoatzz/plyr.fm/blob/main/docs/moderation/atproto-labeler.md) - runs copyright detection, emits labels for flagged tracks.

## stackable moderation

multiple labelers can label the same content. applications choose:

- which labelers to subscribe to
- how to interpret each label value
- what action to take (hide, warn, ignore)

this is "stackable moderation" - layers of independent assertions that clients compose into a moderation policy.

## negation

to revoke a label, emit the same label with `neg: true`:

```json
{
  "uri": "at://did:plc:xyz/fm.plyr.track/abc123",
  "val": "copyright-violation",
  "neg": true
}
```

use cases:
- false positive resolved after review
- artist provided licensing proof
- DMCA counter-notice accepted

## label values

common patterns:

| val | meaning |
|-----|---------|
| `!takedown` | strong: hide from view |
| `!warn` | show warning before content |
| `copyright-violation` | potential copyright issue |
| `explicit` | adult content |
| `spam` | suspected spam |

applications define how to handle each value. `!takedown` conventionally means "don't show this" but applications make that choice.

## querying labels

```
GET /xrpc/com.atproto.label.queryLabels?uriPatterns=at://did:plc:*

{
  "cursor": "456",
  "labels": [...]
}
```

applications can query labels for content they're about to display and apply their moderation policy.
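
the same query from python, assuming `httpx`. the labeler host here is hypothetical; the endpoint and response shape come from the XRPC call above:

```python
import httpx

LABELER = "https://labeler.example.com"  # hypothetical labeler host

resp = httpx.get(
    f"{LABELER}/xrpc/com.atproto.label.queryLabels",
    params={"uriPatterns": "at://did:plc:xyz/fm.plyr.track/*"},
)
resp.raise_for_status()
for label in resp.json()["labels"]:
    # neg=true entries negate earlier assertions with the same val
    print(label["val"], label.get("neg", False))
```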

## why this matters

labels separate moderation concerns:

- **PDS operators** handle illegal content and network abuse
- **Applications** handle policy violations for their context
- **Users** can choose which labelers to trust

no single entity controls moderation for the entire network. applications compete on moderation quality. users can route around overzealous or insufficient moderation by choosing different apps or labelers.

+122 protocols/atproto/lexicons.md
···

# lexicons

lexicons are atproto's schema system. they define what records look like and what APIs accept.

## NSIDs

a Namespace ID identifies a lexicon:

```
fm.plyr.track
app.bsky.feed.post
com.atproto.repo.createRecord
```

format is reverse-DNS. the domain owner controls that namespace. this prevents collisions and makes ownership clear.

## defining a lexicon

```json
{
  "lexicon": 1,
  "id": "fm.plyr.track",
  "defs": {
    "main": {
      "type": "record",
      "key": "tid",
      "record": {
        "type": "object",
        "required": ["title", "artist", "audioUrl", "createdAt"],
        "properties": {
          "title": {"type": "string"},
          "artist": {"type": "string"},
          "audioUrl": {"type": "string", "format": "uri"},
          "album": {"type": "string"},
          "duration": {"type": "integer"},
          "createdAt": {"type": "string", "format": "datetime"}
        }
      }
    }
  }
}
```

from [plyr.fm/lexicons/track.json](https://github.com/zzstoatzz/plyr.fm/blob/main/lexicons/track.json)
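
a record conforming to that schema, as it would sit in a repo (values illustrative; records carry their lexicon NSID as `$type`):

```python
# a record instance for the fm.plyr.track lexicon above
track_record = {
    "$type": "fm.plyr.track",
    "title": "night drive",
    "artist": "someone",
    "audioUrl": "https://example.com/audio/night-drive.mp3",
    "duration": 212,  # optional field
    "createdAt": "2025-01-01T00:00:00Z",
}
```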

## record keys

- **tid**: timestamp-based ID. for records where users have many (tracks, likes, posts).
- **literal:self**: singleton. for records where users have one (profile).

```json
"key": "tid"           // generates 3jui7akfj2k2a
"key": "literal:self"  // always "self"
```

## knownValues

extensible enums. the schema declares known values but validators won't reject unknown ones:

```json
"listType": {
  "type": "string",
  "knownValues": ["album", "playlist", "liked"]
}
```

this allows schemas to evolve without breaking existing records. new values can be added; old clients just won't recognize them.

from [plyr.fm list lexicon](https://github.com/zzstoatzz/plyr.fm/blob/main/lexicons/list.json)

## namespace discipline

plyr.fm uses environment-aware namespaces:

| environment | namespace |
|-------------|-----------|
| production | `fm.plyr` |
| staging | `fm.plyr.stg` |
| development | `fm.plyr.dev` |

never hardcode namespaces. configure via settings so dev/staging don't pollute production data.
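
one way to wire that up, sketched with pydantic-settings - the env var prefix and default are assumptions, not plyr.fm's actual config:

```python
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # set PLYR_NAMESPACE=fm.plyr.dev locally, fm.plyr.stg in staging, etc.
    namespace: str = "fm.plyr"

    model_config = {"env_prefix": "PLYR_"}


settings = Settings()
track_collection = f"{settings.namespace}.track"  # e.g. fm.plyr.dev.track
```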

important: don't reuse another app's lexicons even for similar concepts. plyr.fm defines `fm.plyr.like` rather than using `app.bsky.feed.like`. this maintains namespace isolation and avoids coupling to another app's schema evolution.

## shared lexicons

for true interoperability, multiple apps can agree on a common schema:

```
audio.ooo.track   # shared schema for audio content
```

plyr.fm writes to `audio.ooo.track` (production) so other audio apps can read the same records. this follows the pattern at [standard.site](https://standard.site).

benefits:
- one schema for discovery, any app can read it
- content is portable - tracks live in your PDS, playable anywhere
- platform-specific features live as extensions, not forks

from [plyr.fm shared audio lexicon research](https://github.com/zzstoatzz/plyr.fm/blob/main/docs/research/2026-01-03-shared-audio-lexicon.md)

## schema evolution

atproto schemas can only:
- add optional fields
- add new knownValues

you cannot:
- remove fields
- change required fields
- change field types

plan schemas carefully. once published, breaking changes aren't possible.

## why this matters

lexicons enable the "cooperative computing" model:

- apps agree on schemas → they can read each other's data
- namespace ownership → no collisions, clear responsibility
- extensibility → schemas evolve without breaking
- shared lexicons → true cross-app interoperability