pydantic#
pydantic is a library, not the language. but it's become foundational enough that it's worth understanding.
the core idea#
python's type hints don't do anything at runtime by default. def foo(x: int) accepts strings, floats, whatever - the annotation is just documentation.
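a minimal sketch of that claim, using only the standard library. the function body here (doubling the argument) is made up for illustration; the point is that the `int` annotation is stored on the function but never checked when it is called:

```python
def foo(x: int) -> int:
    # the annotation is metadata only; nothing checks it at call time
    return x * 2


as_int = foo(3)        # the intended use
as_str = foo("ab")     # also runs: str * 2 repeats the string
as_float = foo(1.5)    # also runs: float * 2

print(as_int, as_str, as_float)
print(foo.__annotations__)  # the hints are stored, not enforced
```

static checkers such as mypy or pyright would flag the last two calls, but the interpreter itself does not.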
pydantic makes them real. define a model with type hints, and pydantic validates and coerces data to match:
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

user = User(name="alice", age="25")  # age coerced to int
user = User(name="alice", age="not a number")  # raises ValidationError
this is why pydantic shows up everywhere in python - it bridges the gap between python's dynamic runtime and the desire for validated, typed data.
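a hedged sketch of what handling the failure can look like, reusing the `User` model from above. it assumes pydantic is installed; the only calls used are `ValidationError` and its `errors()` method, which returns one dict per failing field:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# valid input: the string "25" is coerced to the int 25
user = User(name="alice", age="25")
print(type(user.age).__name__, user.age)

# invalid input: catch the error and inspect which field failed
errors = []
try:
    User(name="alice", age="not a number")
except ValidationError as exc:
    errors = exc.errors()
    for err in errors:
        print(err["loc"], err["msg"])
```

`loc` is a tuple naming the path to the bad value, which is what makes these errors useful to return from an API boundary.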
settings from environment#
the most common use: replacing os.getenv() calls with validated configuration.
from pathlib import Path

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """settings for atproto cli."""

    model_config = SettingsConfigDict(
        env_file=str(Path.cwd() / ".env"),
        env_file_encoding="utf-8",
        extra="ignore",
        case_sensitive=False,
    )

    atproto_pds_url: str = Field(
        default="https://bsky.social",
        description="PDS URL",
    )
    atproto_handle: str = Field(default="", description="Your atproto handle")
    atproto_password: str = Field(default="", description="Your atproto app password")

settings = Settings()
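model_config controls where settings come from (environment variables, .env files) and how unknown keys are handled (extra="ignore" drops them). every field in this example has a default, so it always loads. a required field without a default raises a ValidationError when Settings() is instantiated; because settings = Settings() sits at module level, that happens at import time - not later when you try to use the value.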
annotated types for reusable validation#
when multiple schemas share the same validation logic, bind it to the type itself instead of repeating @field_validator on each schema:
from datetime import timedelta
from typing import Annotated

from pydantic import AfterValidator, BaseModel

def _validate_non_negative_timedelta(v: timedelta) -> timedelta:
    if v < timedelta(seconds=0):
        raise ValueError("timedelta must be non-negative")
    return v

NonNegativeTimedelta = Annotated[
    timedelta,
    AfterValidator(_validate_non_negative_timedelta)
]

class RunDeployment(BaseModel):
    schedule_after: NonNegativeTimedelta
benefits:
- write validation once
- field types become swappable interfaces
- types are self-documenting
from coping with python's type system
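to make the "write validation once" point concrete, here is a sketch that reuses the same annotated type in two schemas. `PauseRun` and its `pause_for` field are hypothetical names added for illustration; everything else mirrors the example above:

```python
from datetime import timedelta
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


def _validate_non_negative_timedelta(v: timedelta) -> timedelta:
    if v < timedelta(seconds=0):
        raise ValueError("timedelta must be non-negative")
    return v


NonNegativeTimedelta = Annotated[
    timedelta,
    AfterValidator(_validate_non_negative_timedelta),
]


class RunDeployment(BaseModel):
    schedule_after: NonNegativeTimedelta


class PauseRun(BaseModel):  # hypothetical second schema
    pause_for: NonNegativeTimedelta


# a valid value passes through unchanged
ok = RunDeployment(schedule_after=timedelta(minutes=5))

# both schemas reject a negative value without declaring their own validator
rejected = []
for model, field in ((RunDeployment, "schedule_after"), (PauseRun, "pause_for")):
    try:
        model(**{field: timedelta(seconds=-1)})
    except ValidationError as exc:
        rejected.append(exc.errors()[0]["loc"])

print(ok.schedule_after, rejected)
```

neither model defines a `@field_validator`; the rule travels with the type.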
model_validator for side effects#
run setup code when settings load:
from typing import Self  # python 3.11+

from pydantic import model_validator
from pydantic_settings import BaseSettings

# setup_logging is assumed to be defined elsewhere in the application

class Settings(BaseSettings):
    debug: bool = False

    @model_validator(mode="after")
    def configure_logging(self) -> Self:
        setup_logging(debug=self.debug)
        return self

settings = Settings()  # logging configured on import
the validator runs after all fields are set. use for side effects that depend on configuration values.
from bot/config.py
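a runnable sketch of the same pattern. the `setup_logging` body here is a hypothetical stand-in (the original is not shown), and the model uses `BaseModel` rather than `BaseSettings` only to keep dependencies light; `BaseSettings` subclasses `BaseModel`, so the decorator is applied the same way:

```python
import logging

from pydantic import BaseModel, model_validator


def setup_logging(debug: bool) -> None:
    # hypothetical stand-in for the application's real logging setup
    level = logging.DEBUG if debug else logging.INFO
    logging.getLogger("demo").setLevel(level)


class DemoSettings(BaseModel):
    debug: bool = False

    @model_validator(mode="after")
    def configure_logging(self) -> "DemoSettings":
        # runs once every field has been validated, so self.debug is a real bool
        setup_logging(debug=self.debug)
        return self


DemoSettings(debug=True)
level_after_debug = logging.getLogger("demo").level

DemoSettings()  # the side effect runs again on every instantiation
level_after_default = logging.getLogger("demo").level

print(level_after_debug, level_after_default)
```

one design consequence worth noting: the side effect fires on every instantiation, not just the first, so it should be safe to repeat.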
when to use what#
pydantic models are heavier than they look - they do a lot of work on instantiation. for internal data you control, python's dataclasses are simpler:
from dataclasses import dataclass

@dataclass
class BatchResult:
    """result of a batch operation."""

    successful: list[str]
    failed: list[tuple[str, Exception]]

    @property
    def total(self) -> int:
        return len(self.successful) + len(self.failed)
no validation, no coercion, just a class with fields. use pydantic at boundaries (API input, config files, external data) where you need validation. use dataclasses for internal data structures.
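a short standard-library sketch of what "no validation" means in practice. the sample values are made up; the second instance deliberately passes a string where a list is annotated to show that nothing stops it:

```python
from dataclasses import dataclass


@dataclass
class BatchResult:
    """result of a batch operation."""

    successful: list[str]
    failed: list[tuple[str, Exception]]

    @property
    def total(self) -> int:
        return len(self.successful) + len(self.failed)


# intended use: plain data you constructed yourself
result = BatchResult(successful=["a", "b"], failed=[("c", ValueError("boom"))])
print(result.total)

# wrong type for `successful`: accepted silently, stored as-is
sloppy = BatchResult(successful="oops", failed=[])
print(type(sloppy.successful).__name__, sloppy.total)
```

the second call counts the characters of the string instead of failing, which is fine for data you control and exactly the kind of bug pydantic catches at a boundary.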
sources: