
## git repo contents

Commands (run with, eg, `go run ./cmd/rainbow`):

- `cmd/bigsky`: relay daemon
- `cmd/relay`: new (sync v1.1) relay daemon
- `cmd/palomar`: search indexer and query service (OpenSearch)
- `cmd/gosky`: client CLI for talking to a PDS
- `cmd/lexgen`: codegen tool for lexicons (Lexicon JSON to Go package)
- `cmd/stress`: connects to local/default PDS and creates a ton of random posts
- `cmd/beemo`: Slack bot for moderation reporting (Bluesky Moderation Observer)
- `cmd/fakermaker`: helper to generate fake accounts and content for testing
- `cmd/supercollider`: event stream load generation tool
- `cmd/sonar`: event stream monitoring tool
- `cmd/hepa`: auto-moderation rule engine service
- `cmd/rainbow`: firehose fanout service
- `cmd/bluepages`: identity directory service
- `gen`: dev tool to run CBOR type codegen

Packages:

- `api`: mostly output of lexgen (codegen) for lexicons: structs and CBOR marshaling, plus some higher-level code and a PLC client (may rename)
  - `api/atproto`: generated types for the `com.atproto` lexicon
  - `api/agnostic`: variants of `com.atproto` types which work better with unknown lexicon data
  - `api/bsky`: generated types for the `app.bsky` lexicon
  - `api/chat`: generated types for the `chat.bsky` lexicon
  - `api/ozone`: generated types for the `tools.ozone` lexicon
- `atproto/crypto`: cryptographic helpers (signing, key generation and serialization)
- `atproto/syntax`: string types and parsers for identifiers, datetimes, etc
- `atproto/identity`: DID and handle resolution
- `atproto/data`: helpers for atproto data as JSON or CBOR with unknown schema
- `atproto/lexicon`: lexicon validation of generic data
- `atproto/repo`: repo and MST implementation
- `automod`: moderation and anti-spam rules engine
- `bgs`: relay server implementation for crawling, etc (used by the bigsky daemon)
- `carstore`: library for storing repo data in CAR files on disk, plus a metadata SQL db
- `events`: types, codegen CBOR helpers, and persistence for event feeds
- `indexer`: aggregator, handling like counts etc in SQL database
- `lex`: implements codegen for Lexicons (!)
- `models`: database types/models/schemas; shared in several places
- `mst`: merkle search tree implementation
- `notifs`: helpers for notification objects (hydration, etc)
- `pds`: PDS server implementation
- `plc`: implementation of a *fake* PLC server (not persisted), and a PLC client
- `repo`: implements the atproto repo on top of a blockstore; CBOR types
- `repomgr`: wraps many repos with a single carstore backend; handles events and locking
- `search`: search server implementation
- `testing`: integration tests; testing helpers
- `util`: a few common definitions (may rename)
- `xrpc`: XRPC client (not server) helpers
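
Several of these packages can be used directly from other Go programs. As a rough illustration, the following sketch resolves a handle using the `atproto/syntax` and `atproto/identity` packages; the `DefaultDirectory`/`LookupHandle` names and the `PDSEndpoint` helper are assumptions based on the current API and may drift:

    package main

    import (
        "context"
        "fmt"
        "log"

        "github.com/bluesky-social/indigo/atproto/identity"
        "github.com/bluesky-social/indigo/atproto/syntax"
    )

    func main() {
        ctx := context.Background()

        // validate and normalize the handle string (atproto/syntax)
        handle, err := syntax.ParseHandle("atproto.com")
        if err != nil {
            log.Fatal(err)
        }

        // resolve the handle to a DID and identity metadata (atproto/identity)
        dir := identity.DefaultDirectory()
        ident, err := dir.LookupHandle(ctx, handle)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(ident.DID, ident.PDSEndpoint())
    }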


## Jargon

- Relay: service which crawls/consumes content from "all" PDSs and re-broadcasts it as a firehose
- BGS: Big Graph Service, previous name for what is now "Relay"
- PDS: Personal Data Server (or Service), which stores user atproto repositories and acts as a user agent in the network
- CLI: Command Line Interface (tool)
- CBOR: a binary serialization format, similar to JSON
- PLC: "placeholder" DID provider, see <https://web.plc.directory>
- DID: Decentralized IDentifier, a flexible W3C specification for persistent identifiers in URI form (eg, `did:plc:abcd1234`)
- XRPC: atproto convention for HTTP GET and POST endpoints specified by namespaced Lexicon schemas
- CAR: simple file format for storing binary content-addressed blocks/blobs, sort of like .tar files
- CID: content identifier for binary blobs, basically a flexible encoding of hash values
- MST: Merkle Search Tree, a key/value map data structure using content-addressed nodes
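
To make a couple of these terms (XRPC, DID, handle) concrete, here is a minimal sketch of a single XRPC query using the `xrpc` client package and the lexgen-generated `api/atproto` bindings; the generated function name and the public API host are assumptions:

    package main

    import (
        "context"
        "fmt"
        "log"

        comatproto "github.com/bluesky-social/indigo/api/atproto"
        "github.com/bluesky-social/indigo/xrpc"
    )

    func main() {
        ctx := context.Background()

        // XRPC client pointed at some host; swap in a local PDS as needed
        client := &xrpc.Client{Host: "https://public.api.bsky.app"}

        // com.atproto.identity.resolveHandle is an unauthenticated GET ("query") endpoint
        out, err := comatproto.IdentityResolveHandle(ctx, client, "atproto.com")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("DID:", out.Did)
    }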


## Lexicon and CBOR code generation

`gen/main.go` has a list of types internal to packages in this repo which need CBOR helper codegen. If you edit those types, or update the listed types/packages, re-run codegen like:

    # make sure everything can build cleanly first
    make build

    # then generate
    go run ./gen
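
For orientation, the entries in `gen/main.go` are calls into the `cbor-gen` library (`github.com/whyrusleeping/cbor-gen`), roughly along these lines; the specific type and output path below are a trimmed-down illustration, not the actual list:

    package main

    import (
        "log"

        "github.com/bluesky-social/indigo/repo"
        cbg "github.com/whyrusleeping/cbor-gen"
    )

    func main() {
        // write MarshalCBOR/UnmarshalCBOR helpers for the listed types into
        // the target package's cbor_gen.go file
        if err := cbg.WriteMapEncodersToFile("repo/cbor_gen.go", "repo", repo.SignedCommit{}); err != nil {
            log.Fatal(err)
        }
    }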

To run codegen for new or updated Lexicons, using lexgen, first place (or git checkout) the JSON lexicon files at `../atproto/`. Then, in *this* repository (indigo), run commands like:

    go run ./cmd/lexgen/ --package bsky --prefix app.bsky --outdir api/bsky ../atproto/lexicons/app/bsky/
    go run ./cmd/lexgen/ --package atproto --prefix com.atproto --outdir api/atproto ../atproto/lexicons/com/atproto/

You may want to delete all the codegen files before re-generating, to detect deleted files.

Some manual fix-ups may be needed between the lexgen step and a later `go run ./gen`, to get everything compiling at least temporarily; otherwise `gen` will not run. In some cases, you might also need to add new types to `./gen/main.go`.

To generate server stubs and handlers, put them in a temporary directory first, then merge changes into the actual PDS code:

    mkdir tmppds
    go run ./cmd/lexgen/ --package pds --gen-server --types-import com.atproto:github.com/bluesky-social/indigo/api/atproto --types-import app.bsky:github.com/bluesky-social/indigo/api/bsky --outdir tmppds --gen-handlers ../atproto/lexicons


## Tips and Tricks

When debugging websocket streams, the `websocat` tool (Rust) can be helpful. CBOR binary is sort of mangled into text by default. Eg:

    # consume repo events from PDS
    websocat ws://localhost:4989/events

    # consume repo events from Relay
    websocat ws://localhost:2470/events
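
The same streams can also be consumed from Go. A minimal sketch using `github.com/gorilla/websocket`, which just reads raw frames without decoding the CBOR, using the same Relay endpoint as above:

    package main

    import (
        "fmt"
        "log"

        "github.com/gorilla/websocket"
    )

    func main() {
        // dial the Relay event stream (same endpoint as the websocat example)
        conn, _, err := websocket.DefaultDialer.Dial("ws://localhost:2470/events", nil)
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        for {
            // each message is a binary CBOR frame; just confirm frames arrive
            _, msg, err := conn.ReadMessage()
            if err != nil {
                log.Fatal(err)
            }
            fmt.Printf("got frame, %d bytes\n", len(msg))
        }
    }

For real consumers, the `events` package in this repo has the CBOR types and helpers for these event feeds.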

Send the Relay a ding-dong:

    # tell Relay to consume from PDS
    http --json post localhost:2470/add-target host="localhost:4989"

Set the log level to be more verbose, using an env variable:

    GOLOG_LOG_LEVEL=info go run ./cmd/pds


## `gosky` basic usage

Running against a local TypeScript PDS in `dev-env` mode:

    # as "alice" user
    go run ./cmd/gosky/ --pds-host http://localhost:2583 account create-session alice.test hunter2 > bsky.auth

The `bsky.auth` file is the default place that `gosky` and other client commands will look for auth info.
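
The same session creation can be done programmatically. A rough sketch using the `xrpc` client and the generated `api/atproto` bindings follows; the function and field names are assumptions based on the current codegen output:

    package main

    import (
        "context"
        "fmt"
        "log"

        comatproto "github.com/bluesky-social/indigo/api/atproto"
        "github.com/bluesky-social/indigo/xrpc"
    )

    func main() {
        ctx := context.Background()
        client := &xrpc.Client{Host: "http://localhost:2583"}

        // com.atproto.server.createSession, the same call `gosky account create-session` makes
        sess, err := comatproto.ServerCreateSession(ctx, client, &comatproto.ServerCreateSession_Input{
            Identifier: "alice.test",
            Password:   "hunter2",
        })
        if err != nil {
            log.Fatal(err)
        }

        // attach the returned tokens so later calls through this client are authenticated
        client.Auth = &xrpc.AuthInfo{
            AccessJwt:  sess.AccessJwt,
            RefreshJwt: sess.RefreshJwt,
            Handle:     sess.Handle,
            Did:        sess.Did,
        }
        fmt.Println("logged in as", client.Auth.Did)
    }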


## Integrated Development

Sometimes it is helpful to run a PLC, PDS, Relay, and other components all locally on your laptop, across languages. This section describes one setup for this.

First, you need PostgreSQL running locally. This could be via Docker; the following commands assume some kind of Debian/Ubuntu setup with a PostgreSQL server package installed and running.

Create a user and databases for PLC+PDS:

    # use 'yksb' as weak default password for local-only dev
    sudo -u postgres createuser -P -s bsky

    sudo -u postgres createdb plc_dev -O bsky
    sudo -u postgres createdb pds_dev -O bsky

If you end up needing to wipe the databases:

    sudo -u postgres dropdb plc_dev
    sudo -u postgres dropdb pds_dev

Check out the `did-method-plc` repo in one terminal and run:

    make run-dev-plc

Check out the `atproto` repo in another terminal and run:

    make run-dev-pds

In this repo (indigo), start a Relay in another terminal:

    make run-dev-relay

In a final terminal, run fakermaker to inject data into the system:

    # setup and create initial accounts; 100 by default
    mkdir data/fakermaker/
    export GOLOG_LOG_LEVEL=info
    go run ./cmd/fakermaker/ gen-accounts > data/fakermaker/accounts.json

    # create or update profiles for all the accounts
    go run ./cmd/fakermaker/ gen-profiles

    # create follow graph between accounts
    go run ./cmd/fakermaker/ gen-graph

    # create posts, including mentions and image uploads
    go run ./cmd/fakermaker/ gen-posts

    # create more interactions, such as likes, between accounts
    go run ./cmd/fakermaker/ gen-interactions

    # lastly, read-only queries, including timelines, notifications, and post threads
    go run ./cmd/fakermaker/ run-browsing