Porting all GitHub Actions from bluesky-social/indigo to tangled CI

## git repo contents

Commands (run with, eg, `go run ./cmd/rainbow`):

- `cmd/bigsky`: relay daemon
- `cmd/relay`: new (sync v1.1) relay daemon
- `cmd/palomar`: search indexer and query service (OpenSearch)
- `cmd/gosky`: client CLI for talking to a PDS
- `cmd/lexgen`: codegen tool for lexicons (Lexicon JSON to Go package)
- `cmd/stress`: connects to local/default PDS and creates a ton of random posts
- `cmd/beemo`: slack bot for moderation reporting (Bluesky Moderation Observer)
- `cmd/fakermaker`: helper to generate fake accounts and content for testing
- `cmd/supercollider`: event stream load generation tool
- `cmd/sonar`: event stream monitoring tool
- `cmd/hepa`: auto-moderation rule engine service
- `cmd/rainbow`: firehose fanout service
- `cmd/bluepages`: identity directory service
- `gen`: dev tool to run CBOR type codegen
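
A quick way to poke at any of these without reading the source first is to build everything and ask a command for its usage text. A minimal sketch (assuming the usual `--help` flag wiring, which these CLIs generally have; `make build` is the same target used in the codegen section below):

    # make sure all commands and packages compile
    make build

    # run any command directly; most print usage with --help
    go run ./cmd/gosky --help
    go run ./cmd/rainbow --help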

Packages:

- `api`: mostly output of lexgen (codegen) for lexicons: structs, CBOR marshaling. some higher-level code, and a PLC client (may rename)
    - `api/atproto`: generated types for `com.atproto` lexicon
    - `api/agnostic`: variants of `com.atproto` types which work better with unknown lexicon data
    - `api/bsky`: generated types for `app.bsky` lexicon
    - `api/chat`: generated types for `chat.bsky` lexicon
    - `api/ozone`: generated types for `tools.ozone` lexicon
- `atproto/crypto`: cryptographic helpers (signing, key generation and serialization)
- `atproto/syntax`: string types and parsers for identifiers, datetimes, etc
- `atproto/identity`: DID and handle resolution
- `atproto/data`: helpers for atproto data as JSON or CBOR with unknown schema
- `atproto/lexicon`: lexicon validation of generic data
- `atproto/repo`: repo and MST implementation
- `automod`: moderation and anti-spam rules engine
- `bgs`: relay server implementation for crawling, etc (for bigsky implementation)
- `carstore`: library for storing repo data in CAR files on disk, plus a metadata SQL db
- `events`: types, codegen CBOR helpers, and persistence for event feeds
- `indexer`: aggregator, handling like counts etc in SQL database
- `lex`: implements codegen for Lexicons (!)
- `models`: database types/models/schemas; shared in several places
- `mst`: merkle search tree implementation
- `notifs`: helpers for notification objects (hydration, etc)
- `pds`: PDS server implementation
- `plc`: implementation of a *fake* PLC server (not persisted), and a PLC client
- `repo`: implements atproto repo on top of a blockstore. CBOR types
- `repomgr`: wraps many repos with a single carstore backend; handles events, locking
- `search`: search server implementation
- `testing`: integration tests; testing helpers
- `util`: a few common definitions (may rename)
- `xrpc`: XRPC client (not server) helpers


## Jargon

- Relay: service which crawls/consumes content from "all" PDSs and re-broadcasts as a firehose
- BGS: Big Graph Service, previous name for what is now "Relay"
- PDS: Personal Data Server (or Service), which stores user atproto repositories and acts as a user agent in the network
- CLI: Command Line Interface (a command-line tool)
- CBOR: a binary serialization format, similar to JSON
- PLC: "placeholder" DID provider, see <https://web.plc.directory>
- DID: Decentralized IDentifier, a flexible W3C specification for persistent identifiers in URI form (eg, `did:plc:abcd1234`)
- XRPC: atproto convention for HTTP GET and POST endpoints specified by namespaced Lexicon schemas
- CAR: simple file format for storing binary content-addressed blocks/blobs, sort of like .tar files
- CID: content identifier for binary blobs, basically a flexible encoding of hash values
- MST: Merkle Search Tree, a key/value map data structure using content addressed nodes


## Lexicon and CBOR code generation

`gen/main.go` has a list of types internal to packages in this repo which need CBOR helper codegen. If you edit those types, or update the listed types/packages, re-run codegen like:

    # make sure everything can build cleanly first
    make build

    # then generate
    go run ./gen

To run codegen for new or updated Lexicons, using lexgen, first place (or git checkout) the JSON lexicon files at `../atproto/`. Then, in *this* repository (indigo), run commands like:

    go run ./cmd/lexgen/ --package bsky --prefix app.bsky --outdir api/bsky ../atproto/lexicons/app/bsky/
    go run ./cmd/lexgen/ --package atproto --prefix com.atproto --outdir api/atproto ../atproto/lexicons/com/atproto/

You may want to delete all the codegen files before re-generating, so that removed lexicons show up as deleted files.
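
One way to do that, as a rough sketch with illustrative paths (this assumes everything under `api/bsky/` is generated output; double-check with `git status` before committing, and see the manual-munging caveat below before expecting `go run ./gen` to succeed):

    # illustrative only: remove generated files for one package, then regenerate
    rm api/bsky/*.go
    go run ./cmd/lexgen/ --package bsky --prefix app.bsky --outdir api/bsky ../atproto/lexicons/app/bsky/
    go run ./gen

    # anything still deleted in git corresponds to a removed lexicon
    git status api/bsky/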

It can require some manual munging between the lexgen step and a later `go run ./gen` to make sure things compile, at least temporarily; otherwise `gen` will not run. In some cases, you might also need to add new types to `./gen/main.go`.

To generate server stubs and handlers, put them in a temporary directory first, then merge changes into the actual PDS code:

    mkdir tmppds
    go run ./cmd/lexgen/ --package pds --gen-server --types-import com.atproto:github.com/bluesky-social/indigo/api/atproto --types-import app.bsky:github.com/bluesky-social/indigo/api/bsky --outdir tmppds --gen-handlers ../atproto/lexicons


## Tips and Tricks

When debugging websocket streams, the `websocat` tool (rust) can be helpful. CBOR binary is sort of mangled into text by default. Eg:

    # consume repo events from PDS
    websocat ws://localhost:4989/events

    # consume repo events from Relay
    websocat ws://localhost:2470/events

Send the Relay a ding-dong:

    # tell Relay to consume from PDS
    http --json post localhost:2470/add-target host="localhost:4989"

Set the log level to be more verbose, using an env variable:

    GOLOG_LOG_LEVEL=info go run ./cmd/pds


## `gosky` basic usage

Running against a local typescript PDS in `dev-env` mode:

    # as "alice" user
    go run ./cmd/gosky/ --pds-host http://localhost:2583 account create-session alice.test hunter2 > bsky.auth

The `bsky.auth` file is the default place that `gosky` and other client commands will look for auth info.


## Integrated Development

Sometimes it is helpful to run a PLC, PDS, Relay, and other components all locally on your laptop, across languages. This section describes one setup for this.

First, you need PostgreSQL running locally. This could be via docker, or the following commands assume some kind of debian/ubuntu setup with a postgres server package installed and running.

Create a user and databases for PLC+PDS:

    # use 'yksb' as weak default password for local-only dev
    sudo -u postgres createuser -P -s bsky

    sudo -u postgres createdb plc_dev -O bsky
    sudo -u postgres createdb pds_dev -O bsky

If you end up needing to wipe the databases:

    sudo -u postgres dropdb plc_dev
    sudo -u postgres dropdb pds_dev

Check out the `did-method-plc` repo in one terminal and run:

    make run-dev-plc

Check out the `atproto` repo in another terminal and run:

    make run-dev-pds

In this repo (indigo), start a Relay in another terminal:

    make run-dev-relay

In a final terminal, run fakermaker to inject data into the system:

    # setup and create initial accounts; 100 by default
    mkdir data/fakermaker/
    export GOLOG_LOG_LEVEL=info
    go run ./cmd/fakermaker/ gen-accounts > data/fakermaker/accounts.json

    # create or update profiles for all the accounts
    go run ./cmd/fakermaker/ gen-profiles

    # create follow graph between accounts
    go run ./cmd/fakermaker/ gen-graph

    # create posts, including mentions and image uploads
    go run ./cmd/fakermaker/ gen-posts

    # create more interactions, such as likes, between accounts
    go run ./cmd/fakermaker/ gen-interactions

    # lastly, read-only queries, including timelines, notifications, and post threads
    go run ./cmd/fakermaker/ run-browsing
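
To sanity-check that the generated data is actually flowing from the PDS through to the Relay, you can leave a firehose consumer running in yet another terminal while fakermaker works; this reuses the `websocat` command from the Tips and Tricks section and assumes the default local Relay port used there:

    # you should see a steady stream of repo events while fakermaker runs
    websocat ws://localhost:2470/events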