# Parakeet

Parakeet is a Bluesky AppView that aims to implement most of the functionality required to support the Bluesky client. Notably, a CDN is not implemented.
## Status and Roadmap

Most common functionality works. Notable omissions: like/repost/follow viewer states are missing, blocks and mutes are not applied, labels might not track CIDs properly, and label redaction doesn't work at all (beware!).
Future work is tracked in issues, but the highlights are below. Help would be highly appreciated.
- Notifications
- Search
- Pinned Posts
- The Timeline
- Monitoring: metrics, tracing, and health checks.
## The Code
Parakeet is implemented in Rust, using Postgres as a database, Redis for caching and queue processing, RocksDB for aggregation, and Diesel for migrations and querying.
This repo is one big Rust workspace, containing nearly everything required to run and support the AppView.
### Packages
- consumer: Relay indexer, Label consumer, Backfiller. Takes raw records in from repos and stores them.
- dataloader-rs: A vendored fork of https://github.com/cksac/dataloader-rs, with some tweaks to fit caching requirements.
- did-resolver: A did:plc and did:web resolver using hickory and reqwest. Supports custom PLC directories.
- lexica: Rust types for the relevant Bluesky lexicons.
- parakeet: The core AppView server code. Using Axum and Diesel.
- parakeet-db: Database types and models, also the Diesel schema.
- parakeet-index: Stats aggregator based on RocksDB. Uses gRPC with tonic.
- parakeet-lexgen: A WIP code generator for Lexicon in Rust. Not in use.
There is also a dependency on a fork of jsonwebtoken until upstream supports ES256K.
## Running
Prebuilt docker images are published (semi) automatically by GitLab CI at https://gitlab.com/parakeet-social/parakeet.
Use registry.gitlab.com/parakeet-social/parakeet/[package]:[branch] in your docker-compose.yml. There is currently no versioning until the project is more stable (sorry).
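For example, pulling the prebuilt images might look like this. The branch tag (`main` here) is an assumption; substitute whichever branch you want to track.

```shell
# Pull the three runtime images from the GitLab registry.
# "main" is an assumed branch name -- use the branch you actually track.
docker pull registry.gitlab.com/parakeet-social/parakeet/consumer:main
docker pull registry.gitlab.com/parakeet-social/parakeet/parakeet:main
docker pull registry.gitlab.com/parakeet-social/parakeet/parakeet-index:main
```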
You can also just build with cargo.
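Building from source is a standard cargo workspace build; a minimal sketch, assuming you build the three runtime packages named above from the repo root:

```shell
# Build the runtime binaries in release mode from the workspace root.
cargo build --release -p consumer -p parakeet -p parakeet-index
# The resulting binaries land in target/release/.
```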
To run, you'll need Postgres (version 16 or higher), Redis (or a Redis-compatible server), plus the consumer, parakeet, and parakeet-index services.
## Configuring

There are quite a lot of environment variables, although sensible defaults are provided when possible. Variables are prefixed by PK, PKC, or PKI depending on whether they're used in Parakeet, Consumer, or parakeet-index, respectively. Some are common to two or three parts, and are marked accordingly.
| Variable | Default | Description |
|---|---|---|
| (PK/PKC)_INDEX_URI | n/a | Required. URI of the parakeet-index instance in format [host]:[port] |
| (PK/PKC)_REDIS_URI | n/a | Required. URI of Redis (or compatible) in format redis://[host]:[port] |
| (PK/PKC)_PLC_DIRECTORY | https://plc.directory | Optional. A PLC mirror or different instance to use when resolving did:plc. |
| PKC_DATABASE__URL | n/a | Required. URI of Postgres in format postgres://[user]:[pass]@[host]:[port]/[db] |
| PKC_UA_CONTACT | n/a | Recommended. Some contact details (email / bluesky handle / website) to add to User-Agent. |
| PKC_LABEL_SOURCE | n/a | Required if consuming Labels. A labeler or label relay to consume. |
| PKC_RESUME_PATH | n/a | Required if consuming relay or label firehose. Where to store the cursor data. |
| PKC_INDEXER__RELAY_SOURCE | n/a | Required if consuming relay. Relay to consume from. |
| PKC_INDEXER__HISTORY_MODE | n/a | Required if consuming relay. backfill_history or realtime, depending on whether you plan to backfill when consuming record data from a relay. |
| PKC_INDEXER__INDEXER_WORKERS | 4 | How many workers to spread indexing work between. 4 or 6 usually works depending on load. Ensure you have enough DB connections available. |
| PKC_INDEXER__START_COMMIT_SEQ | n/a | Optionally, the relay sequence to start consuming from. Overridden by the data in PKC_RESUME_PATH, so clear that first if you reset. |
| PKC_INDEXER__SKIP_HANDLE_VALIDATION | false | Whether the indexer should SKIP validating handles from #identity events. |
| PKC_INDEXER__REQUEST_BACKFILL | false | Whether the indexer should request backfill when relevant. Only applies when HISTORY_MODE is backfill_history. You likely want TRUE, unless you're manually controlling backfill queues. |
| PKC_BACKFILL__WORKERS | 4 | How many workers to use when backfilling into the DB. Ensure you have enough DB connections available as one is created per worker. |
| PKC_BACKFILL__SKIP_AGGREGATION | false | Whether to skip sending aggregation to parakeet-index. Does not remove the index requirement. Useful when developing. |
| PKC_BACKFILL__DOWNLOAD_WORKERS | 25 | How many workers to use to download repos for backfilling. |
| PKC_BACKFILL__DOWNLOAD_BUFFER | 25000 | How many repos to download and queue. |
| PKC_BACKFILL__DOWNLOAD_TMP_DIR | n/a | Where to download repos to. Ensure there is enough space. |
| (PK/PKI)_SERVER__BIND_ADDRESS | 0.0.0.0 | Address for the server to bind to. For index outside of Docker, you probably want loopback as there is no auth. |
| (PK/PKI)_SERVER__PORT | PK: 6000, PKI: 6001 | Port for the server to bind to. |
| (PK/PKI)_DATABASE_URL | n/a | Required. URI of Postgres in format postgres://[user]:[pass]@[host]:[port]/[db] |
| PK_SERVICE__DID | n/a | DID for the AppView, as did:web (did:plc is possible but untested). |
| PK_SERVICE__PUBLIC_KEY | n/a | Public key for the AppView. Unsure if actually used, but may be required by PDS. |
| PK_SERVICE__ENDPOINT | n/a | HTTPS publicly accessible endpoint for the AppView. |
| PK_TRUSTED_VERIFIERS | n/a | Optional. Trusted verifiers to use. To pass multiple values, join them with commas. |
| PK_CDN__BASE | https://cdn.bsky.app | Optional. Base URL for a Bluesky-compatible CDN. |
| PK_CDN__VIDEO_BASE | https://video.bsky.app | Optional. Base URL for a Bluesky-compatible video CDN. |
| PK_DID_ALLOWLIST | n/a | Optional. If set, controls which DIDs can access the AppView. To pass multiple values, join them with commas. |
| PK_MIGRATE | false | Set to TRUE to run database migrations automatically on start. |
| PKI_INDEX_DB_PATH | n/a | Required. Location to store the index database. |
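Putting the table together, a minimal single-host configuration might look like the sketch below. Hostnames, credentials, paths, and the relay endpoint are placeholders, not recommendations; adjust them for your deployment.

```shell
# --- consumer ---
export PKC_DATABASE__URL=postgres://parakeet:secret@localhost:5432/parakeet
export PKC_REDIS_URI=redis://localhost:6379
export PKC_INDEX_URI=localhost:6001
export PKC_RESUME_PATH=/var/lib/parakeet/cursor
export PKC_INDEXER__RELAY_SOURCE=wss://bsky.network   # placeholder relay endpoint
export PKC_INDEXER__HISTORY_MODE=realtime

# --- parakeet (AppView server) ---
export PK_DATABASE_URL=postgres://parakeet:secret@localhost:5432/parakeet
export PK_REDIS_URI=redis://localhost:6379
export PK_INDEX_URI=localhost:6001
export PK_SERVICE__DID=did:web:appview.example.com
export PK_SERVICE__ENDPOINT=https://appview.example.com
export PK_MIGRATE=true   # run migrations automatically on start

# --- parakeet-index ---
export PKI_SERVER__BIND_ADDRESS=127.0.0.1   # loopback: index has no auth
export PKI_INDEX_DB_PATH=/var/lib/parakeet/index
```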