Parakeet

Parakeet is a Bluesky AppServer aiming to implement most of the functionality required to support the Bluesky client. Notably, a CDN is not implemented.

Status and Roadmap

Most common functionality works. Future work is tracked in issues and on the Skyboard, but the highlights are below. Help would be highly appreciated.

  • Notifications
  • Search
  • Control Panel

The Code

Parakeet is implemented in Rust, using Postgres as a database, Redis for caching and queue processing, RocksDB for aggregation, and Diesel for migrations and querying.

This repo is one big Rust workspace, containing nearly everything required to run and support the AppServer.

Packages

  • consumer: Relay indexer, Label consumer, Backfiller. Takes raw records in from repos and stores them.
  • dataloader-rs: a vendored fork of https://github.com/cksac/dataloader-rs, with some tweaks to fit caching requirements.
  • lexica: Rust types for the relevant lexicons[sic] for Bluesky.
  • parakeet: The core AppServer code, built with Axum and Diesel.
  • parakeet-db: Database types and models, also the Diesel schema.
  • parakeet-index: Stats aggregator based on RocksDB. Uses gRPC with tonic.

There is also a dependency on a fork of jsonwebtoken until upstream supports ES256K.

Running

The "most supported" way is currently Nix (see below for more info). The Docker images may not currently build!

Prebuilt Docker images are published (semi-)automatically by GitLab CI at https://gitlab.com/parakeet-social/parakeet. Use registry.gitlab.com/parakeet-social/parakeet/[package]:[branch] in your docker-compose.yml. There is currently no versioning until the project is more stable (sorry). You can also just build with cargo.
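
As a small illustration of the image naming scheme above, the sketch below assembles a reference from a package name and a branch. The helper name `image_ref` and the branch name `main` in the usage note are assumptions for illustration only; check the registry for the packages and branches that are actually published.

```rust
/// Registry path quoted in this README.
const REGISTRY: &str = "registry.gitlab.com/parakeet-social/parakeet";

/// Hypothetical helper: builds `[registry]/[package]:[branch]`.
/// It does no validation, so the caller must pass a tag that exists.
pub fn image_ref(package: &str, branch: &str) -> String {
    format!("{}/{}:{}", REGISTRY, package, branch)
}
```

For example, `image_ref("consumer", "main")` yields `registry.gitlab.com/parakeet-social/parakeet/consumer:main`, assuming a `main` branch image has been published.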

To run, you'll need Postgres (version 16 or higher), Redis (or a Redis-compatible server), consumer, parakeet, and parakeet-index.

Nix

A Nix Flake is provided, so you can add git+https://tangled.org/parakeet.at/parakeet to your flake.nix inputs.

Packages parakeet-appview (crates/parakeet), parakeet-consumer (crates/consumer), and parakeet-index (crates/parakeet-index) are provided, and also systemd services for NixOS. You can see an example in Mia's dotfiles.

The flake is configured to set up a basic environment using nix develop (currently only cargo and friends, not Postgres).

Configuring

There are quite a lot of environment variables, although sensible defaults are provided when possible. Variables are prefixed by PK, PKC, or PKI depending on whether they're used in Parakeet, Consumer, or parakeet-index, respectively. Some are common to two or three parts, and are marked accordingly.
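
The naming pattern can be sketched in a few lines of Rust. This is an illustration inferred from the variable names listed in this section, not code from the repository: the `Component` enum and `env_key` helper are hypothetical, and the reading of `__` as a separator between nested config sections is an assumption.

```rust
/// Which part of the stack a variable belongs to (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Component {
    Parakeet,
    Consumer,
    Index,
}

impl Component {
    /// Environment variable prefix, as described in this README.
    pub fn prefix(self) -> &'static str {
        match self {
            Component::Parakeet => "PK",
            Component::Consumer => "PKC",
            Component::Index => "PKI",
        }
    }
}

/// Builds a variable name from a config path (hypothetical helper).
/// Assumption: `_` joins the prefix to the key, and `__` separates
/// nested sections such as `indexer` and `relay_source`.
pub fn env_key(component: Component, path: &[&str]) -> String {
    let key = path
        .iter()
        .map(|segment| segment.to_ascii_uppercase())
        .collect::<Vec<_>>()
        .join("__");
    format!("{}_{}", component.prefix(), key)
}
```

Under these assumptions, `env_key(Component::Consumer, &["indexer", "relay_source"])` produces `PKC_INDEXER__RELAY_SOURCE`.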

| Variable | Default | Description |
| --- | --- | --- |
| `(PK/PKC)_INDEX_URI` | n/a | Required. URI of the parakeet-index instance in format `[host]:[port]`. |
| `(PK/PKC)_REDIS_URI` | n/a | Required. URI of Redis (or compatible) in format `redis://[host]:[port]`. |
| `(PK/PKC)_PLC_DIRECTORY` | `https://plc.directory` | Optional. A PLC mirror or different instance to use when resolving did:plc. |
| `PKC_DATABASE__URL` | n/a | Required. URI of Postgres in format `postgres://[user]:[pass]@[host]:[port]/[db]`. |
| `PKC_UA_CONTACT` | n/a | Recommended. Some contact details (email / bluesky handle / website) to add to User-Agent. |
| `PKC_LABEL_SOURCE` | n/a | Required if consuming Labels. A labeler or label relay to consume. |
| `PKC_RESUME_PATH` | n/a | Required if consuming relay or label firehose. Where to store the cursor data. |
| `PKC_INDEXER__RELAY_SOURCE` | n/a | Required if consuming relay. Relay to consume from. |
| `PKC_INDEXER__HISTORY_MODE` | n/a | Required if consuming relay. `backfill_history` or `realtime`, depending on whether you plan to backfill when consuming record data from a relay. |
| `PKC_INDEXER__INDEXER_WORKERS` | 4 | How many workers to spread indexing work between. 4 or 6 usually works depending on load. Ensure you have enough DB connections available. |
| `PKC_INDEXER__START_COMMIT_SEQ` | n/a | Optionally, the relay sequence to start consuming from. Overridden by the data in `PKC_RESUME_PATH`, so clear that first if you reset. |
| `PKC_INDEXER__SKIP_HANDLE_VALIDATION` | false | Whether the indexer should skip validating handles from `#identity` events. |
| `PKC_INDEXER__REQUEST_BACKFILL` | false | Whether the indexer should request backfill when relevant. Only applies when `backfill_history` is set. You likely want `TRUE`, unless you're manually controlling backfill queues. |
| `PKC_BACKFILL__WORKERS` | 4 | How many workers to use when backfilling into the DB. Ensure you have enough DB connections available, as one is created per worker. |
| `PKC_BACKFILL__SKIP_AGGREGATION` | false | Whether to skip sending aggregation to parakeet-index. Does not remove the index requirement. Useful when developing. |
| `PKC_BACKFILL__DOWNLOAD_WORKERS` | 25 | How many workers to use to download repos for backfilling. |
| `PKC_BACKFILL__DOWNLOAD_BUFFER` | 25000 | How many repos to download and queue. |
| `PKC_BACKFILL__DOWNLOAD_TMP_DIR` | n/a | Where to download repos to. Ensure there is enough space. |
| `PKC_METRICS_PORT` | 9000 | Port to bind to for Prometheus metrics in Consumer. |
| `(PK/PKI)_SERVER__BIND_ADDRESS` | `0.0.0.0` | Address for the server to bind to. When running parakeet-index outside of Docker, you probably want loopback, as there is no auth. |
| `(PK/PKI)_SERVER__PORT` | PK: 6000, PKI: 6001 | Port for the server to bind to. |
| `(PK/PKI)_DATABASE_URL` | n/a | Required. URI of Postgres in format `postgres://[user]:[pass]@[host]:[port]/[db]`. |
| `PK_SERVICE__DID` | n/a | DID for the AppServer in did:web. (did:plc is possible but untested.) |
| `PK_SERVICE__PUBLIC_KEY` | n/a | Public key for the AppServer. Unsure if actually used, but may be required by PDS. |
| `PK_SERVICE__ENDPOINT` | n/a | HTTPS publicly accessible endpoint for the AppServer. |
| `PK_TRUSTED_VERIFIERS` | n/a | Optionally, trusted verifiers to use. To list several, join them with `,`. |
| `PK_CDN__BASE` | `https://cdn.bsky.app` | Optionally, base URL for a Bluesky compatible CDN. |
| `PK_CDN__VIDEO_BASE` | `https://video.bsky.app` | Optionally, base URL for a Bluesky compatible video CDN. |
| `PK_DID_ALLOWLIST` | n/a | Optional. If set, controls which DIDs can access the AppServer. To list several, join them with `,`. |
| `PK_MIGRATE` | false | Set to `TRUE` to run database migrations automatically on start. |
| `PKI_INDEX_DB_PATH` | n/a | Required. Location to store the index database. |
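
Before starting a service, it can help to check the environment against the list above. The sketch below is an illustration only, not Parakeet's actual config loader: `HistoryMode`, `split_list`, and `missing_required` are hypothetical names, and the trimming and exact-match behaviour are assumptions rather than documented behaviour.

```rust
use std::str::FromStr;

/// The two values documented for PKC_INDEXER__HISTORY_MODE.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HistoryMode {
    BackfillHistory,
    Realtime,
}

impl FromStr for HistoryMode {
    type Err = String;

    /// Assumption: values are matched exactly after trimming whitespace.
    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw.trim() {
            "backfill_history" => Ok(HistoryMode::BackfillHistory),
            "realtime" => Ok(HistoryMode::Realtime),
            other => Err(format!("unknown history mode: {:?}", other)),
        }
    }
}

/// Splits a comma-joined value such as PK_TRUSTED_VERIFIERS or
/// PK_DID_ALLOWLIST, dropping empty entries.
pub fn split_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Returns the names in `required` for which `lookup` finds no value.
/// Passing the lookup as a closure keeps the check testable without
/// touching the real process environment.
pub fn missing_required<'a>(
    required: &[&'a str],
    lookup: impl Fn(&str) -> Option<String>,
) -> Vec<&'a str> {
    required
        .iter()
        .copied()
        .filter(|&name| lookup(name).is_none())
        .collect()
}
```

Against the real environment, the lookup can be `|key| std::env::var(key).ok()`. For the consumer, for example, `missing_required(&["PKC_INDEX_URI", "PKC_REDIS_URI", "PKC_DATABASE__URL"], |key| std::env::var(key).ok())` returns whichever of those three are unset.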