Lycan 🐺#

A service which downloads and indexes the Bluesky posts you've liked, reposted, quoted or bookmarked, and allows you to search in that archive.

How it works#

Lycan is kind of like a tiny specialized AppView, which only indexes some specific things from some specific people. To avoid having to keep a full-network AppView, it only indexes posts and likes on demand from people who request to use it. So the first time you want to use it, you need to ask it to run an import process, which can take anywhere from a few minutes to an hour, depending on how much data there is to download. After that, new likes are indexed live from the firehose.
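To make the on-demand idea concrete, here is a minimal sketch of how a firehose consumer might skip events from accounts that nobody has imported. This is not Lycan's actual code: the `TRACKED_COLLECTIONS` constant and the `relevant_event?` helper are hypothetical names, and the event layout assumed here is the JSON that Jetstream emits for commit events.

```ruby
require 'json'
require 'set'

# Record types this sketch cares about: likes, reposts, and posts
# (quotes are ordinary post records that embed another post).
TRACKED_COLLECTIONS = Set[
  'app.bsky.feed.like',
  'app.bsky.feed.repost',
  'app.bsky.feed.post'
].freeze

# Returns true if a Jetstream-style event is a newly created record in a
# tracked collection, written by one of the accounts that requested an import.
# `tracked_dids` is any collection of DID strings responding to #include?.
def relevant_event?(event, tracked_dids)
  return false unless event['kind'] == 'commit'

  commit = event['commit'] || {}
  commit['operation'] == 'create' &&
    TRACKED_COLLECTIONS.include?(commit['collection']) &&
    tracked_dids.include?(event['did'])
end
```

With a filter like this, an event parsed with `JSON.parse` is dropped right away unless its `did` belongs to a known user, which is what keeps the index small compared to a full-network AppView.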

At the moment, Lycan indexes four types of content:

posts you've liked
posts you've reposted
posts you've quoted
your old-style bookmarks (using the 📌 emoji method)

New bookmarks are private data, so they can't be imported until support for OAuth is added.

Lycan is written in Ruby, using Sinatra and ActiveRecord, with Postgres as the database. The official instance runs at lycan.feeds.blue (this service only implements an XRPC API – the UI is implemented as part of Skythread).

The service consists of three separate components:

a firehose client, which streams events from a relay/Jetstream and saves new data for the users whose data is/has been imported
a background worker, which runs the import process
an HTTP server, which serves the XRPC endpoints (currently there are 3: startImport, getImportStatus and searchPosts, plus a did.json); all the endpoints require service authentication through PDS proxying
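Since every endpoint requires service authentication through PDS proxying, a client does not call Lycan directly. It sends the request to the user's own PDS and names Lycan in the `atproto-proxy` header. The sketch below only builds such a request, without sending it. Several values are assumptions made for illustration and should be checked against the Lycan source: the `blue.feeds.lycan` namespace, the `#lycan` service id, and the `query` parameter name used in the usage note.

```ruby
require 'net/http'
require 'uri'

# Assumptions for illustration only, not confirmed by this README.
LYCAN_SERVICE = 'did:web:lycan.feeds.blue#lycan' # hypothetical service id
NSID_PREFIX   = 'blue.feeds.lycan'               # hypothetical lexicon namespace

# Builds (but does not send) a GET request addressed to the user's PDS.
# The atproto-proxy header asks the PDS to forward the call to the named
# service and to attach a service-auth token on the user's behalf.
def build_proxied_request(pds_host, access_token, method_name, params = {})
  uri = URI::HTTPS.build(
    host: pds_host,
    path: "/xrpc/#{NSID_PREFIX}.#{method_name}",
    query: params.empty? ? nil : URI.encode_www_form(params)
  )

  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{access_token}"
  request['atproto-proxy'] = LYCAN_SERVICE
  request
end
```

For example, `build_proxied_request('pds.example.com', token, 'searchPosts', { query: 'cats' })` returns a request that could then be sent with `Net::HTTP.start('pds.example.com', 443, use_ssl: true) { |http| http.request(request) }`. The sketch covers GET-style calls only; an endpoint that changes state, such as startImport, would presumably need a POST instead.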

Setting up on localhost#

This app should run on any somewhat recent version of Ruby, though of course it's recommended to run one that's still getting maintenance updates, ideally the latest one. It's also recommended to install it with YJIT support, and on Linux also with jemalloc. You will probably need to have some familiarity with the Ruby ecosystem in order to set it up and run it.

A Postgres database is also required (again, any non-ancient version should work).

Download or clone the repository, then install the dependencies:

bundle install

Next, create the database. The configuration is defined in config/database.yml; for development, the database name is lycan_development. Create it either manually, or with a rake task:

bundle exec rake db:create

Then, run the migrations:

bundle exec rake db:migrate

To run an import, you will need to run three separate processes, probably in separate terminal tabs:

the firehose client, bin/firehose
the background worker, bin/worker
the Sinatra HTTP server, bin/server
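If juggling three terminal tabs gets tedious, a small launcher script is one alternative. The following is only a sketch and not part of the repository: the `COMMANDS` table simply repeats the three commands listed above, and the `spawn_all` and `stop_all` helpers are hypothetical names.

```ruby
# The three commands from the list above, keyed by a label of our choosing.
COMMANDS = {
  'firehose' => ['bin/firehose'],
  'worker'   => ['bin/worker'],
  'server'   => ['bin/server']
}.freeze

# Starts each command as a child process and returns a hash of label => pid.
def spawn_all(commands)
  commands.to_h { |name, argv| [name, Process.spawn(*argv)] }
end

# Sends TERM to every child, then waits for each one to exit.
# Returns a hash of label => Process::Status.
def stop_all(pids)
  pids.each_value do |pid|
    Process.kill('TERM', pid)
  rescue Errno::ESRCH
    # The process is already gone, nothing to signal.
  end
  pids.transform_values { |pid| Process.wait2(pid).last }
end
```

Run from the repository root, `pids = spawn_all(COMMANDS)` starts all three processes with their output interleaved in one terminal, and `stop_all(pids)` shuts them down again. Separate tabs remain the easier option when you want to read each log on its own.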

The UI can be accessed through Skythread, either on the official site at skythread.mackuba.eu, or from a copy you can download from the repo. Log in and open "Archive search" from the account menu – but importantly, to use the localhost Lycan instance, add &lycan=local to the URL.

You should then be able to start an import from there, and see the worker process printing some logs as it starts to download the data. (The firehose process needs to be running too, because the import job needs to pass through it first.)

Configuration#

There are a few things you can configure through ENV variables:

RELAY_HOST – hostname of the relay to use for the firehose (default: bsky.network)
JETSTREAM_HOST – alternatively, instead of RELAY_HOST, set this to a hostname of a Jetstream instance
FIREHOSE_USER_AGENT – when running in production, it's recommended that you set this to some name that identifies who is running the service
APPVIEW_HOST – hostname of the AppView used to download posts (default: public.api.bsky.app)
SERVER_HOSTNAME – hostname of the server on which you're running the service in production
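As a rough illustration of how these variables and their defaults fit together, here is a sketch of a settings reader. It is not taken from Lycan's code: the `firehose_settings` helper and the layout of the returned hash are made up for this example, and letting JETSTREAM_HOST win when both hosts are set is an assumption based on the "instead of RELAY_HOST" wording above.

```ruby
# Reads the settings from an ENV-like object (ENV itself, or a plain Hash
# with string keys). The defaults are the ones listed above.
def firehose_settings(env = ENV)
  source =
    if env['JETSTREAM_HOST']
      # Assumption: a Jetstream host takes precedence over a relay host.
      { type: :jetstream, host: env['JETSTREAM_HOST'] }
    else
      { type: :relay, host: env.fetch('RELAY_HOST', 'bsky.network') }
    end

  source.merge(
    user_agent: env['FIREHOSE_USER_AGENT'],   # nil when not set
    appview_host: env.fetch('APPVIEW_HOST', 'public.api.bsky.app'),
    server_hostname: env['SERVER_HOSTNAME']   # nil when not set
  )
end
```

Passing a plain hash, as in `firehose_settings({ 'JETSTREAM_HOST' => 'jetstream.example.com' })`, makes it easy to try different combinations without touching the real environment.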

Rake tasks#

Some Rake tasks that might be useful:

bundle exec rake enqueue_user DID=did:plc:qweqwe

request an import of the given account (to be handled by firehose + worker)

bundle exec rake import_user DID=did:plc:qweqwe COLLECTION=likes/reposts/posts/all

run a complete import synchronously

bundle exec rake process_posts

process all previously queued and unfinished or failed items

Running in production#

This will depend heavily on where and how you prefer to run it. I'm using a Capistrano deploy config in config/deploy.rb to deploy to a VPS at lycan.feeds.blue. To use something like Docker or a service like Fly or Railway, you'll need to adapt the config for your specific setup.

On the server, you need to make sure that the firehose & worker processes are always running and are restarted if necessary. One way to do this (the one I'm using) is to write a systemd service config file and add it to /etc/systemd/system. To run the HTTP server, you need Nginx/Caddy/Apache and a Ruby app server – my recommendation is Nginx with either Passenger (runs your app automatically from Nginx) or something like Puma (needs to be started by e.g. systemd, like the firehose).

Credits#

The code is available under the terms of the zlib license (permissive, similar to MIT).

Bug reports and pull requests are welcome 😎
