# Lycan 🐺

A service which downloads and indexes the Bluesky posts you've liked, reposted, quoted or bookmarked, and allows you to search in that archive.


## How it works

Lycan is kind of like a tiny specialized AppView, which only indexes some specific things from some specific people. To avoid having to keep a full-network AppView, it only indexes posts and likes on demand, for people who request to use it. So the first time you want to use it, you need to ask it to run an import process, which can take anywhere from a few minutes to an hour, depending on how much data there is to download. After that, new likes are indexed live from the firehose.

At the moment, Lycan indexes four types of content:

- posts you've liked
- posts you've reposted
- posts you've quoted
- your old-style bookmarks (using the 📌 emoji method)

New bookmarks are private data, so at the moment they can't be imported until support for OAuth is added.
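
As a rough illustration of the content types above, here is a minimal sketch of how a record from the firehose could be sorted into one of them. The collection and embed names are the standard `app.bsky.*` ones, but the `classify` and `relevant?` helpers are hypothetical, and the rule used for 📌 bookmarks (a reply whose text contains the emoji) is an assumption, not something taken from Lycan's code.

```ruby
require 'set'

# Record collections that can contain something worth indexing
TRACKED_COLLECTIONS = %w[
  app.bsky.feed.like
  app.bsky.feed.repost
  app.bsky.feed.post
].freeze

# Embed types that mark a post as a quote of another post
QUOTE_EMBEDS = %w[app.bsky.embed.record app.bsky.embed.recordWithMedia].freeze

# Returns :like, :repost, :quote or :pin, or nil if the record is not interesting.
def classify(collection, record)
  case collection
  when 'app.bsky.feed.like' then :like
  when 'app.bsky.feed.repost' then :repost
  when 'app.bsky.feed.post'
    if QUOTE_EMBEDS.include?(record.dig('embed', '$type'))
      :quote
    elsif record['reply'] && record['text'].to_s.include?('📌')
      # ASSUMPTION: an old-style bookmark is a reply containing the pin emoji
      :pin
    end
  end
end

# Only events from users who have requested an import are kept.
def relevant?(did, collection, tracked_dids)
  tracked_dids.include?(did) && TRACKED_COLLECTIONS.include?(collection)
end

tracked = Set['did:plc:qweqwe']
puts relevant?('did:plc:qweqwe', 'app.bsky.feed.like', tracked)
puts classify('app.bsky.feed.post', { 'text' => '📌', 'reply' => {} }).inspect
```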

Lycan is written in Ruby, using Sinatra and ActiveRecord, with Postgres as the database. The official instance runs at [lycan.feeds.blue](https://lycan.feeds.blue) (this service only implements an XRPC API – the UI is implemented as part of [Skythread](https://skythread.mackuba.eu)).

The service consists of three separate components:

- a **firehose client**, which streams events from a relay/Jetstream and saves new data for the users whose data is/has been imported
- a **background worker**, which runs the import process
- an **HTTP server**, which serves the XRPC endpoints (currently there are 3: `startImport`, `getImportStatus` and `searchPosts`, plus a `did.json`); all the endpoints require service authentication through PDS proxying
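
For context on the last point: with PDS proxying, a client sends the request to its own PDS with an `atproto-proxy` header naming the target service's DID and a service id. The PDS looks up that service in the DID document and forwards the call with a service auth token. The sketch below shows roughly what a `did:web` document and the matching header value could look like. The hostname, the service id `#lycan` and the type `LycanServer` are placeholders, not values taken from Lycan's code.

```ruby
require 'json'

# Builds a did:web document for a service hosted at the given hostname.
# NOTE: service_id and service_type are hypothetical placeholders.
def did_document(hostname, service_id: '#lycan', service_type: 'LycanServer')
  {
    '@context' => ['https://www.w3.org/ns/did/v1'],
    'id' => "did:web:#{hostname}",
    'service' => [
      {
        'id' => service_id,
        'type' => service_type,
        'serviceEndpoint' => "https://#{hostname}"
      }
    ]
  }
end

# Value of the `atproto-proxy` header a client would send to its PDS
# so that the request gets forwarded to the service above.
def proxy_header(hostname, service_id: '#lycan')
  "did:web:#{hostname}#{service_id}"
end

puts JSON.pretty_generate(did_document('lycan.example.com'))
puts proxy_header('lycan.example.com')
```

A `did:web` identifier is resolved by fetching `/.well-known/did.json` from the hostname, which is why the server needs to serve that file alongside the XRPC endpoints.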


## Setting up on localhost

This app should run on any somewhat recent version of Ruby, though of course it's recommended to run one that's still getting maintenance updates, ideally the latest one. It's also recommended to install it with [YJIT support](https://shopify.engineering/ruby-yjit-is-production-ready), and on Linux also with [jemalloc](https://scalingo.com/blog/improve-ruby-application-memory-jemalloc). You will probably need to have some familiarity with the Ruby ecosystem in order to set it up and run it.

A Postgres database is also required (again, any non-ancient version should work).

Download or clone the repository, then install the dependencies:

```
bundle install
```

Next, create the database. The configuration is defined in [`config/database.yml`](config/database.yml); for development, the database is named `lycan_development`. Create it either manually or with a rake task:

```
bundle exec rake db:create
```

Then, run the migrations:

```
bundle exec rake db:migrate
```

To run an import, you will need to run three separate processes, probably in separate terminal tabs:

1) the firehose client, [`bin/firehose`](bin/firehose)
2) the background worker, [`bin/worker`](bin/worker)
3) the Sinatra HTTP server, [`bin/server`](bin/server)

The UI can be accessed through Skythread, either on the official site at [skythread.mackuba.eu](https://skythread.mackuba.eu), or in a copy you can download [from the repo](https://tangled.org/mackuba.eu/skythread). Log in and open "[Archive search](https://skythread.mackuba.eu/?page=search&mode=likes)" from the account menu – but importantly, to use the `localhost` Lycan instance, add `&lycan=local` to the URL (on the official site, that gives `https://skythread.mackuba.eu/?page=search&mode=likes&lycan=local`).

You should then be able to start an import from there, and see the worker process printing some logs as it starts to download the data. (The firehose process needs to be running too, because the import job needs to pass through it first.)


## Configuration

There are a few things you can configure through ENV variables:

- `RELAY_HOST` – hostname of the relay to use for the firehose (default: `bsky.network`)
- `JETSTREAM_HOST` – alternatively, instead of `RELAY_HOST`, set this to a hostname of a [Jetstream](https://github.com/bluesky-social/jetstream) instance
- `FIREHOSE_USER_AGENT` – when running in production, it's recommended that you set this to some name that identifies who is running the service
- `APPVIEW_HOST` – hostname of the AppView used to download posts (default: `public.api.bsky.app`)
- `SERVER_HOSTNAME` – hostname of the server on which you're running the service in production
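
To make the defaults concrete, here is a minimal sketch of how these variables could be read in Ruby. The `load_config` helper and the keys of the returned hash are hypothetical, and giving `JETSTREAM_HOST` precedence when both hosts are set is an assumption. Only the variable names and the two default values come from the list above.

```ruby
# Reads the settings from an ENV-like object, applying the documented defaults.
def load_config(env = ENV)
  jetstream = env['JETSTREAM_HOST']

  {
    # ASSUMPTION: Jetstream is used whenever JETSTREAM_HOST is set
    source: jetstream ? :jetstream : :relay,
    firehose_host: jetstream || env.fetch('RELAY_HOST', 'bsky.network'),
    user_agent: env['FIREHOSE_USER_AGENT'],
    appview_host: env.fetch('APPVIEW_HOST', 'public.api.bsky.app'),
    server_hostname: env['SERVER_HOSTNAME']
  }
end

config = load_config
puts "Streaming from #{config[:firehose_host]} (#{config[:source]})"
puts "Fetching posts from #{config[:appview_host]}"
```

Passing the environment in as an argument keeps the helper easy to try out with a plain hash, e.g. `load_config('JETSTREAM_HOST' => 'jetstream.example.com')`.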


## Rake tasks

Some Rake tasks that might be useful:

```
bundle exec rake enqueue_user DID=did:plc:qweqwe
```

- request an import of the given account (to be handled by firehose + worker)

```
bundle exec rake import_user DID=did:plc:qweqwe COLLECTION=likes/reposts/posts/all
```

- run a complete import synchronously (set `COLLECTION` to one of `likes`, `reposts`, `posts` or `all`)

```
bundle exec rake process_posts
```

- process all previously queued and unfinished or failed items


## Running in production

This will probably depend heavily on where and how you prefer to run it. I'm using a Capistrano deploy config in [`config/deploy.rb`](config/deploy.rb) to deploy to a VPS at [lycan.feeds.blue](https://lycan.feeds.blue). To use something like Docker or a service like Fly or Railway, you'll need to adapt the config for your specific setup.

On the server, you need to make sure that the firehose & worker processes are always running and are restarted if necessary. One option (which I'm using) is to write a `systemd` service config file for each of them and add it to `/etc/systemd/system`. To run the HTTP server, you need Nginx/Caddy/Apache and a Ruby app server – my recommendation is Nginx with either Passenger (runs your app automatically from Nginx) or something like Puma (needs to be started by e.g. systemd, like the firehose).


## Credits

Copyright © 2025 Kuba Suder ([@mackuba.eu](https://bsky.app/profile/did:plc:oio4hkxaop4ao4wz2pp3f4cr)).

The code is available under the terms of the [zlib license](https://choosealicense.com/licenses/zlib/) (permissive, similar to MIT).

Bug reports and pull requests are welcome 😎