# Lycan 🐺

A service which downloads and indexes the Bluesky posts you've liked, reposted, quoted or bookmarked, and allows you to search in that archive.


## How it works

Lycan is kind of like a tiny specialized AppView, which only indexes some specific things from some specific people. To avoid having to keep a full-network AppView, it only indexes posts and likes on demand from people who request to use it. So the first time you want to use it, you need to ask it to run an import process, which can take anywhere from a few minutes to an hour, depending on how much data there is to download. After that, new likes are indexed live from the firehose.

At the moment, Lycan indexes four types of content:

- posts you've liked
- posts you've reposted
- posts you've quoted
- your old-style bookmarks (using the 📌 emoji method)

New bookmarks are private data, so they can't be imported until support for OAuth is added.
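To make these four categories more concrete, here is a minimal sketch of how a commit event from a Jetstream stream could be sorted into them. This is not Lycan's actual code: the `classify_event` and `embedded_uri` helpers and the `tracked_dids` set are made up for illustration, and the rule for 📌 bookmarks (a reply whose text contains the emoji) is an assumption about how the emoji method works. The collection and embed type names are the standard `app.bsky.*` lexicon ids.

```ruby
require 'set'

# Returns the AT URI of the record embedded in a post, or nil if there is none.
def embedded_uri(record)
  embed = record['embed'] || {}

  case embed['$type']
  when 'app.bsky.embed.record'          then embed.dig('record', 'uri')
  when 'app.bsky.embed.recordWithMedia' then embed.dig('record', 'record', 'uri')
  end
end

# Classifies a parsed Jetstream event as :like, :repost, :quote or :pin.
# Returns nil for anything that is not a newly created record of interest
# from one of the tracked accounts.
def classify_event(event, tracked_dids)
  return nil unless event['kind'] == 'commit' && tracked_dids.include?(event['did'])

  commit = event['commit'] || {}
  return nil unless commit['operation'] == 'create'

  record = commit['record'] || {}

  case commit['collection']
  when 'app.bsky.feed.like'   then :like
  when 'app.bsky.feed.repost' then :repost
  when 'app.bsky.feed.post'
    if embedded_uri(record).to_s.include?('/app.bsky.feed.post/')
      # Record embeds can also point at feeds or lists, so check the target type.
      :quote
    elsif record['reply'] && record['text'].to_s.include?('📌')
      # Assumption: an old-style bookmark is a reply containing the pin emoji.
      :pin
    end
  end
end

tracked = Set['did:plc:qweqwe']
event = {
  'did' => 'did:plc:qweqwe',
  'kind' => 'commit',
  'commit' => { 'operation' => 'create', 'collection' => 'app.bsky.feed.like', 'record' => {} }
}

classify_event(event, tracked) # => :like
```

Events from accounts outside `tracked_dids` return `nil`, which mirrors the idea that only data from people who requested an import gets indexed.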

Lycan is written in Ruby, using Sinatra and ActiveRecord, with Postgres as the database. The official instance runs at [lycan.feeds.blue](https://lycan.feeds.blue) (this service only implements an XRPC API – the UI is implemented as part of [Skythread](https://skythread.mackuba.eu)).

The service consists of three separate components:

- a **firehose client**, which streams events from a relay/Jetstream and saves new data for the users whose data is being or has been imported
- a **background worker**, which runs the import process
- an **HTTP server**, which serves the XRPC endpoints (currently there are 3: `startImport`, `getImportStatus` and `searchPosts`, plus a `did.json`); all the endpoints require service authentication through PDS proxying
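Service authentication through PDS proxying means that a client doesn't call Lycan directly: it sends the XRPC request to the user's own PDS with an `atproto-proxy` header naming the target service, and the PDS forwards it with a signed service auth token. The sketch below only builds such a request. Every concrete value in it is a placeholder assumption rather than one of Lycan's real identifiers: the `com.example.lycan.searchPosts` NSID, the `did:web:lycan.example` service DID, the `lycan` service id, the `query` parameter and the `proxied_xrpc_request` helper itself.

```ruby
require 'uri'

# Builds the URL and headers for an XRPC call that the user's PDS
# should proxy to another service (identified by DID + service id).
def proxied_xrpc_request(pds_host:, nsid:, service_did:, service_id:, access_token:, params: {})
  uri = URI::HTTPS.build(host: pds_host, path: "/xrpc/#{nsid}")
  uri.query = URI.encode_www_form(params) unless params.empty?

  headers = {
    # The user's normal PDS session token (the PDS mints the service auth token).
    'Authorization' => "Bearer #{access_token}",
    # Tells the PDS which service to forward the request to.
    'atproto-proxy' => "#{service_did}##{service_id}"
  }

  [uri, headers]
end

# All values below are hypothetical placeholders.
uri, headers = proxied_xrpc_request(
  pds_host: 'pds.example.com',
  nsid: 'com.example.lycan.searchPosts',
  service_did: 'did:web:lycan.example',
  service_id: 'lycan',
  access_token: 'ACCESS_TOKEN',
  params: { query: 'wolf' }
)
```

The resulting `uri` and `headers` can then be passed to whichever HTTP client you prefer.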


## Setting up on localhost

This app should run on any somewhat recent version of Ruby, though of course it's recommended to run one that's still getting maintenance updates, ideally the latest one. It's also recommended to install it with [YJIT support](https://shopify.engineering/ruby-yjit-is-production-ready), and on Linux also with [jemalloc](https://scalingo.com/blog/improve-ruby-application-memory-jemalloc). You will probably need to have some familiarity with the Ruby ecosystem in order to set it up and run it.

A Postgres database is also required (again, any non-ancient version should work).

Download or clone the repository, then install the dependencies:

```
bundle install
```

Next, create the database. The configuration is defined in [`config/database.yml`](config/database.yml); for development, the database is named `lycan_development`. Create it either manually, or with a rake task:

```
bundle exec rake db:create
```

Then, run the migrations:

```
bundle exec rake db:migrate
```

To run an import, you will need to run three separate processes, probably in separate terminal tabs:

1) the firehose client, [`bin/firehose`](bin/firehose)
2) the background worker, [`bin/worker`](bin/worker)
3) the Sinatra HTTP server, [`bin/server`](bin/server)

The UI can be accessed through Skythread, either on the official site at [skythread.mackuba.eu](https://skythread.mackuba.eu), or through a copy you can download [from the repo](https://tangled.org/mackuba.eu/skythread). Log in and open "[Archive search](https://skythread.mackuba.eu/?page=search&mode=likes)" from the account menu – but importantly, to use the `localhost` Lycan instance, add `&lycan=local` to the URL.

You should then be able to start an import from there, and see the worker process printing some logs as it starts to download the data. (The firehose process needs to be running too, because the import job needs to pass through it first.)


## Configuration

There are a few things you can configure through ENV variables:

- `RELAY_HOST` – hostname of the relay to use for the firehose (default: `bsky.network`)
- `JETSTREAM_HOST` – alternatively, instead of `RELAY_HOST`, set this to the hostname of a [Jetstream](https://github.com/bluesky-social/jetstream) instance
- `FIREHOSE_USER_AGENT` – when running in production, it's recommended that you set this to some name that identifies who is running the service
- `APPVIEW_HOST` – hostname of the AppView used to download posts (default: `public.api.bsky.app`)
- `SERVER_HOSTNAME` – hostname of the server on which you're running the service in production
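As a rough sketch, these variables could be read in Ruby with the documented defaults like this. The `load_settings` helper and its hash keys are hypothetical (Lycan's own code may read `ENV` differently), and giving `JETSTREAM_HOST` priority over `RELAY_HOST` when both are set is an assumption.

```ruby
# Reads the settings from ENV (or any Hash-like object, which makes it easy to test).
def load_settings(env = ENV)
  # Treat an unset and an empty JETSTREAM_HOST the same way.
  jetstream = env['JETSTREAM_HOST'].to_s

  {
    firehose_source: jetstream.empty? ? :relay : :jetstream,
    firehose_host:   jetstream.empty? ? env.fetch('RELAY_HOST', 'bsky.network') : jetstream,
    user_agent:      env['FIREHOSE_USER_AGENT'],
    appview_host:    env.fetch('APPVIEW_HOST', 'public.api.bsky.app'),
    server_hostname: env['SERVER_HOSTNAME']
  }
end

# With nothing set, the defaults from the list above apply.
defaults = load_settings({})
defaults[:firehose_host] # => "bsky.network"
defaults[:appview_host]  # => "public.api.bsky.app"
```

Since Rake and the `bin/` scripts inherit the shell environment, a variable can be set for a single run by prefixing the command, e.g. `JETSTREAM_HOST=jetstream.example.com bin/firehose` (the hostname here is a placeholder).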


## Rake tasks

Some Rake tasks that might be useful:

```
bundle exec rake enqueue_user DID=did:plc:qweqwe
```

- request an import of the given account (to be handled by firehose + worker)

```
bundle exec rake import_user DID=did:plc:qweqwe COLLECTION=likes/reposts/posts/all
```

- run a complete import synchronously

```
bundle exec rake process_posts
```

- process all previously queued and unfinished or failed items


## Running in production

This will probably depend heavily on where and how you prefer to run it. I'm using a Capistrano deploy config in [`config/deploy.rb`](config/deploy.rb) to deploy to a VPS at [lycan.feeds.blue](https://lycan.feeds.blue). To use something like Docker or a service like Fly or Railway, you'll need to adapt the config for your specific setup.

On the server, you need to make sure that the firehose & worker processes are always running and are restarted if necessary. One way to do this (which I'm using) is to write a `systemd` service config file and add it to `/etc/systemd/system`. To run the HTTP server, you need Nginx/Caddy/Apache and a Ruby app server – my recommendation is Nginx with either Passenger (runs your app automatically from Nginx) or something like Puma (needs to be started by e.g. systemd, like the firehose).


## Credits

Copyright © 2025 Kuba Suder ([@mackuba.eu](https://bsky.app/profile/did:plc:oio4hkxaop4ao4wz2pp3f4cr)).

The code is available under the terms of the [zlib license](https://choosealicense.com/licenses/zlib/) (permissive, similar to MIT).

Bug reports and pull requests are welcome 😎