# Lycan 🐺

A service which downloads and indexes the Bluesky posts you've liked, reposted, quoted or bookmarked, and allows you to search in that archive.


## How it works

Lycan is kind of like a tiny specialized AppView, which only indexes some specific things from some specific people. To avoid having to keep a full-network AppView, it only indexes posts and likes on demand from people who request to use it. So the first time you want to use it, you need to ask it to run an import process, which can take anywhere from a few minutes to an hour, depending on how much data there is to download. After that, new likes are indexed live from the firehose.
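
As a rough illustration of the on-demand approach, the live indexing step boils down to dropping every firehose event that doesn't come from an account which has requested an import. The sketch below assumes Jetstream-style JSON events; the `relevant_event?` helper and the `TRACKED_COLLECTIONS` list are hypothetical names made up for this example, not Lycan's actual code.

```ruby
require 'json'
require 'set'

# Record collections that hold likes, reposts and posts (quotes are posts too).
TRACKED_COLLECTIONS = %w[
  app.bsky.feed.like
  app.bsky.feed.repost
  app.bsky.feed.post
].freeze

# Hypothetical helper: true if the event is a commit in a tracked collection
# made by one of the accounts that have requested an import.
def relevant_event?(event, imported_dids)
  return false unless event['kind'] == 'commit'

  collection = event.dig('commit', 'collection')
  imported_dids.include?(event['did']) && TRACKED_COLLECTIONS.include?(collection)
end

imported = Set['did:plc:qweqwe']

line = '{"did":"did:plc:qweqwe","kind":"commit",' \
       '"commit":{"operation":"create","collection":"app.bsky.feed.like","rkey":"abc"}}'

puts relevant_event?(JSON.parse(line), imported)
```

Events from any other account, or from collections outside the list, would simply be skipped, which is what keeps the index small compared to a full-network AppView.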

At the moment, Lycan indexes four types of content:

- posts you've liked
- posts you've reposted
- posts you've quoted
- your old-style bookmarks (using the 📌 emoji method)

New bookmarks are private data, so they can't be imported until support for OAuth is added.
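
For reference, here is a minimal sketch of how an old-style bookmark could be recognized. It assumes that the 📌 method means replying to a post with a message consisting of just the 📌 emoji; the helper names are hypothetical and Lycan's real matching rules may differ.

```ruby
PIN = "\u{1F4CC}" # the 📌 emoji

# Hypothetical helper: treats a post record as a bookmark if it's a reply
# whose whole text is the pin emoji (an assumption, see above).
def pin_bookmark?(record)
  record.key?('reply') && record['text'].to_s.strip == PIN
end

# Returns the AT URI of the bookmarked post, or nil if it's not a bookmark.
def bookmarked_uri(record)
  record.dig('reply', 'parent', 'uri') if pin_bookmark?(record)
end

record = {
  'text' => '📌',
  'reply' => { 'parent' => { 'uri' => 'at://did:plc:qweqwe/app.bsky.feed.post/abc' } }
}

puts bookmarked_uri(record)
```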

Lycan is written in Ruby, using Sinatra and ActiveRecord, with Postgres as the database. The official instance runs at [lycan.feeds.blue](https://lycan.feeds.blue) (this service only implements an XRPC API – the UI is implemented as part of [Skythread](https://skythread.mackuba.eu)).

The service consists of three separate components:

- a **firehose client**, which streams events from a relay/Jetstream and saves new data for the users whose data is being or has been imported
- a **background worker**, which runs the import process
- an **HTTP server**, which serves the XRPC endpoints (currently there are 3: `startImport`, `getImportStatus` and `searchPosts`, plus a `did.json`); all the endpoints require service authentication through PDS proxying
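
To give an idea of what "service authentication through PDS proxying" looks like from the client's side: the client sends the request to the user's own PDS with an `atproto-proxy` header, and the PDS forwards it to Lycan together with a signed service auth token. The sketch below only builds such a request, without sending it. The `blue.feeds.lycan.searchPosts` NSID, the `#lycan` service ID, the `did:web` identifier, the `collection` and `query` parameters and the PDS host are all assumptions made for illustration; check the source for the real names.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a proxied XRPC GET request (does not send it).
def proxied_search_request(pds_host:, access_token:, lycan_did:, query:)
  uri = URI("https://#{pds_host}/xrpc/blue.feeds.lycan.searchPosts") # assumed NSID
  uri.query = URI.encode_www_form(collection: 'likes', query: query)  # assumed params

  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{access_token}"  # the user's PDS session token
  request['atproto-proxy'] = "#{lycan_did}#lycan"      # assumed service ID
  request
end

req = proxied_search_request(
  pds_host: 'pds.example.com',           # placeholder PDS
  access_token: 'ACCESS_TOKEN',          # placeholder token
  lycan_did: 'did:web:lycan.feeds.blue', # assumed DID of the instance
  query: 'ruby'
)

puts req.path
puts req['atproto-proxy']
```

The point of this design is that Lycan never sees the user's credentials: it only receives a short-lived token issued by the PDS, which it can verify against the user's DID.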


## Setting up on localhost

This app should run on any reasonably recent version of Ruby, though it's recommended to use one that's still getting maintenance updates, ideally the latest one. It's also recommended to install Ruby with [YJIT support](https://shopify.engineering/ruby-yjit-is-production-ready), and on Linux also with [jemalloc](https://scalingo.com/blog/improve-ruby-application-memory-jemalloc). You will probably need some familiarity with the Ruby ecosystem in order to set it up and run it.

A Postgres database is also required (again, any non-ancient version should work).

Download or clone the repository, then install the dependencies:

```
bundle install
```

Next, create the database – the configuration is defined in [`config/database.yml`](config/database.yml), and the development database is named `lycan_development`. Create it either manually, or with a rake task:

```
bundle exec rake db:create
```

Then, run the migrations:

```
bundle exec rake db:migrate
```

To run an import, you will need to run three separate processes, probably in separate terminal tabs:

1) the firehose client, [`bin/firehose`](bin/firehose)
2) the background worker, [`bin/worker`](bin/worker)
3) the Sinatra HTTP server, [`bin/server`](bin/server)

The UI can be accessed through Skythread, either on the official site at [skythread.mackuba.eu](https://skythread.mackuba.eu), or in a copy you can download [from the repo](https://tangled.org/mackuba.eu/skythread). Log in and open "[Archive search](https://skythread.mackuba.eu/?page=search&mode=likes)" from the account menu – but importantly, to use the `localhost` Lycan instance, add `&lycan=local` to the URL.

You should then be able to start an import from there, and see the worker process printing some logs as it starts to download the data. (The firehose process needs to be running too, because the import job needs to pass through it first.)


## Configuration

There are a few things you can configure through ENV variables:

- `RELAY_HOST` – hostname of the relay to use for the firehose (default: `bsky.network`)
- `JETSTREAM_HOST` – alternatively, instead of `RELAY_HOST`, set this to the hostname of a [Jetstream](https://github.com/bluesky-social/jetstream) instance
- `FIREHOSE_USER_AGENT` – when running in production, it's recommended that you set this to some name that identifies who is running the service
- `APPVIEW_HOST` – hostname of the AppView used to download posts (default: `public.api.bsky.app`)
- `SERVER_HOSTNAME` – hostname of the server on which you're running the service in production
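
A minimal sketch of how these variables and their defaults could be read in Ruby. The `firehose_settings` helper is hypothetical, and so is the rule that `JETSTREAM_HOST` takes precedence when both hosts are set; the actual lookup code in Lycan may be organized differently.

```ruby
# Hypothetical helper: collects the settings listed above from ENV
# (or from any Hash passed in its place, which is handy for testing).
def firehose_settings(env = ENV)
  jetstream = env['JETSTREAM_HOST']

  # Assumption: a non-empty JETSTREAM_HOST wins over RELAY_HOST.
  source =
    if jetstream && !jetstream.empty?
      { type: :jetstream, host: jetstream }
    else
      { type: :relay, host: env.fetch('RELAY_HOST', 'bsky.network') }
    end

  {
    source: source,
    user_agent: env['FIREHOSE_USER_AGENT'],  # nil if not set
    appview_host: env.fetch('APPVIEW_HOST', 'public.api.bsky.app'),
    server_hostname: env['SERVER_HOSTNAME']  # nil if not set
  }
end

p firehose_settings
```

With none of the variables set, this falls back to the `bsky.network` relay and the `public.api.bsky.app` AppView, matching the defaults listed above.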


## Rake tasks

Some Rake tasks that might be useful:

```
bundle exec rake enqueue_user DID=did:plc:qweqwe
```

- request an import of the given account (to be handled by firehose + worker)

```
bundle exec rake import_user DID=did:plc:qweqwe COLLECTION=likes/reposts/posts/all
```

- run a complete import synchronously

```
bundle exec rake process_posts
```

- process all previously queued and unfinished or failed items


## Running in production

This will probably heavily depend on where and how you prefer to run it. I'm using a Capistrano deploy config in [`config/deploy.rb`](config/deploy.rb) to deploy to a VPS at [lycan.feeds.blue](https://lycan.feeds.blue). To use something like Docker or a service like Fly or Railway, you'll need to adapt the config for your specific setup.

On the server, you need to make sure that the firehose & worker processes are always running and are restarted if necessary. One way to do this (which I'm using) is to write a `systemd` service config file and add it to `/etc/systemd/system`. To run the HTTP server, you need Nginx/Caddy/Apache and a Ruby app server – my recommendation is Nginx with either Passenger (runs your app automatically from Nginx) or something like Puma (needs to be started by e.g. systemd, like the firehose).


## Credits

Copyright © 2025 Kuba Suder ([@mackuba.eu](https://bsky.app/profile/did:plc:oio4hkxaop4ao4wz2pp3f4cr)).

The code is available under the terms of the [zlib license](https://choosealicense.com/licenses/zlib/) (permissive, similar to MIT).

Bug reports and pull requests are welcome 😎