+16
-16
Gemfile.lock
···
7
GEM
8
remote: https://rubygems.org/
9
specs:
10
-
activemodel (7.2.2.2)
11
-
activesupport (= 7.2.2.2)
12
-
activerecord (7.2.2.2)
13
-
activemodel (= 7.2.2.2)
14
-
activesupport (= 7.2.2.2)
15
timeout (>= 0.4.0)
16
-
activesupport (7.2.2.2)
17
base64
18
benchmark (>= 0.3)
19
bigdecimal
···
41
concurrent-ruby (1.3.5)
42
connection_pool (2.5.4)
43
daemons (1.4.1)
44
-
date (3.4.1)
45
dotenv (3.1.8)
46
drb (2.2.3)
47
ed25519 (1.4.0)
48
-
erb (5.1.1)
49
eventmachine (1.2.7)
50
faye-websocket (0.12.0)
51
eventmachine (>= 0.12.0)
···
55
i18n (1.14.7)
56
concurrent-ruby (~> 1.0)
57
io-console (0.8.1)
58
-
irb (1.15.2)
59
pp (>= 0.6.0)
60
rdoc (>= 4.0.0)
61
reline (>= 0.4.2)
···
64
logger (1.7.0)
65
minisky (0.5.0)
66
base64 (~> 0.1)
67
-
minitest (5.26.0)
68
mustermann (3.0.4)
69
ruby2_keywords (~> 0.0.1)
70
net-scp (4.1.0)
···
87
psych (5.2.6)
88
date
89
stringio
90
-
rack (3.2.3)
91
rack-protection (4.2.1)
92
base64 (>= 0.1.0)
93
logger (>= 1.6.0)
···
98
rackup (2.2.1)
99
rack (>= 3)
100
rainbow (3.1.1)
101
-
rake (13.3.0)
102
-
rdoc (6.15.0)
103
erb
104
psych (>= 4.0.0)
105
tsort
106
-
reline (0.6.2)
107
io-console (~> 0.5)
108
ruby2_keywords (0.0.5)
109
securerandom (0.4.1)
···
123
cbor (~> 0.5, >= 0.5.9.6)
124
eventmachine (~> 1.2, >= 1.2.7)
125
faye-websocket (~> 0.12)
126
-
stringio (3.1.7)
127
thin (2.0.1)
128
daemons (~> 1.0, >= 1.0.9)
129
eventmachine (~> 1.0, >= 1.0.4)
130
logger
131
rack (>= 1, < 4)
132
tilt (2.6.1)
133
-
timeout (0.4.3)
134
tsort (0.2.0)
135
tzinfo (2.0.6)
136
concurrent-ruby (~> 1.0)
···
7
GEM
8
remote: https://rubygems.org/
9
specs:
10
+
activemodel (7.2.3)
11
+
activesupport (= 7.2.3)
12
+
activerecord (7.2.3)
13
+
activemodel (= 7.2.3)
14
+
activesupport (= 7.2.3)
15
timeout (>= 0.4.0)
16
+
activesupport (7.2.3)
17
base64
18
benchmark (>= 0.3)
19
bigdecimal
···
41
concurrent-ruby (1.3.5)
42
connection_pool (2.5.4)
43
daemons (1.4.1)
44
+
date (3.5.0)
45
dotenv (3.1.8)
46
drb (2.2.3)
47
ed25519 (1.4.0)
48
+
erb (6.0.0)
49
eventmachine (1.2.7)
50
faye-websocket (0.12.0)
51
eventmachine (>= 0.12.0)
···
55
i18n (1.14.7)
56
concurrent-ruby (~> 1.0)
57
io-console (0.8.1)
58
+
irb (1.15.3)
59
pp (>= 0.6.0)
60
rdoc (>= 4.0.0)
61
reline (>= 0.4.2)
···
64
logger (1.7.0)
65
minisky (0.5.0)
66
base64 (~> 0.1)
67
+
minitest (5.26.1)
68
mustermann (3.0.4)
69
ruby2_keywords (~> 0.0.1)
70
net-scp (4.1.0)
···
87
psych (5.2.6)
88
date
89
stringio
90
+
rack (3.2.4)
91
rack-protection (4.2.1)
92
base64 (>= 0.1.0)
93
logger (>= 1.6.0)
···
98
rackup (2.2.1)
99
rack (>= 3)
100
rainbow (3.1.1)
101
+
rake (13.3.1)
102
+
rdoc (6.15.1)
103
erb
104
psych (>= 4.0.0)
105
tsort
106
+
reline (0.6.3)
107
io-console (~> 0.5)
108
ruby2_keywords (0.0.5)
109
securerandom (0.4.1)
···
123
cbor (~> 0.5, >= 0.5.9.6)
124
eventmachine (~> 1.2, >= 1.2.7)
125
faye-websocket (~> 0.12)
126
+
stringio (3.1.8)
127
thin (2.0.1)
128
daemons (~> 1.0, >= 1.0.9)
129
eventmachine (~> 1.0, >= 1.0.4)
130
logger
131
rack (>= 1, < 4)
132
tilt (2.6.1)
133
+
timeout (0.4.4)
134
tsort (0.2.0)
135
tzinfo (2.0.6)
136
concurrent-ruby (~> 1.0)
+110
README.md
···
···
1
+
# Lycan 🐺
2
+
3
+
A service which downloads and indexes the Bluesky posts you've liked, reposted, quoted or bookmarked, and allows you to search in that archive.
4
+
5
+
6
+
## How it works
7
+
8
+
Lycan is kind of like a tiny specialized AppView, which only indexes some specific things from some specific people. To avoid having to keep a full-network AppView, it only indexes posts and likes on demand from people who request to use it. So the first time you want to use it, you need to ask it to run an import process, which can take anywhere between a few minutes and an hour, depending on how much data there is to download. After that, new likes are indexed live from the firehose.
9
+
10
+
At the moment, Lycan indexes four types of content:
11
+
12
+
- posts you've liked
13
+
- posts you've reposted
14
+
- posts you've quoted
15
+
- your old-style bookmarks (using the 📌 emoji method)
16
+
17
+
New bookmarks are private data, so at the moment they can't be imported until support for OAuth is added.
18
+
19
+
Lycan is written in Ruby, using Sinatra and ActiveRecord, with Postgres as the database. The official instance runs at [lycan.feeds.blue](https://lycan.feeds.blue) (this service only implements an XRPC API; the UI is implemented as part of [Skythread](https://skythread.mackuba.eu)).
20
+
21
+
The service consists of three separate components:
22
+
23
+
- a **firehose client**, which streams events from a relay/Jetstream and saves new data for the users whose data is/has been imported
24
+
- a **background worker**, which runs the import process
25
+
- an **HTTP server**, which serves the XRPC endpoints (currently there are 3: `startImport`, `getImportStatus` and `searchPosts`, plus a `did.json`); all the endpoints require service authentication through PDS proxying
26
+
27
+
28
+
## Setting up on localhost
29
+
30
+
This app should run on any somewhat recent version of Ruby, though of course it's recommended to run one that's still getting maintenance updates, ideally the latest one. It's also recommended to install it with [YJIT support](https://shopify.engineering/ruby-yjit-is-production-ready), and on Linux also with [jemalloc](https://scalingo.com/blog/improve-ruby-application-memory-jemalloc). You will probably need to have some familiarity with the Ruby ecosystem in order to set it up and run it.
31
+
32
+
A Postgres database is also required (again, any non-ancient version should work).
33
+
34
+
Download or clone the repository, then install the dependencies:
35
+
36
+
```
37
+
bundle install
38
+
```
39
+
40
+
Next, create the database. The configuration is defined in [`config/database.yml`](config/database.yml); for development the database is named `lycan_development`. Create it either manually, or with a rake task:
41
+
42
+
```
43
+
bundle exec rake db:create
44
+
```
45
+
46
+
Then, run the migrations:
47
+
48
+
```
49
+
bundle exec rake db:migrate
50
+
```
51
+
52
+
To run an import, you will need to run three separate processes, probably in separate terminal tabs:
53
+
54
+
1) the firehose client, [`bin/firehose`](bin/firehose)
55
+
2) the background worker, [`bin/worker`](bin/worker)
56
+
3) the Sinatra HTTP server, [`bin/server`](bin/server)
57
+
58
+
The UI can be accessed through Skythread, either on the official site on [skythread.mackuba.eu](https://skythread.mackuba.eu), or a copy you can download [from the repo](https://tangled.org/mackuba.eu/skythread). Log in and open "[Archive search](https://skythread.mackuba.eu/?page=search&mode=likes)" from the account menu. Importantly, to use the `localhost` Lycan instance, add `&lycan=local` to the URL.
59
+
60
+
You should then be able to start an import from there, and see the worker process printing some logs as it starts to download the data. (The firehose process needs to be running too, because the import job needs to pass through it first.)
61
+
62
+
63
+
## Configuration
64
+
65
+
There are a few things you can configure through ENV variables:
66
+
67
+
- `RELAY_HOST` - hostname of the relay to use for the firehose (default: `bsky.network`)
68
+
- `JETSTREAM_HOST` - alternatively, instead of `RELAY_HOST`, set this to the hostname of a [Jetstream](https://github.com/bluesky-social/jetstream) instance
69
+
- `FIREHOSE_USER_AGENT` - when running in production, it's recommended that you set this to some name that identifies who is running the service
70
+
- `APPVIEW_HOST` - hostname of the AppView used to download posts (default: `public.api.bsky.app`)
71
+
- `SERVER_HOSTNAME` - hostname of the server on which you're running the service in production
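The settings listed above could be read roughly as follows. This is a minimal sketch, not the project's actual code: `LycanConfig` and `load_config` are hypothetical names, and the rule that `JETSTREAM_HOST` wins over `RELAY_HOST` when both are set is an assumption. The defaults are the ones stated in the list, and the `lycan.feeds.blue` fallback is the one used in `app/server.rb`.

```ruby
# Hypothetical container for the documented ENV settings (not part of Lycan).
LycanConfig = Struct.new(:firehose_host, :firehose_kind, :user_agent,
                         :appview_host, :server_hostname, keyword_init: true)

# Accepts ENV or any Hash-like object, which keeps the helper easy to test.
def load_config(env = ENV)
  # Assumption: a Jetstream host, when set, takes precedence over the relay.
  if env['JETSTREAM_HOST']
    host = env['JETSTREAM_HOST']
    kind = :jetstream
  else
    host = env.fetch('RELAY_HOST', 'bsky.network')
    kind = :relay
  end

  LycanConfig.new(
    firehose_host: host,
    firehose_kind: kind,
    user_agent: env['FIREHOSE_USER_AGENT'], # nil when not configured
    appview_host: env.fetch('APPVIEW_HOST', 'public.api.bsky.app'),
    server_hostname: env.fetch('SERVER_HOSTNAME', 'lycan.feeds.blue')
  )
end
```

Calling `load_config` with no arguments reads the real environment; passing a Hash such as `load_config('JETSTREAM_HOST' => 'jetstream.example')` makes it possible to try out a combination without exporting anything.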
72
+
73
+
74
+
## Rake tasks
75
+
76
+
Some Rake tasks that might be useful:
77
+
78
+
```
79
+
bundle exec rake enqueue_user DID=did:plc:qweqwe
80
+
```
81
+
82
+
- request an import of the given account (to be handled by firehose + worker)
83
+
84
+
```
85
+
bundle exec rake import_user DID=did:plc:qweqwe COLLECTION=likes/reposts/posts/all
86
+
```
87
+
88
+
- run a complete import synchronously
89
+
90
+
```
91
+
bundle exec rake process_posts
92
+
```
93
+
94
+
- process all previously queued and unfinished or failed items
95
+
96
+
97
+
## Running in production
98
+
99
+
This will probably depend heavily on where and how you prefer to run it. I'm using a Capistrano deploy config in [`config/deploy.rb`](config/deploy.rb) to deploy to a VPS at [lycan.feeds.blue](https://lycan.feeds.blue). To use something like Docker or a service like Fly or Railway, you'll need to adapt the config for your specific setup.
100
+
101
+
On the server, you need to make sure that the firehose and worker processes are always running and are restarted if necessary. One way to do this (which I'm using) is to write a `systemd` service config file and add it to `/etc/systemd/system`. To run the HTTP server, you need Nginx/Caddy/Apache and a Ruby app server. My recommendation is Nginx with either Passenger (runs your app automatically from Nginx) or something like Puma (needs to be started by e.g. systemd, like the firehose).
102
+
103
+
104
+
## Credits
105
+
106
+
Copyright © 2025 Kuba Suder ([@mackuba.eu](https://bsky.app/profile/did:plc:oio4hkxaop4ao4wz2pp3f4cr)).
107
+
108
+
The code is available under the terms of the [zlib license](https://choosealicense.com/licenses/zlib/) (permissive, similar to MIT).
109
+
110
+
Bug reports and pull requests are welcome.
+1
-1
app/importers/base_importer.rb
+1
-1
app/importers/likes_importer.rb
+1
-1
app/importers/posts_importer.rb
+1
-1
app/importers/reposts_importer.rb
+26
app/models/import.rb
···
9
validates_uniqueness_of :collection, scope: :user_id
10
11
scope :unfinished, -> { where('(started_from IS NOT NULL) OR (last_completed IS NULL)') }
12
+
13
+
IMPORT_END = Time.at(0)
14
+
15
+
def imported_until
16
+
return nil if cursor.nil? && last_completed.nil?
17
+
18
+
groups = case collection
19
+
when 'likes'
20
+
[:likes]
21
+
when 'reposts'
22
+
[:reposts]
23
+
when 'posts'
24
+
[:pins, :quotes]
25
+
end
26
+
27
+
newest_queued_items = groups.map { |g| user.send(g).where(queue: :import).order(:time).last }
28
+
newest_queued = newest_queued_items.compact.sort_by(&:time).last
29
+
30
+
if newest_queued
31
+
newest_queued.time
32
+
elsif fetched_until
33
+
fetched_until
34
+
else
35
+
IMPORT_END
36
+
end
37
+
end
38
end
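The fallback order in `Import#imported_until` can be tried out without a database. The following is a minimal sketch under stated assumptions: `ImportSketch` is a hypothetical stand-in for the model, and `queued_times` is a plain array of `Time` values that replaces the `where(queue: :import)` queries on the user's likes, reposts, pins and quotes.

```ruby
# Same sentinel value as Import::IMPORT_END in the diff.
IMPORT_END = Time.at(0)

# Hypothetical stand-in for the ActiveRecord model (not part of Lycan).
ImportSketch = Struct.new(:cursor, :last_completed, :fetched_until,
                          :queued_times, keyword_init: true) do
  def imported_until
    # No cursor and no completed run: the import has not started yet.
    return nil if cursor.nil? && last_completed.nil?

    # Newest item still waiting in the import queue, if there is one.
    newest_queued = (queued_times || []).max

    # Same order as the original if/elsif/else chain.
    newest_queued || fetched_until || IMPORT_END
  end
end

t = Time.at(1_700_000_000)
in_progress = ImportSketch.new(cursor: 'abc', queued_times: [t - 60, t])
puts in_progress.imported_until == t # newest queued item wins
```

Using a `Time` sentinel instead of the old `:end` symbol means every non-nil return value has the same type, so callers can compare or subtract it without a special case for the symbol.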
+3
-16
app/models/user.rb
···
50
end
51
52
def imported_until
53
-
return nil unless self.imports.exists?
54
-
55
-
oldest_imported_items = []
56
-
started = false
57
58
-
[:likes, :reposts, :pins, :quotes].each do |group|
59
-
if self.send(group).where(queue: :import).exists?
60
-
oldest_imported_items << self.send(group).where(queue: nil).order(:time).first
61
-
end
62
-
end
63
-
64
-
earliest_oldest = oldest_imported_items.compact.sort_by(&:time).last
65
-
66
-
if earliest_oldest
67
-
earliest_oldest.time
68
-
elsif self.imports.merge(Import.unfinished).exists?
69
nil
70
else
71
-
:end
72
end
73
end
74
+5
-1
app/post_downloader.rb
+2
-2
app/server.rb
···
8
9
class Server < Sinatra::Application
10
register Sinatra::ActiveRecordExtension
11
-
set :port, 3000
12
13
PAGE_LIMIT = 25
14
HOSTNAME = ENV['SERVER_HOSTNAME'] || 'lycan.feeds.blue'
···
162
else
163
json_response(status: 'not_started')
164
end
165
-
when :end
166
json_response(status: 'finished')
167
else
168
progress = 1 - (until_date - user.registered_at) / (Time.now - user.registered_at)
···
8
9
class Server < Sinatra::Application
10
register Sinatra::ActiveRecordExtension
11
+
set :port, ENV['PORT'] || 3000
12
13
PAGE_LIMIT = 25
14
HOSTNAME = ENV['SERVER_HOSTNAME'] || 'lycan.feeds.blue'
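Environment variables are always strings, so `ENV['PORT'] || 3000` yields a String when `PORT` is set and an Integer otherwise. Whether that matters depends on the app server in use. The sketch below shows an optional hardening that always returns an Integer; it is not part of this change, and `server_port` is a hypothetical helper name.

```ruby
# Hypothetical helper: resolve the listening port as an Integer.
# Accepts ENV or any Hash-like object so it can be exercised in isolation.
def server_port(env = ENV, default: 3000)
  raw = env['PORT']

  # Treat an unset or blank variable as "use the default".
  return default if raw.nil? || raw.strip.empty?

  # Base 10 is forced so a value like "08" is not parsed as octal;
  # non-numeric values raise ArgumentError instead of being passed through.
  Integer(raw, 10)
end
```

With this in place, the setting could be written as `set :port, server_port`, and a typo in `PORT` would fail at boot rather than at bind time.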
···
162
else
163
json_response(status: 'not_started')
164
end
165
+
when Import::IMPORT_END
166
json_response(status: 'finished')
167
else
168
progress = 1 - (until_date - user.registered_at) / (Time.now - user.registered_at)
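Reading the formula above, progress is 0 when `until_date` equals the current time and 1 when it reaches the account's registration date, which suggests the import walks backwards through the account's history. The sketch below reproduces the branch on the sentinel and the progress calculation in plain Ruby. `import_status` is a hypothetical helper, the `'in_progress'` status name and the clamping are assumptions, and the real endpoint has more branches for the `nil` case than shown here.

```ruby
# Same sentinel value as Import::IMPORT_END in the diff.
IMPORT_END = Time.at(0)

# Hypothetical helper mirroring the shape of the status endpoint's case statement.
def import_status(until_date, registered_at, now: Time.now)
  case until_date
  when nil
    { status: 'not_started' }
  when IMPORT_END
    # `when` uses ===, which for Time falls back to ==, so any Time equal
    # to the sentinel matches here.
    { status: 'finished' }
  else
    # Fraction of the account's lifetime that has already been covered.
    progress = 1 - (until_date - registered_at) / (now - registered_at)
    # Assumption: clamp in case until_date falls outside the expected range.
    { status: 'in_progress', progress: progress.clamp(0.0, 1.0) }
  end
end

registered = Time.at(1_000)
now = Time.at(2_000)
puts import_status(Time.at(1_500), registered, now: now)[:progress] # 0.5
```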
+5
db/migrate/20251027134657_add_fetched_until_to_imports.rb
+2
-1
db/schema.rb
···
10
#
11
# It's strongly recommended that you check this file into your version control system.
12
13
-
ActiveRecord::Schema[7.2].define(version: 2025_09_23_180153) do
14
# These are extensions that must be enabled in order to support this database
15
enable_extension "plpgsql"
16
···
25
t.datetime "started_from"
26
t.datetime "last_completed"
27
t.string "collection", limit: 20, null: false
28
t.index ["user_id", "collection"], name: "index_imports_on_user_id_and_collection", unique: true
29
end
30
···
10
#
11
# It's strongly recommended that you check this file into your version control system.
12
13
+
ActiveRecord::Schema[7.2].define(version: 2025_10_27_134657) do
14
# These are extensions that must be enabled in order to support this database
15
enable_extension "plpgsql"
16
···
25
t.datetime "started_from"
26
t.datetime "last_completed"
27
t.string "collection", limit: 20, null: false
28
+
t.datetime "fetched_until"
29
t.index ["user_id", "collection"], name: "index_imports_on_user_id_and_collection", unique: true
30
end
31