Skyfall#
🌤 A Ruby gem for streaming data from the Bluesky/AtProto firehose 🦋
What does it do#
Skyfall is a Ruby library for connecting to the "firehose" of the Bluesky social network, i.e. a websocket that streams all new posts and everything else happening on the Bluesky network in real time. The code connects to the websocket endpoint, decodes the messages, which are encoded in binary formats such as DAG-CBOR, and returns the data as Ruby objects, which you can filter and save to a database (e.g. in order to create a custom feed).
Installation#
gem install skyfall
Usage#
Start a connection to the firehose by creating a Skyfall::Stream object, passing the server hostname and endpoint name:
require 'skyfall'
sky = Skyfall::Stream.new('bsky.social', :subscribe_repos)
Add event listeners to handle incoming messages and get notified of errors:
sky.on_connect { puts "Connected" }
sky.on_disconnect { puts "Disconnected" }
sky.on_message { |m| p m }
sky.on_error { |e| puts "ERROR: #{e}" }
When you're ready, open the connection by calling connect:
sky.connect
Processing messages#
Each message passed to on_message is an instance of the WebsocketMessage class and has the following properties:
- type (symbol) - usually :commit
- seq (sequential number)
- time (Time)
- repo (string) - DID of the repository (user account)
- commit - CID of the commit
- prev - CID of the previous commit in that repo
- operations - list of operations (usually one)
Operations are objects of type Operation and have the following properties:
- repo (string) - DID of the repository (user account)
- collection (string) - name of the relevant collection in the repository, e.g. app.bsky.feed.post for posts
- path (string) - the path part of the at:// URI - collection name + ID (rkey) of the item
- action (symbol) - :create, :update or :delete
- uri (string) - the at:// URI
- type (symbol) - short name of the collection, e.g. :bsky_post
- cid - CID of the operation/record (nil for delete operations)
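To make the relationship between repo, collection, path and uri more concrete, here is a minimal sketch that splits an at:// URI into those parts. It does not use Skyfall itself: split_at_uri is a hypothetical helper written for illustration, and the DID and rkey in the example are made-up placeholders.

```ruby
# Hypothetical helper (not part of Skyfall): splits an at:// URI into the
# parts that an operation exposes as repo, collection and path.
def split_at_uri(uri)
  prefix = 'at://'
  raise ArgumentError, "not an at:// URI: #{uri}" unless uri.start_with?(prefix)

  # Expected layout: at://<repo DID>/<collection>/<rkey>
  repo, collection, rkey = uri.delete_prefix(prefix).split('/', 3)

  if [repo, collection, rkey].any? { |part| part.nil? || part.empty? }
    raise ArgumentError, "incomplete at:// URI: #{uri}"
  end

  # The path is the collection name plus the record ID (rkey)
  { repo: repo, collection: collection, rkey: rkey, path: "#{collection}/#{rkey}" }
end

# Made-up example values, for illustration only
parts = split_at_uri('at://did:plc:example123/app.bsky.feed.post/3jzabc')
puts parts[:repo]        # did:plc:example123
puts parts[:collection]  # app.bsky.feed.post
puts parts[:path]        # app.bsky.feed.post/3jzabc
```

In practice you would normally read op.repo, op.collection and op.path directly; the sketch only shows how they fit together inside op.uri.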
Create and update operations will also have an attached record (JSON object) with details of the post, such as its text. The record data is currently available as a Ruby hash via the raw_record property (custom types will be added in a later version).
So for example, in order to filter only "create post" operations and print their details, you can do something like this:
sky.on_message do |m|
  # skip anything that isn't a repository commit
  next if m.type != :commit

  m.operations.each do |op|
    # only handle newly created posts
    next unless op.action == :create && op.type == :bsky_post

    puts "#{op.repo}:"
    puts op.raw_record['text']
    puts
  end
end
See the complete example in example/firehose.rb.
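If you want to save matching posts for a custom feed rather than print them, the handler will probably need to react to :update and :delete operations as well as :create. The sketch below is one possible shape for that, under stated assumptions: FeedIndex is a hypothetical class that keeps posts in an in-memory Hash instead of a real database, and FakeOperation and FakeMessage are stand-in Structs (not Skyfall classes) that mirror the properties listed above so the example runs without a network connection.

```ruby
# Stand-ins mirroring the properties described above, so the sketch runs
# without a firehose connection. These are NOT Skyfall classes.
FakeOperation = Struct.new(:repo, :action, :type, :uri, :raw_record, keyword_init: true)
FakeMessage = Struct.new(:type, :operations, keyword_init: true)

# Hypothetical in-memory store; a real feed would use a database instead.
class FeedIndex
  attr_reader :posts

  def initialize(keyword)
    @keyword = keyword.downcase
    @posts = {} # at:// URI => post text
  end

  def handle(message)
    return unless message.type == :commit

    message.operations.each do |op|
      next unless op.type == :bsky_post

      case op.action
      when :create, :update
        text = op.raw_record['text'].to_s
        if text.downcase.include?(@keyword)
          @posts[op.uri] = text
        else
          @posts.delete(op.uri) # an updated post may no longer match
        end
      when :delete
        @posts.delete(op.uri) # no record is expected for delete operations
      end
    end
  end
end

# Made-up sample data, for illustration only
uri = 'at://did:plc:example123/app.bsky.feed.post/3jzabc'
index = FeedIndex.new('ruby')

created = FakeOperation.new(repo: 'did:plc:example123', action: :create, type: :bsky_post,
                            uri: uri, raw_record: { 'text' => 'Learning Ruby today' })
index.handle(FakeMessage.new(type: :commit, operations: [created]))
puts index.posts.size # 1

deleted = FakeOperation.new(repo: 'did:plc:example123', action: :delete, type: :bsky_post,
                            uri: uri, raw_record: nil)
index.handle(FakeMessage.new(type: :commit, operations: [deleted]))
puts index.posts.size # 0
```

With a real stream, the same object could be wired in with something like sky.on_message { |m| index.handle(m) }, since it only relies on the type, operations, action, uri and raw_record properties described earlier.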
Credits#
Copyright © 2023 Kuba Suder (@mackuba.eu).
The code is available under the terms of the zlib license (permissive, similar to MIT).
Bug reports and pull requests are welcome 😎