Trap

Traverse records received from a Tap service and dump into a PostgreSQL database.

Example Usage

In this example we'll tap into every collection under the "sh.tangled.*" NSID namespace, starting from the @tangled.org AT Protocol repo.

1. Set up a PostgreSQL cluster and create a database

...

Let's assume you've created a DB called trap_tangled.

2. Tap

TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run

Trap collects every record the Tap service sends. Use the TAP_COLLECTION_FILTERS environment variable to control which records are collected.
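To make the filter pattern concrete, here is a minimal Rust sketch of how a wildcard such as "sh.tangled.*" could be matched against a collection NSID. The matching rules are assumptions for illustration (a trailing ".*" selects everything under that prefix, anything else must match exactly, and several filters are comma-separated); they are not taken from Tap's source, and the function names `collection_matches` and `any_filter_matches` are hypothetical.

```rust
/// Returns true if `nsid` is selected by a single `filter`.
///
/// Assumed semantics (not confirmed against Tap): a filter ending in ".*"
/// matches any NSID below that prefix; any other filter must match exactly.
fn collection_matches(filter: &str, nsid: &str) -> bool {
    match filter.strip_suffix(".*") {
        // "sh.tangled.*" leaves the prefix "sh.tangled"; the NSID must
        // continue with a '.' followed by at least one more character.
        Some(prefix) => nsid
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('.') && rest.len() > 1),
        None => filter == nsid,
    }
}

/// Checks an NSID against a comma-separated list of filters
/// (the list format is an assumption as well).
fn any_filter_matches(filters: &str, nsid: &str) -> bool {
    filters
        .split(',')
        .map(str::trim)
        .filter(|f| !f.is_empty())
        .any(|f| collection_matches(f, nsid))
}
```

Under these assumptions, "sh.tangled.*" selects "sh.tangled.repo.pull" but not "app.bsky.feed.post".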

3. Trap

Run trap, seeding from the DID of @tangled.org:

RUST_LOG=debug,sqlx=warn TRAP_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli

Trap submits the seed DID to the Tap service. It then scans each record returned by Tap and adds any DIDs it finds to the Tap service as well.
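The discovery loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not trap's actual implementation: it scans the raw record text for `did:plc:` and `did:web:` identifiers instead of walking parsed JSON, and the names `find_dids` and `Frontier` are hypothetical.

```rust
use std::collections::HashSet;

/// Characters that may appear in a DID after the "did:" prefix
/// (letters, digits, '.', '-', '_', ':' and '%' for percent-encoding).
fn is_did_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_' | ':' | '%')
}

/// Extracts every substring that looks like a did:plc or did:web identifier.
/// Simplification: a "did:" that happens to sit inside a longer word is not
/// filtered out.
fn find_dids(text: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find("did:") {
        let candidate = &rest[start..];
        // The identifier ends at the first character a DID cannot contain,
        // such as a closing quote or the '/' of an at:// URI.
        let end = candidate
            .find(|c: char| !is_did_char(c))
            .unwrap_or(candidate.len());
        // Drop trailing punctuation picked up from surrounding prose.
        let did = candidate[..end].trim_end_matches(|c: char| c == ':' || c == '.');
        if did.starts_with("did:plc:") || did.starts_with("did:web:") {
            found.push(did.to_string());
        }
        // `end` is at least 4 because "did:" itself consists of DID characters.
        rest = &candidate[end..];
    }
    found
}

/// Remembers which DIDs were already submitted, so each one is sent only once.
struct Frontier {
    seen: HashSet<String>,
}

impl Frontier {
    fn new() -> Self {
        Frontier { seen: HashSet::new() }
    }

    /// Returns the DIDs in `record_json` that have not been seen before,
    /// marking them as seen.
    fn discover(&mut self, record_json: &str) -> Vec<String> {
        find_dids(record_json)
            .into_iter()
            .filter(|did| self.seen.insert(did.clone()))
            .collect()
    }
}
```

In this sketch the seed DID would be passed through `discover` first, and whatever later calls return is what would be submitted to the Tap service next. Because every DID is returned at most once, the crawl stops growing once no new identifiers turn up.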

4. Wait...

Eventually, and I mean eventually, you'll end up with a table named record filled with every "sh.tangled.*" record reachable from the @tangled.org repo.

5. Perform Data Science

Time to jump into psql!

The record_by_collection view counts how many records have been indexed for each collection:

trap_tangled=# select * from record_by_collection ;
          collection           | count
-------------------------------+-------
 sh.tangled.feed.star          |  6572
 sh.tangled.graph.follow       |  5530
 sh.tangled.spindle.member     |  4982
 sh.tangled.knot.member        |  3798
 sh.tangled.repo               |  3617
 sh.tangled.repo.pull          |  2006
 sh.tangled.publicKey          |  1960
 sh.tangled.repo.issue         |  1698
 sh.tangled.repo.issue.comment |  1626
 sh.tangled.repo.pull.comment  |  1208
 sh.tangled.actor.profile      |  1178
 sh.tangled.label.op           |   691
 sh.tangled.feed.reaction      |   600
 sh.tangled.string             |   498
 sh.tangled.repo.issue.state   |   345
 sh.tangled.knot               |   251
 sh.tangled.repo.collaborator  |   244
 sh.tangled.label.definition   |   133
 sh.tangled.repo.artifact      |    72
 sh.tangled.spindle            |    71
(20 rows)

trap_tangled=#  

Analyse SSH public-key statistics:

trap_tangled=# SELECT split_part(data->>'key', ' ', 1) AS key_type,
    count(*) AS count
   FROM record
  WHERE collection = 'sh.tangled.publicKey'
  GROUP BY (split_part(data->>'key', ' ', 1))
  ORDER BY (count(*)) DESC;
              key_type              | count
------------------------------------+-------
 ssh-ed25519                        |  1528
 ssh-rsa                            |   348
 sk-ssh-ed25519@openssh.com         |    49
 ecdsa-sha2-nistp256                |    28
 ecdsa-sha2-nistp521                |     4
 sh-ed25519                         |     2
 sk-ecdsa-sha2-nistp256@openssh.com |     1
(7 rows)

trap_tangled=#

Fascinating!
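If you would rather do this kind of tally in Rust after pulling the keys out of the database, the key-type query maps onto a few lines of standard-library code. This is only a sketch: `key_type` and `tally_key_types` are hypothetical names, and fetching the rows from PostgreSQL is left out.

```rust
use std::collections::HashMap;

/// Mirrors `split_part(key, ' ', 1)`: the text before the first space,
/// or the whole string if it contains no space.
fn key_type(key: &str) -> &str {
    key.split(' ').next().unwrap_or("")
}

/// Counts keys per type, sorted by descending count (ties broken by name).
fn tally_key_types<'a>(keys: &[&'a str]) -> Vec<(&'a str, usize)> {
    let mut counts: HashMap<&'a str, usize> = HashMap::new();
    for &key in keys {
        *counts.entry(key_type(key)).or_insert(0) += 1;
    }
    let mut rows: Vec<_> = counts.into_iter().collect();
    rows.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    rows
}
```

Each input string is expected to be the value of the record's `key` field, for example a line in the usual `<type> <base64> <comment>` public-key format.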