# Trap

Traverse records received from a Tap service and dump them into a PostgreSQL database.

## Example Usage

In this example we'll *tap into* everything in the "sh.tangled.*" NSID starting from the @tangled.org AT repo.

### 1. Set up a PostgreSQL cluster and create a database

```bash
...
```

Let's assume you've created a DB called `trap_tangled`.

### 2. Tap

```bash
TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run
```

Trap will collect *any* records the Tap service sends. Control the records you want to collect with the `TAP_COLLECTION_FILTERS` variable.

### 3. Trap

Run `trap`, seeding from the DID of @tangled.org:

```bash
RUST_LOG=debug,sqlx=warn TRAP_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli
```

Trap will submit the seed DID to the Tap service. Each record returned by Tap is scanned, and any DIDs found will also be added to the Tap service.

### 4. Wait...

*Eventually*, and I mean *eventually*, you'll end up with a table named `record` filled with every "sh.tangled.*" record reachable from the @tangled.org repo.

### 5. Perform *Data Science*

Time to jump into psql!

The `record_by_collection` view counts how many records have been indexed for each collection:

```
trap_tangled=# select * from record_by_collection ;
          collection           | count
-------------------------------+-------
 sh.tangled.feed.star          |  6572
 sh.tangled.graph.follow       |  5530
 sh.tangled.spindle.member     |  4982
 sh.tangled.knot.member        |  3798
 sh.tangled.repo               |  3617
 sh.tangled.repo.pull          |  2006
 sh.tangled.publicKey          |  1960
 sh.tangled.repo.issue         |  1698
 sh.tangled.repo.issue.comment |  1626
 sh.tangled.repo.pull.comment  |  1208
 sh.tangled.actor.profile      |  1178
 sh.tangled.label.op           |   691
 sh.tangled.feed.reaction      |   600
 sh.tangled.string             |   498
 sh.tangled.repo.issue.state   |   345
 sh.tangled.knot               |   251
 sh.tangled.repo.collaborator  |   244
 sh.tangled.label.definition   |   133
 sh.tangled.repo.artifact      |    72
 sh.tangled.spindle            |    71
(20 rows)

trap_tangled=#
```

Analyse SSH public-key statistics:

```
trap_tangled=# SELECT
    split_part(data->>'key', ' ', 1) AS key_type,
    count(*) AS count
FROM record
WHERE collection = 'sh.tangled.publicKey'
GROUP BY (split_part(data->>'key', ' ', 1))
ORDER BY (count(*)) DESC;
              key_type              | count
------------------------------------+-------
 ssh-ed25519                        |  1528
 ssh-rsa                            |   348
 sk-ssh-ed25519@openssh.com         |    49
 ecdsa-sha2-nistp256                |    28
 ecdsa-sha2-nistp521                |     4
 sh-ed25519                         |     2
 sk-ecdsa-sha2-nistp256@openssh.com |     1
(7 rows)

trap_tangled=#
```

Fascinating!
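
For readers curious how the crawl in step 3 reaches so many repos, here is a rough Python sketch of the "scan each record for DIDs and follow them" idea. It is not Trap's actual code: the `DID_RE` pattern, the `find_dids` and `crawl` helpers, and the `fetch_records` callback (standing in for whatever Tap returns for a DID) are all hypothetical names made up for illustration, and Trap's real matching rules may well differ.

```python
import re
from collections import deque
from typing import Any, Callable, Iterable, Iterator

# Loose pattern for DIDs embedded in strings such as "at://did:plc:.../...".
# Assumption for illustration only; Trap may match DIDs differently.
DID_RE = re.compile(r"did:[a-z]+:[A-Za-z0-9._:%-]*[A-Za-z0-9._-]")


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string value nested anywhere inside a JSON-like record."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def find_dids(record: Any) -> set:
    """Return the set of DID-looking substrings found in a record."""
    return {did for text in iter_strings(record) for did in DID_RE.findall(text)}


def crawl(seed: str, fetch_records: Callable[[str], Iterable[Any]]) -> set:
    """Breadth-first walk: start at the seed, follow every newly seen DID once."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        did = queue.popleft()
        for record in fetch_records(did):  # hypothetical stand-in for Tap
            for found in find_dids(record):
                if found not in seen:
                    seen.add(found)
                    queue.append(found)
    return seen
```

The `seen` set is what keeps a walk like this from looping forever when two repos reference each other, which follows and stars tend to do. It also hints at why step 4 takes a while: every new DID can surface more DIDs.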