# Trap

Traverse records received from a Tap service and dump them into a PostgreSQL database.

## Example Usage

In this example we'll *tap into* everything in the "sh.tangled.*" NSID starting from the @tangled.org AT repo.

### 1. Set up a PostgreSQL cluster and create a database

```bash
...
```

Let's assume you've created a DB called `trap_tangled`.

### 2. Tap

```bash
TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run
```

Trap will collect *any* records the Tap service sends. Control the records you want to collect with the `TAP_COLLECTION_FILTERS` variable.

### 3. Trap

Run `trap`, seeding from the DID of @tangled.org:

```bash
RUST_LOG=debug,sqlx=warn TRAP_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli
```

Trap will submit the seed DID to the Tap service. Each record returned by Tap is scanned, and any DIDs found will also be added to the Tap service.

### 4. Wait...

*Eventually*, and I mean *eventually*, you'll end up with a table named `record` filled with every "sh.tangled.*" record reachable from the @tangled.org repo.

### 5. Perform *Data Science*

Time to jump into psql!

The `record_by_collection` view counts how many records have been indexed for each collection:

```
trap_tangled=# select * from record_by_collection ;
          collection           | count
-------------------------------+-------
 sh.tangled.feed.star          |  6572
 sh.tangled.graph.follow       |  5530
 sh.tangled.spindle.member     |  4982
 sh.tangled.knot.member        |  3798
 sh.tangled.repo               |  3617
 sh.tangled.repo.pull          |  2006
 sh.tangled.publicKey          |  1960
 sh.tangled.repo.issue         |  1698
 sh.tangled.repo.issue.comment |  1626
 sh.tangled.repo.pull.comment  |  1208
 sh.tangled.actor.profile      |  1178
 sh.tangled.label.op           |   691
 sh.tangled.feed.reaction      |   600
 sh.tangled.string             |   498
 sh.tangled.repo.issue.state   |   345
 sh.tangled.knot               |   251
 sh.tangled.repo.collaborator  |   244
 sh.tangled.label.definition   |   133
 sh.tangled.repo.artifact      |    72
 sh.tangled.spindle            |    71
(20 rows)

trap_tangled=#
```

Analyse SSH public-key statistics:

```
trap_tangled=# SELECT split_part(data->>'key', ' ', 1) AS key_type,
       count(*) AS count
  FROM record
 WHERE collection = 'sh.tangled.publicKey'
 GROUP BY (split_part(data->>'key', ' ', 1))
 ORDER BY (count(*)) DESC;
              key_type              | count
------------------------------------+-------
 ssh-ed25519                        |  1528
 ssh-rsa                            |   348
 sk-ssh-ed25519@openssh.com         |    49
 ecdsa-sha2-nistp256                |    28
 ecdsa-sha2-nistp521                |     4
 sh-ed25519                         |     2
 sk-ecdsa-sha2-nistp256@openssh.com |     1
(7 rows)

trap_tangled=#
```

Fascinating!
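The key-type query above groups on the first space-delimited field of each key, which is what `split_part(data->>'key', ' ', 1)` returns. If you'd rather do the same tally outside the database, here's a rough Python sketch. It assumes the `data` column has been exported as a list of dicts with a `key` field; the helper names `key_type` and `count_key_types` and the sample keys are made up for illustration and are not part of Trap.

```python
from collections import Counter


def key_type(public_key: str) -> str:
    """Return the text before the first space, like split_part(key, ' ', 1)."""
    return public_key.split(" ", 1)[0]


def count_key_types(records):
    """Tally key types across records, most common first.

    `records` is assumed to be an iterable of dicts shaped like the JSON in
    the `data` column of sh.tangled.publicKey rows (hypothetical export).
    Records without a `key` field are skipped.
    """
    counts = Counter(key_type(r["key"]) for r in records if "key" in r)
    return counts.most_common()


# Placeholder records; the key material is fake and only the prefix matters.
SAMPLE_RECORDS = [
    {"key": "ssh-ed25519 AAAAexample1 alice@laptop"},
    {"key": "ssh-rsa AAAAexample2 bob@desktop"},
    {"key": "ssh-ed25519 AAAAexample3"},
    {"name": "record without a key field"},
]

if __name__ == "__main__":
    for name, count in count_key_types(SAMPLE_RECORDS):
        print(f"{name:<36} {count:>5}")
```

Like the SQL version, this does no validation, so a malformed prefix shows up as its own row rather than being folded into the type it was probably meant to be.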