doc: adjust readme

Signed-off-by: tjh <x@tjh.dev>

tjh.dev 5c850e26 fd713192

verified
Changed files
+70 -68
README.md
@@ -4,95 +4,97 @@
 
 ## Example Usage
 
-In this example we'll *tap into* (😉) everything in the "sh.tangled.*" NSID starting from the @tangled.org repo (ATproto repo, not git repo).
+In this example we'll *tap into* everything in the "sh.tangled.*" NSID starting from the @tangled.org AT repo.
 
-1. Setup a PostgreSQL cluster and create a database
+### 1. Setup a PostgreSQL cluster and create a database
 
-   ...
+```bash
+...
+```
 
-   Let's assume you've created a DB called `trap_tangled`.
+Let's assume you've created a DB called `trap_tangled`.
 
-2. Tap
+### 2. Tap
 
-   ```bash
-   TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run
-   ```
+```bash
+TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run
+```
 
-   `trap` will collect *any* records the Tap service sends. You can control this with the `TAP_COLLECTION_FILTERS` variable.
+Trap will collect *any* records the Tap service sends. Control the records you want to collect with the `TAP_COLLECTION_FILTERS` variable.
 
-3. Trap
+### 3. Trap
 
-   Run `trap`, seeding from the DID of @tangled.org:
+Run `trap`, seeding from the DID of @tangled.org:
 
-   ```bash
-   RUST_LOG=debug,sqlx=warn INDEX_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli
-   ```
+```bash
+RUST_LOG=debug,sqlx=warn TRAP_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli
+```
 
-   `trap` will submit the seed DIDs to the Tap service. Each record return by Tap will be scanned, and any DIDs found will also be added to the Tap service.
+Trap will submit the seed DID to the Tap service. Each record return by Tap will be scanned, and any DIDs found will also be added to the Tap service.
 
-4. Wait.
+### 4. Wait...
 
-   *Eventually*, and I mean *eventually*, you'll end up with a table named `record` filled with every "sh.tangled.*" record reachable from the @tangled.org repo.
+*Eventually*, and I mean *eventually*, you'll end up with a table named `record` filled with every "sh.tangled.*" record reachable from the @tangled.org repo.
 
-5. Perform *Data Science*
+### 5. Perform *Data Science*
 
-   Time to jump into `psql`!
+Time to jump into psql!
 
-   The `record_by_collection` view counts how many records have been indexed for each collection.
+The `record_by_collection` view counts how many records have been indexed for each collection:
 
-   ```
-   trap_tangled=# select * from record_by_collection ;
-             collection           | count
-   -------------------------------+-------
-    sh.tangled.feed.star          |  5350
-    sh.tangled.spindle.member     |  4821
-    sh.tangled.graph.follow       |  4425
-    sh.tangled.knot.member        |  3607
-    sh.tangled.repo               |  2618
-    sh.tangled.repo.pull          |  1785
-    sh.tangled.repo.issue         |  1390
-    sh.tangled.repo.issue.comment |  1386
-    sh.tangled.publicKey          |  1298
-    sh.tangled.repo.pull.comment  |  1127
-    sh.tangled.actor.profile      |   713
-    sh.tangled.label.op           |   628
-    sh.tangled.feed.reaction      |   479
-    sh.tangled.string             |   364
-    sh.tangled.repo.issue.state   |   320
-    sh.tangled.knot               |   158
-    sh.tangled.repo.collaborator  |   146
-    sh.tangled.label.definition   |   106
-    sh.tangled.repo.artifact      |    69
-    sh.tangled.spindle            |    51
-   (20 rows)
+```
+trap_tangled=# select * from record_by_collection ;
+          collection           | count
+-------------------------------+-------
+ sh.tangled.feed.star          |  5350
+ sh.tangled.spindle.member     |  4821
+ sh.tangled.graph.follow       |  4425
+ sh.tangled.knot.member        |  3607
+ sh.tangled.repo               |  2618
+ sh.tangled.repo.pull          |  1785
+ sh.tangled.repo.issue         |  1390
+ sh.tangled.repo.issue.comment |  1386
+ sh.tangled.publicKey          |  1298
+ sh.tangled.repo.pull.comment  |  1127
+ sh.tangled.actor.profile      |   713
+ sh.tangled.label.op           |   628
+ sh.tangled.feed.reaction      |   479
+ sh.tangled.string             |   364
+ sh.tangled.repo.issue.state   |   320
+ sh.tangled.knot               |   158
+ sh.tangled.repo.collaborator  |   146
+ sh.tangled.label.definition   |   106
+ sh.tangled.repo.artifact      |    69
+ sh.tangled.spindle            |    51
+(20 rows)
 
-   trap_tangled=#
-   ```
+trap_tangled=#
+```
 
-   Analyse SSH public-key statistics:
+Analyse SSH public-key statistics:
 
-   ```
-   trap_tangled=# SELECT split_part(data->>'key', ' ', 1) AS key_type,
-   count(*) AS count
-   FROM record
-   WHERE collection = 'sh.tangled.publicKey'
-   GROUP BY (split_part(data->>'key', ' ', 1))
-   ORDER BY (count(*)) DESC;
-                 key_type              | count
-   ------------------------------------+-------
-    ssh-ed25519                        |   989
-    ssh-rsa                            |   239
-    sk-ssh-ed25519@openssh.com         |    44
-    ecdsa-sha2-nistp256                |    22
-    sh-ed25519                         |     2
-    sk-ecdsa-sha2-nistp256@openssh.com |     1
-    ecdsa-sha2-nistp521                |     1
-   (7 rows)
+```
+trap_tangled=# SELECT split_part(data->>'key', ' ', 1) AS key_type,
+count(*) AS count
+FROM record
+WHERE collection = 'sh.tangled.publicKey'
+GROUP BY (split_part(data->>'key', ' ', 1))
+ORDER BY (count(*)) DESC;
+              key_type              | count
+------------------------------------+-------
+ ssh-ed25519                        |   989
+ ssh-rsa                            |   239
+ sk-ssh-ed25519@openssh.com         |    44
+ ecdsa-sha2-nistp256                |    22
+ sh-ed25519                         |     2
+ sk-ecdsa-sha2-nistp256@openssh.com |     1
+ ecdsa-sha2-nistp521                |     1
+(7 rows)
 
-   trap_tangled=#
-   ```
+trap_tangled=#
+```
 
-   Fascinating!
+Fascinating!
 
 ## Future work
 
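
The crawl described in step 3 of the README (scan each record returned by Tap and hand any DIDs found back to Tap) can be sketched roughly as below. This is an illustrative Python sketch, not trap's actual implementation: the `did:plc` pattern, the helper names `find_dids` and `crawl_step`, and the record field names in the usage note are all assumptions made for the example.

```python
import re
from typing import Any, Iterator, List, Set

# Assumption: only did:plc identifiers (24 base32 characters) are matched here;
# the real tool may recognise other DID methods as well.
DID_PATTERN = re.compile(r"did:plc:[a-z2-7]{24}")


def find_dids(value: Any) -> Iterator[str]:
    """Yield every DID-looking substring inside a decoded JSON value."""
    if isinstance(value, str):
        yield from DID_PATTERN.findall(value)
    elif isinstance(value, dict):
        for item in value.values():
            yield from find_dids(item)
    elif isinstance(value, list):
        for item in value:
            yield from find_dids(item)


def crawl_step(record: dict, seen: Set[str]) -> List[str]:
    """Return the DIDs in `record` that were not seen before, marking them as seen.

    In a real crawler these are the DIDs that would be submitted to Tap next.
    """
    fresh = []
    for did in find_dids(record):
        if did not in seen:
            seen.add(did)
            fresh.append(did)
    return fresh
```

Calling `crawl_step` on a hypothetical record such as `{"subject": "did:plc:wshs7t2adsemcrrd4snkeqli"}` returns that DID the first time and an empty list on repeat calls. That deduplication keeps the crawl from looping, and is presumably part of why the README says the table fills up *eventually*.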