+70
-68
README.md
···
4
4
5
5
## Example Usage
6
6
7
-
In this example we'll *tap into* (😉) everything in the "sh.tangled.*" NSID starting from the @tangled.org repo (ATproto repo, not git repo).
7
+
In this example we'll *tap into* everything in the "sh.tangled.*" NSID starting from the @tangled.org AT repo.
8
8
9
-
1. Setup a PostgreSQL cluster and create a database
9
+
### 1. Set up a PostgreSQL cluster and create a database
10
10
11
-
...
11
+
```bash
12
+
...
13
+
```
12
14
13
-
Let's assume you've created a DB called `trap_tangled`.
15
+
Let's assume you've created a DB called `trap_tangled`.
14
16
15
-
2. Tap
17
+
### 2. Tap
16
18
17
-
```bash
18
-
TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run
19
-
```
19
+
```bash
20
+
TAP_COLLECTION_FILTERS="sh.tangled.*" TAP_BIND=127.0.0.1:2480 tap run
21
+
```
20
22
21
-
`trap` will collect *any* records the Tap service sends. You can control this with the `TAP_COLLECTION_FILTERS` variable.
23
+
Trap will collect *any* records the Tap service sends. Control the records you want to collect with the `TAP_COLLECTION_FILTERS` variable.
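
As a rough illustration of how a filter such as `sh.tangled.*` could select collections, here is a minimal Python sketch. The matching rule (a trailing `.*` matches every NSID under that prefix, anything else must match exactly) and the comma-separated list format are assumptions inferred from the example output in this README, not taken from Tap's source. `matches_filter` is a hypothetical helper, not part of Tap or Trap.

```python
def matches_filter(nsid: str, filters: str) -> bool:
    """Return True if `nsid` is selected by a comma-separated filter string.

    Assumed semantics (not confirmed against Tap itself):
      - "sh.tangled.*" matches any NSID that starts with "sh.tangled."
      - a filter without a trailing ".*" must match the NSID exactly
    """
    for pattern in (p.strip() for p in filters.split(",")):
        if not pattern:
            continue  # ignore empty entries such as a trailing comma
        if pattern.endswith(".*"):
            prefix = pattern[:-1]  # drop "*" but keep the dot: "sh.tangled."
            if nsid.startswith(prefix):
                return True
        elif nsid == pattern:
            return True
    return False


if __name__ == "__main__":
    for nsid in ("sh.tangled.repo.issue.comment", "app.bsky.feed.post"):
        print(nsid, matches_filter(nsid, "sh.tangled.*"))
```

Keeping the dot in the prefix means `sh.tangled.*` does not accidentally match an unrelated namespace such as `sh.tangledx.repo`.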
22
24
23
-
3. Trap
25
+
### 3. Trap
24
26
25
-
Run `trap`, seeding from the DID of @tangled.org:
27
+
Run `trap`, seeding from the DID of @tangled.org:
26
28
27
-
```bash
28
-
RUST_LOG=debug,sqlx=warn INDEX_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli
29
-
```
29
+
```bash
30
+
RUST_LOG=debug,sqlx=warn TRAP_DATABASE_URL=postgresql:///trap_tangled trap --seed did:plc:wshs7t2adsemcrrd4snkeqli
31
+
```
30
32
31
-
`trap` will submit the seed DIDs to the Tap service. Each record return by Tap will be scanned, and any DIDs found will also be added to the Tap service.
33
+
Trap will submit the seed DID to the Tap service. Each record returned by Tap will be scanned, and any DIDs found will also be added to the Tap service.
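
The scanning step can be pictured with a short Python sketch. This is not Trap's actual implementation (Trap's own code is not shown here); it only illustrates the idea of walking a record's JSON and collecting anything that looks like a DID. The regular expression is a simplified approximation of DID syntax, and `find_dids` and the sample record are hypothetical.

```python
import re
from typing import Any, Set

# Simplified DID shape: "did:", a lowercase method name, then an identifier
# that may not end in ":" or "%". An approximation, not a full validator.
DID_PATTERN = re.compile(r"\bdid:[a-z]+:[A-Za-z0-9._:%-]*[A-Za-z0-9._-]")


def find_dids(value: Any) -> Set[str]:
    """Recursively collect DID-like strings from a decoded JSON value."""
    found: Set[str] = set()
    if isinstance(value, str):
        # Also catches DIDs embedded in AT URIs such as "at://did:plc:.../..."
        found.update(DID_PATTERN.findall(value))
    elif isinstance(value, dict):
        for item in value.values():
            found |= find_dids(item)
    elif isinstance(value, list):
        for item in value:
            found |= find_dids(item)
    return found


if __name__ == "__main__":
    # Made-up record for illustration only; field names are hypothetical.
    record = {
        "subject": "did:plc:wshs7t2adsemcrrd4snkeqli",
        "repo": "at://did:plc:abc123/sh.tangled.repo/xyz",
        "members": [{"owner": "did:web:example.com"}],
    }
    for did in sorted(find_dids(record)):
        print(did)
```

Each newly discovered DID would then be handed back to the Tap service, which is how the crawl grows outward from the seed.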
32
34
33
-
4. Wait.
35
+
### 4. Wait...
34
36
35
-
*Eventually*, and I mean *eventually*, you'll end up with a table named `record` filled with every "sh.tangled.*" record reachable from the @tangled.org repo.
37
+
*Eventually*, and I mean *eventually*, you'll end up with a table named `record` filled with every "sh.tangled.*" record reachable from the @tangled.org repo.
36
38
37
-
5. Perform *Data Science*
39
+
### 5. Perform *Data Science*
38
40
39
-
Time to jump into `psql`!
41
+
Time to jump into psql!
40
42
41
-
The `record_by_collection` view counts how many records have been indexed for each collection.
43
+
The `record_by_collection` view counts how many records have been indexed for each collection:
42
44
43
-
```
44
-
trap_tangled=# select * from record_by_collection ;
45
-
collection | count
46
-
-------------------------------+-------
47
-
sh.tangled.feed.star | 5350
48
-
sh.tangled.spindle.member | 4821
49
-
sh.tangled.graph.follow | 4425
50
-
sh.tangled.knot.member | 3607
51
-
sh.tangled.repo | 2618
52
-
sh.tangled.repo.pull | 1785
53
-
sh.tangled.repo.issue | 1390
54
-
sh.tangled.repo.issue.comment | 1386
55
-
sh.tangled.publicKey | 1298
56
-
sh.tangled.repo.pull.comment | 1127
57
-
sh.tangled.actor.profile | 713
58
-
sh.tangled.label.op | 628
59
-
sh.tangled.feed.reaction | 479
60
-
sh.tangled.string | 364
61
-
sh.tangled.repo.issue.state | 320
62
-
sh.tangled.knot | 158
63
-
sh.tangled.repo.collaborator | 146
64
-
sh.tangled.label.definition | 106
65
-
sh.tangled.repo.artifact | 69
66
-
sh.tangled.spindle | 51
67
-
(20 rows)
45
+
```
46
+
trap_tangled=# select * from record_by_collection ;
47
+
collection | count
48
+
-------------------------------+-------
49
+
sh.tangled.feed.star | 5350
50
+
sh.tangled.spindle.member | 4821
51
+
sh.tangled.graph.follow | 4425
52
+
sh.tangled.knot.member | 3607
53
+
sh.tangled.repo | 2618
54
+
sh.tangled.repo.pull | 1785
55
+
sh.tangled.repo.issue | 1390
56
+
sh.tangled.repo.issue.comment | 1386
57
+
sh.tangled.publicKey | 1298
58
+
sh.tangled.repo.pull.comment | 1127
59
+
sh.tangled.actor.profile | 713
60
+
sh.tangled.label.op | 628
61
+
sh.tangled.feed.reaction | 479
62
+
sh.tangled.string | 364
63
+
sh.tangled.repo.issue.state | 320
64
+
sh.tangled.knot | 158
65
+
sh.tangled.repo.collaborator | 146
66
+
sh.tangled.label.definition | 106
67
+
sh.tangled.repo.artifact | 69
68
+
sh.tangled.spindle | 51
69
+
(20 rows)
68
70
69
-
trap_tangled=#
70
-
```
71
+
trap_tangled=#
72
+
```
71
73
72
-
Analyse SSH public-key statistics:
74
+
Analyse SSH public-key statistics:
73
75
74
-
```
75
-
trap_tangled=# SELECT split_part(data->>'key', ' ', 1) AS key_type,
76
-
count(*) AS count
77
-
FROM record
78
-
WHERE collection = 'sh.tangled.publicKey'
79
-
GROUP BY (split_part(data->>'key', ' ', 1))
80
-
ORDER BY (count(*)) DESC;
81
-
key_type | count
82
-
------------------------------------+-------
83
-
ssh-ed25519 | 989
84
-
ssh-rsa | 239
85
-
sk-ssh-ed25519@openssh.com | 44
86
-
ecdsa-sha2-nistp256 | 22
87
-
sh-ed25519 | 2
88
-
sk-ecdsa-sha2-nistp256@openssh.com | 1
89
-
ecdsa-sha2-nistp521 | 1
90
-
(7 rows)
76
+
```
77
+
trap_tangled=# SELECT split_part(data->>'key', ' ', 1) AS key_type,
78
+
count(*) AS count
79
+
FROM record
80
+
WHERE collection = 'sh.tangled.publicKey'
81
+
GROUP BY (split_part(data->>'key', ' ', 1))
82
+
ORDER BY (count(*)) DESC;
83
+
key_type | count
84
+
------------------------------------+-------
85
+
ssh-ed25519 | 989
86
+
ssh-rsa | 239
87
+
sk-ssh-ed25519@openssh.com | 44
88
+
ecdsa-sha2-nistp256 | 22
89
+
sh-ed25519 | 2
90
+
sk-ecdsa-sha2-nistp256@openssh.com | 1
91
+
ecdsa-sha2-nistp521 | 1
92
+
(7 rows)
91
93
92
-
trap_tangled=#
93
-
```
94
+
trap_tangled=#
95
+
```
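
For readers less familiar with `split_part`, the same grouping can be sketched in Python: take everything before the first space of each key as the key type, then count. The sample keys below are made-up placeholders, not data from the index, and `count_key_types` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_key_types(keys: Iterable[str]) -> List[Tuple[str, int]]:
    """Count SSH public keys by type, most common first.

    Mirrors split_part(key, ' ', 1): the key type is the text before the
    first space of the public-key line.
    """
    counts = Counter(key.split(" ")[0] for key in keys)
    return counts.most_common()


if __name__ == "__main__":
    # Placeholder keys for illustration only.
    sample = [
        "ssh-ed25519 AAAAexample1 alice@example",
        "ssh-rsa AAAAexample2 bob@example",
        "ssh-ed25519 AAAAexample3",
    ]
    for key_type, count in count_key_types(sample):
        print(f"{key_type:12} | {count}")
```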
94
96
95
-
Fascinating!
97
+
Fascinating!
96
98
97
99
## Future work
98
100