# Developing Moderation Checks

This guide explains how to configure moderation rules for skywatch-automod.

## Overview

Moderation checks are defined in TypeScript files in the `rules/` directory. Each check uses regular expressions to match content and specifies what action to take when a match is found.

## Check Types

### Post Content Checks

File: `rules/posts.ts`

Monitors post text and embedded URLs for matches.

```typescript
import type { Checks } from "../src/types.js";

export const POST_CHECKS: Checks[] = [
  {
    label: "spam",
    comment: "Spam content detected in post",
    reportAcct: false,
    commentAcct: false,
    toLabel: true,
    check: new RegExp("buy.*followers", "i"),
  },
];
```

### Handle Checks

File: `rules/handles.ts`

Monitors user handles for pattern matches.

```typescript
import type { Checks } from "../src/types.js";

export const HANDLE_CHECKS: Checks[] = [
  {
    label: "impersonation",
    comment: "Potential impersonation detected",
    reportAcct: true,
    commentAcct: false,
    toLabel: false,
    check: new RegExp("official.*support", "i"),
  },
];
```

### Profile Checks

File: `rules/profiles.ts`

Monitors profile display names and descriptions.

```typescript
import type { Checks } from "../src/types.js";

export const PROFILE_CHECKS: Checks[] = [
  {
    label: "spam-profile",
    comment: "Spam content in profile",
    reportAcct: false,
    commentAcct: false,
    toLabel: true,
    displayName: true,  // Check display name
    description: true,  // Check description
    check: new RegExp("follow.*back", "i"),
  },
];
```

### Account Age Checks

File: `rules/accountAge.ts`

Labels accounts created after a specific date when they interact with monitored content.

```typescript
import type { AccountAgeCheck } from "../src/types.js";

export const ACCOUNT_AGE_CHECKS: AccountAgeCheck[] = [
  {
    monitoredDIDs: ["did:plc:abc123"],
    anchorDate: "2025-01-15",
    maxAgeDays: 7,
    label: "new-account-spam",
    comment: "New account replying to monitored user",
    expires: "2025-02-15",  // Optional expiration
  },
];
```

### Account Threshold Checks

File: `rules/accountThreshold.ts`

Applies account-level labels when an account accumulates multiple post-level violations within a time window.

```typescript
import type { AccountThresholdConfig } from "../src/types.js";

export const ACCOUNT_THRESHOLD_CONFIGS: AccountThresholdConfig[] = [
  {
    labels: ["spam", "scam"],  // Trigger on either label
    threshold: 3,
    accountLabel: "repeat-offender",
    accountComment: "Account exceeded spam threshold",
    window: 7,
    windowUnit: "days",  // Options: "minutes", "hours", "days"
    reportAcct: true,
    commentAcct: false,
    toLabel: true,
  },
];
```

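To make the rolling-window idea concrete, here is a minimal sketch of how `threshold`, `window`, and `windowUnit` could combine. The helper `meetsThreshold` and the `UNIT_MS` table are hypothetical names introduced for illustration, and treating the threshold as "at least N events" is an assumption; the actual evaluation in skywatch-automod may differ.

```typescript
type WindowUnit = "minutes" | "hours" | "days";

// Milliseconds per unit, used to turn `window` + `windowUnit` into a duration.
const UNIT_MS: Record<WindowUnit, number> = {
  minutes: 60_000,
  hours: 3_600_000,
  days: 86_400_000,
};

// Hypothetical helper: count events inside the rolling window ending at `now`
// and report whether the count has reached the threshold (assumed ">=").
function meetsThreshold(
  eventTimes: Date[],
  threshold: number,
  window: number,
  windowUnit: WindowUnit,
  now: Date,
): boolean {
  const cutoff = now.getTime() - window * UNIT_MS[windowUnit];
  const recent = eventTimes.filter(
    (t) => t.getTime() >= cutoff && t.getTime() <= now.getTime(),
  );
  return recent.length >= threshold;
}

// Illustrative data: three labeled posts inside a 7-day window, one outside.
const now = new Date("2025-01-15T00:00:00Z");
const labeledAt = [
  new Date("2025-01-14T12:00:00Z"),
  new Date("2025-01-10T00:00:00Z"),
  new Date("2025-01-09T00:00:00Z"),
  new Date("2025-01-01T00:00:00Z"), // older than 7 days, ignored
];
const triggered = meetsThreshold(labeledAt, 3, 7, "days", now);
```

With the sample data, only the three most recent timestamps fall inside the window, so a threshold of 3 is met while a threshold of 4 is not.
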
### Starter Pack Threshold Checks

File: `rules/starterPackThreshold.ts`

Applies account-level labels when an account creates too many starter packs within a time window. Useful for detecting follow-farming and coordinated campaign behaviour.

```typescript
import type { StarterPackThresholdConfig } from "../src/types.js";

export const STARTER_PACK_THRESHOLD_CONFIGS: StarterPackThresholdConfig[] = [
  {
    threshold: 10,           // Account action triggered after 10 starter packs
    window: 7,               // Within this duration
    windowUnit: "days",      // Options: "minutes", "hours", "days"
    accountLabel: "follow-farming",
    accountComment: "Account created multiple starter packs in short period",
    toLabel: true,           // Whether to apply the label (default: true)
    reportAcct: true,        // Whether to report the account
    commentAcct: false,      // Whether to comment on the account
    allowlist: [],           // DIDs to exempt from this check
  },
];
```

## Check Configuration Fields

### Basic Fields (Required)

- `label` - Label to apply (string)
- `comment` - Comment for the moderation action (string)
- `reportAcct` - Create account report (boolean)
- `commentAcct` - Add comment to account (boolean)
- `toLabel` - Apply the label (boolean)
- `check` - Regular expression pattern (RegExp)

### Optional Fields

- `language` - Language codes to restrict check to (string[])
- `description` - Check profile descriptions (boolean)
- `displayName` - Check profile display names (boolean)
- `reportPost` - Create post report instead of just labeling (boolean)
- `duration` - Label duration in hours (number)
- `whitelist` - RegExp to exclude from matching (RegExp)
- `ignoredDIDs` - DIDs to skip checking (string[])
- `starterPacks` - Filter by starter pack membership (string[])
- `knownVectors` - Known attack vectors for tracking (string[])
- `trackOnly` - Track without applying label (boolean)
- `unlabel` - Remove existing label if content no longer matches (boolean)

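The field lists above can be read as a single object shape. The interface below is a hypothetical reconstruction from those lists, named `CheckSketch` to make clear that it is not the project's real `Checks` type, which is defined in `src/types` and may differ in detail.

```typescript
// Hypothetical reconstruction of a check's shape from the documented fields.
// This is NOT the real `Checks` type from the project.
interface CheckSketch {
  // Required fields
  label: string;
  comment: string;
  reportAcct: boolean;
  commentAcct: boolean;
  toLabel: boolean;
  check: RegExp;
  // Optional fields
  language?: string[];
  description?: boolean;
  displayName?: boolean;
  reportPost?: boolean;
  duration?: number; // hours
  whitelist?: RegExp;
  ignoredDIDs?: string[];
  starterPacks?: string[];
  knownVectors?: string[];
  trackOnly?: boolean;
  unlabel?: boolean;
}

// A check that sets every required field plus one optional field.
const exampleCheck: CheckSketch = {
  label: "spam",
  comment: "Spam content detected in post",
  reportAcct: false,
  commentAcct: false,
  toLabel: true,
  check: new RegExp("buy.*followers", "i"),
  duration: 24,
};
```

Writing the shape out this way lets the TypeScript compiler reject a rule that omits a required field before it is ever deployed.
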
### Threshold Configuration Fields

#### Account Threshold

- `labels` - Single label or array of labels to aggregate (string | string[])
- `threshold` - Number of labeled posts required to trigger account action (number)
- `window` - Rolling window duration (number)
- `windowUnit` - Unit for the rolling window: "minutes", "hours", or "days" (WindowUnit)
- `accountLabel` - Label to apply to the account (string)
- `accountComment` - Comment for the account action (string)
- `toLabel` - Whether to apply the label, defaults to true (boolean)
- `reportAcct` - Whether to report the account (boolean)
- `commentAcct` - Whether to comment on the account (boolean)

#### Starter Pack Threshold

- `threshold` - Number of starter packs required to trigger account action (number)
- `window` - Rolling window duration (number)
- `windowUnit` - Unit for the rolling window: "minutes", "hours", or "days" (WindowUnit)
- `accountLabel` - Label to apply to the account (string)
- `accountComment` - Comment for the account action (string)
- `toLabel` - Whether to apply the label, defaults to true (boolean)
- `reportAcct` - Whether to report the account (boolean)
- `commentAcct` - Whether to comment on the account (boolean)
- `allowlist` - DIDs to exempt from this check (string[])

## Examples

### Language-Specific Check

```typescript
{
  language: ["spa"],
  label: "spam-es",
  comment: "Spanish spam detected",
  reportAcct: false,
  commentAcct: false,
  toLabel: true,
  check: new RegExp("comprar seguidores", "i"),
}
```

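As a rough illustration of how a `language` restriction could gate a check, the sketch below skips the regex entirely when the detected language is not listed. The names `LanguageScopedCheck`, `appliesToLanguage`, and `matchingLabels` are hypothetical, as is the `"eng"` code and the assumption that a check without `language` applies to every post; how skywatch-automod detects and compares languages is not described in this guide.

```typescript
// Hypothetical minimal shape; only the fields this sketch needs.
interface LanguageScopedCheck {
  label: string;
  check: RegExp;
  language?: string[];
}

// Assumption: a check with no `language` list applies to every post;
// otherwise the detected language must appear in the list.
function appliesToLanguage(check: LanguageScopedCheck, detected: string): boolean {
  return check.language === undefined || check.language.includes(detected);
}

const languageChecks: LanguageScopedCheck[] = [
  { label: "spam-es", check: new RegExp("comprar seguidores", "i"), language: ["spa"] },
  { label: "spam", check: new RegExp("buy.*followers", "i") },
];

// Only checks that pass the language filter have their regex evaluated.
function matchingLabels(text: string, detected: string): string[] {
  return languageChecks
    .filter((c) => appliesToLanguage(c, detected))
    .filter((c) => c.check.test(text))
    .map((c) => c.label);
}
```

Under these assumptions, a Spanish post can trigger `spam-es`, while the same text detected as another language never reaches that pattern.
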
### Temporary Label

```typescript
{
  label: "review-needed",
  comment: "Content flagged for review",
  reportAcct: true,
  commentAcct: false,
  toLabel: false,
  duration: 24,  // Label expires after 24 hours
  check: new RegExp("suspicious.*pattern", "i"),
}
```

### Whitelist Exception

```typescript
{
  label: "blocked-term",
  comment: "Blocked term used",
  reportAcct: false,
  commentAcct: false,
  toLabel: true,
  check: new RegExp("\\bterm\\b", "i"),
  whitelist: new RegExp("legitimate.*context", "i"),
}
```

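One plausible reading of the `check` and `whitelist` pair is sketched below: content is flagged only when `check` matches and `whitelist` does not. The `WhitelistedCheck` interface and `isFlagged` function are hypothetical, and this ordering is an assumption rather than the documented behaviour of skywatch-automod.

```typescript
// Hypothetical minimal shape; only the fields this sketch needs.
interface WhitelistedCheck {
  check: RegExp;
  whitelist?: RegExp;
}

// Assumed semantics: flag when `check` matches and `whitelist`,
// if present, does not match the same text.
function isFlagged(rule: WhitelistedCheck, text: string): boolean {
  if (!rule.check.test(text)) return false;
  return !(rule.whitelist?.test(text) ?? false);
}

const blockedTerm: WhitelistedCheck = {
  check: new RegExp("\\bterm\\b", "i"),
  whitelist: new RegExp("legitimate.*context", "i"),
};
```

Keeping the exception in a second pattern leaves the main pattern short and readable, which is usually easier to maintain than folding the exception into a negative lookahead.
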
### Ignored DIDs

```typescript
{
  label: "blocked-term",
  comment: "Blocked term used",
  reportAcct: false,
  commentAcct: false,
  toLabel: true,
  check: new RegExp("\\bterm\\b", "i"),
  ignoredDIDs: [
    "did:plc:trusted123",
    "did:plc:verified456",
  ],
}
```

## Global Configuration

### Allowlist

File: `rules/constants.ts`

DIDs in the global allowlist bypass all checks.

```typescript
export const GLOBAL_ALLOW: string[] = [
  "did:plc:trusted123",
  "did:plc:verified456",
];
```

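The global allowlist and the per-check `ignoredDIDs` field both exempt accounts, at different scopes. A minimal sketch of how the two might be consulted together follows; `GLOBAL_ALLOW_SKETCH`, `DidScopedCheck`, and `shouldSkip` are hypothetical names, and the order of the two lookups is an assumption.

```typescript
// Values copied from the examples in this guide.
const GLOBAL_ALLOW_SKETCH: string[] = ["did:plc:trusted123", "did:plc:verified456"];

// Hypothetical minimal shape; only the fields this sketch needs.
interface DidScopedCheck {
  label: string;
  ignoredDIDs?: string[];
}

// Assumed order: the global allowlist is consulted first,
// then the per-check `ignoredDIDs` list.
function shouldSkip(did: string, check: DidScopedCheck): boolean {
  if (GLOBAL_ALLOW_SKETCH.includes(did)) return true;
  return check.ignoredDIDs?.includes(did) ?? false;
}
```

In this sketch a globally allowed DID is skipped for every check, while an entry in `ignoredDIDs` only exempts the account from the one check that lists it.
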
### Link Shorteners

Pattern to match URL shorteners for special handling.

```typescript
export const LINK_SHORTENER = new RegExp(
  "bit\\.ly|tinyurl\\.com|goo\\.gl",
  "i"
);
```

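A short usage sketch for the shortener pattern is shown below. It tests only the hostname of a parsed URL so that a shortener domain appearing in another site's query string is not counted. The helper `isShortenedUrl` is hypothetical, and what "special handling" the tool performs after a match is not covered here.

```typescript
// Same pattern as above, redeclared so the sketch is self-contained.
const LINK_SHORTENER_SKETCH = new RegExp("bit\\.ly|tinyurl\\.com|goo\\.gl", "i");

// Hypothetical helper: test only the hostname of an absolute URL.
function isShortenedUrl(url: string): boolean {
  try {
    return LINK_SHORTENER_SKETCH.test(new URL(url).hostname);
  } catch {
    return false; // not a parseable absolute URL
  }
}
```

Because the pattern is unanchored, a hostname such as `rabbit.ly` also matches; anchoring each alternative (for example with `(^|\\.)` and `$`) would tighten it if that matters for a given rule.
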
## Best Practices

### Regular Expressions

- Use word boundaries (written as `\\b` inside a `new RegExp` string) to avoid partial matches
- Test patterns thoroughly to minimize false positives
- Use case-insensitive matching (`i` flag) when appropriate
- Escape special regex characters (such as `.` and `?`) when they should match literally

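The escaping and word-boundary advice can be illustrated with two small helpers. Both `escapeRegExp` and `wholeWord` are hypothetical utilities written for this example and are not part of skywatch-automod.

```typescript
// Escape characters that have special meaning in a regular expression
// so a literal string can be embedded in a pattern.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: case-insensitive, whole-word matcher for a literal term.
function wholeWord(term: string): RegExp {
  return new RegExp(`\\b${escapeRegExp(term)}\\b`, "i");
}

// Without escaping, "." matches any character, not just a literal dot.
const unescapedPattern = new RegExp("example.com", "i");
const escapedPattern = new RegExp(escapeRegExp("example.com"), "i");
```

Here `unescapedPattern` also matches text like `exampleXcom`, while `escapedPattern` matches only the literal domain, and `wholeWord("term")` matches `term` as a standalone word but not inside `terminal`.
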
### Action Selection

- `toLabel: true` - Apply label immediately (use for clear violations)
- `reportAcct: true` - Create report for manual review (use for ambiguous cases)
- `commentAcct: true` - Create comment on account (likely a candidate for deprecation)

### Performance

- Keep regex patterns simple and efficient
- Use language filters to reduce unnecessary checks
- Leverage whitelists instead of complex negative lookaheads

### Testing

After modifying rules, run the test suite:

```bash
bun test:run
```

To test a specific rule module:

```bash
bun test src/rules/posts/tests/
```

## Deployment

Rules are mounted as a volume in Docker Compose:

```yaml
volumes:
  - ./rules:/app/rules
```

Rule changes require rebuilding the `automod` service:

```bash
docker compose up -d --build automod
```