<!-- markdownlint-disable MD033 -->

# Personal Activity Index

A CLI that ingests content from Substack, Bluesky, Leaflet, and BearBlog into SQLite, with an optional Cloudflare Worker + D1 deployment path.

## Features

- Fetch posts from multiple sources:
  - **Substack** via RSS feeds
  - **Bluesky** via AT Protocol
  - **Leaflet** publications via RSS feeds
  - **BearBlog** publications via RSS feeds
- Local SQLite storage with full-text search
- Flexible filtering and querying via `pai list` / `pai export`
- Self-hostable HTTP API (`pai serve` exposes `/api/feed`, `/api/item/{id}`, and `/status`)
- Cloudflare Worker deployment path (D1) for serverless setups

## Quick Start

```bash
# Install
cargo install --path cli

# Initialize config (creates ~/.config/pai/config.toml)
pai init

# Edit config with your sources
$EDITOR ~/.config/pai/config.toml

# Sync content
pai sync

# List items
pai list -n 10

# Check database
pai db-check

# Install the manpage so `man pai` works
pai man --install

# Generate manpage to a file
pai man -o pai.1
```

<details>
<summary>For server mode, run the built-in HTTP server against your SQLite database:</summary>

<br>

```bash
pai serve -d /var/lib/pai/pai.db -a 127.0.0.1:8080
```

Endpoints:

- `GET /api/feed` – list newest items (supports `source_kind`, `source_id`, `limit`, `since`, `q`)
- `GET /api/item/{id}` – fetch a single item
- `GET /status` – health/status summary (total items, counts per source)
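
As a rough illustration of how the `/api/feed` query parameters compose, the sketch below builds a request URL in Python. The helper name `build_feed_url` and the sample values are hypothetical; only the endpoint path and the parameter names come from the list above.

```python
from urllib.parse import urlencode

# Parameter names documented for GET /api/feed.
FEED_PARAMS = ("source_kind", "source_id", "limit", "since", "q")


def build_feed_url(base_url, **params):
    """Build a /api/feed URL, dropping unset parameters (hypothetical helper)."""
    unknown = set(params) - set(FEED_PARAMS)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # Keep the documented order so the resulting URL is stable.
    query = [(k, params[k]) for k in FEED_PARAMS if params.get(k) is not None]
    url = base_url.rstrip("/") + "/api/feed"
    return f"{url}?{urlencode(query)}" if query else url


print(build_feed_url("http://127.0.0.1:8080", source_kind="bluesky", limit=5, q="rust"))
```

The accepted value formats (for example, what `since` expects) are not specified here, so check `pai serve -h` or the source before relying on them.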

For reverse-proxy examples (nginx, Caddy, Docker), see [DEPLOYMENT.md](./DEPLOYMENT.md).

</details>

## Configuration

Configuration is loaded from `$XDG_CONFIG_HOME/pai/config.toml` or `$HOME/.config/pai/config.toml`.
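
The lookup order can be sketched in a few lines of Python. This is an illustration only: it assumes `$XDG_CONFIG_HOME` wins when it is set and non-empty (the usual XDG convention), and `config_path` is a hypothetical helper name, not part of `pai`.

```python
import os
from pathlib import Path


def config_path(env=None):
    """Resolve config.toml, preferring $XDG_CONFIG_HOME over $HOME/.config."""
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:  # assumption: an empty value is treated as unset
        return Path(xdg) / "pai" / "config.toml"
    return Path(env["HOME"]) / ".config" / "pai" / "config.toml"


print(config_path({"HOME": "/home/user"}).as_posix())
```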

See [config.example.toml](./config.example.toml) for a complete example with all available options.

<details>
<summary>
CORS Configuration
</summary>

Both the HTTP server and Cloudflare Worker support CORS configuration to allow cross-origin requests from your web applications.

### HTTP Server (config.toml)

Add a `[cors]` section to your config file:

```toml
[cors]
allowed_origins = ["https://desertthunder.dev", "http://localhost:4321"]
dev_key = "your-secret-dev-key"
```

Configuration options:

- **allowed_origins**: List of allowed origins. Supports:
  - Exact match: `http://localhost:4321` only allows that exact origin
  - Same-root-domain: `https://desertthunder.dev` also allows `https://pai.desertthunder.dev`, `https://api.desertthunder.dev`, etc.
- **dev_key**: Optional development key for local testing.
  When set, requests with the `X-Local-Dev-Key` header matching this value are allowed regardless of origin.

### Cloudflare Worker (Environment Variables)

Configure CORS via environment variables in `wrangler.toml`:

```toml
[vars]
CORS_ALLOWED_ORIGINS = "https://desertthunder.dev,http://localhost:4321"
CORS_DEV_KEY = "your-secret-dev-key"
```

- **CORS_ALLOWED_ORIGINS**: Comma-separated list of allowed origins
- **CORS_DEV_KEY**: Optional development key (same behavior as HTTP server)

#### Local Development with `X-Local-Dev-Key`

For local development from Astro or other frameworks:

1. Add a `dev_key` to your CORS config:

   ```toml
   [cors]
   allowed_origins = ["http://localhost:4321"]
   dev_key = "local-dev-secret-123"
   ```

2. Include the header in your API requests:

   ```javascript
   fetch('http://localhost:8080/api/feed', {
     headers: {
       'X-Local-Dev-Key': 'local-dev-secret-123'
     }
   })
   ```

   The dev key header bypasses origin checking, which is useful for testing from different local ports during development.

#### Same-Root-Domain Support

If you configure `allowed_origins = ["https://desertthunder.dev"]`, requests are handled as follows:

- `https://desertthunder.dev` ✓ (exact match)
- `https://pai.desertthunder.dev` ✓ (subdomain of allowed root)
- `https://api.desertthunder.dev` ✓ (subdomain of allowed root)
- `https://evil.dev` ✗ (different root domain)

This allows you to deploy the API at `pai.desertthunder.dev` and access it from your main site at `desertthunder.dev` without explicitly listing every subdomain.
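
The matching rules described in this section can be approximated as follows. This is a sketch of the documented behavior, not the actual implementation: the function names are hypothetical, and it assumes the scheme and port must match and that the subdomain rule applies only to hosts containing a dot (so `localhost` stays exact-match only).

```python
from urllib.parse import urlsplit


def origin_allowed(origin, allowed_origins):
    """Exact match, or a subdomain of an allowed origin's host (assumed rule)."""
    req = urlsplit(origin)
    for allowed in allowed_origins:
        if origin == allowed:
            return True
        rule = urlsplit(allowed)
        host = rule.hostname or ""
        # Assumption: single-label hosts such as localhost never match subdomains.
        if "." not in host:
            continue
        is_subdomain = (req.hostname or "").endswith("." + host)
        # Assumption: scheme and port are compared exactly as written.
        if is_subdomain and req.scheme == rule.scheme and req.port == rule.port:
            return True
    return False


def request_allowed(origin, headers, allowed_origins, dev_key=None):
    """A matching dev key header bypasses the origin check when a key is set."""
    # A plain dict lookup is case-sensitive; real HTTP header lookups are not.
    if dev_key and headers.get("X-Local-Dev-Key") == dev_key:
        return True
    return origin_allowed(origin, allowed_origins)


allowed = ["https://desertthunder.dev", "http://localhost:4321"]
for candidate in ("https://pai.desertthunder.dev", "https://evil.dev"):
    print(candidate, origin_allowed(candidate, allowed))
```

Matching on a `"." + host` suffix rather than a bare suffix keeps look-alike domains such as `evildesertthunder.dev` from passing.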

</details>

## Documentation

- CLI synopsis: `pai -h`, `pai <command> -h`, or `pai man` for the generated `pai(1)` page.
- `pai man --install [--install-dir DIR]` copies `pai.1` into a MANPATH directory (defaults to `~/.local/share/man/man1`).
- Database schema and config reference: [config.example.toml](./config.example.toml).
- Deployment topologies: [DEPLOYMENT.md](./DEPLOYMENT.md).

## Architecture

The project is organized as a Cargo workspace:

```sh
.
├── core    # Shared types, fetchers, and the storage trait
├── cli     # CLI binary (POSIX-compliant)
└── worker  # Cloudflare Worker deployment using workers-rs
```

<details>
<summary><strong>Source Implementations</strong></summary>

### Substack (RSS)

The Substack fetcher uses standard RSS 2.0 feeds available at `{base_url}/feed`.

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Uses GUID as item ID, falls back to link if GUID is missing
- Normalizes `pubDate` to ISO 8601 format

**Key mappings:**

- `id` = RSS GUID or link
- `source_kind` = `substack`
- `source_id` = Domain extracted from base_url
- `title` = RSS title
- `summary` = RSS description
- `url` = RSS link
- `content_html` = RSS content (if available)
- `published_at` = RSS pubDate (normalized to ISO 8601)

**Example RSS structure:**

```xml
<item>
  <title>Post Title</title>
  <link>https://example.substack.com/p/post-slug</link>
  <guid>https://example.substack.com/p/post-slug</guid>
  <pubDate>Mon, 01 Jan 2024 12:00:00 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```
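
To make the mapping concrete, here is a small Python sketch that applies the key mappings above to the sample `<item>`. It is not the project's Rust code (which uses `feed-rs`); `map_rss_item` is a hypothetical name, and the exact ISO 8601 rendering the real fetcher produces may differ.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import urlsplit

SAMPLE_ITEM = """<item>
  <title>Post Title</title>
  <link>https://example.substack.com/p/post-slug</link>
  <guid>https://example.substack.com/p/post-slug</guid>
  <pubDate>Mon, 01 Jan 2024 12:00:00 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>"""


def map_rss_item(item_xml, base_url):
    """Map one RSS <item> to the fields listed under "Key mappings"."""
    item = ET.fromstring(item_xml)
    link = item.findtext("link")
    return {
        "id": item.findtext("guid") or link,  # GUID, else link
        "source_kind": "substack",
        "source_id": urlsplit(base_url).hostname,  # domain from base_url
        "title": item.findtext("title"),
        "summary": item.findtext("description"),
        "url": link,
        # RFC 822 style date -> ISO 8601 string
        "published_at": parsedate_to_datetime(item.findtext("pubDate")).isoformat(),
    }


item = map_rss_item(SAMPLE_ITEM, "https://example.substack.com")
print(item["source_id"], item["published_at"])
```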

### AT Protocol Integration (Bluesky)

#### Overview

Bluesky is built on the AT Protocol (Authenticated Transfer Protocol), a decentralized social networking protocol.

**Key Concepts:**

- **DID (Decentralized Identifier)**: Unique identifier for users (e.g., `did:plc:xyz123`)
- **Handle**: Human-readable identifier (e.g., `user.bsky.social`)
- **AT URI**: Resource identifier (e.g., `at://did:plc:xyz/app.bsky.feed.post/abc123`)
- **Lexicon**: Schema definition language for records and API methods
- **XRPC**: HTTP API wrapper for AT Protocol methods
- **PDS (Personal Data Server)**: Server that stores user data

#### Implementation

Bluesky uses standard `app.bsky.feed.post` records and provides a public API for fetching posts.

**Endpoint:** `GET https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed`

**Parameters:**

- `actor` - User handle or DID
- `limit` - Number of posts to fetch (default: 50)
- `cursor` - Pagination cursor (optional)
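
For reference, the request URL for these parameters could be assembled like this. The helper name `author_feed_url` is hypothetical; the endpoint and parameter names are the ones listed above.

```python
from urllib.parse import urlencode

ENDPOINT = "https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed"


def author_feed_url(actor, limit=50, cursor=None):
    """Build a getAuthorFeed URL; `cursor` is only sent when paginating."""
    params = {"actor": actor, "limit": limit}
    if cursor:
        params["cursor"] = cursor
    return f"{ENDPOINT}?{urlencode(params)}"


print(author_feed_url("user.bsky.social"))
```

Passing a DID works the same way; `urlencode` percent-encodes the colons in `did:plc:...`.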

**Implementation:**

- Fetches author feed using `app.bsky.feed.getAuthorFeed`
- Filters out reposts and quotes (only includes original posts)
- Converts AT URIs to canonical Bluesky URLs
- Truncates long post text to create titles

**Key mappings:**

- `id` = AT URI (e.g., `at://did:plc:xyz/app.bsky.feed.post/abc123`)
- `source_kind` = `bluesky`
- `source_id` = User handle
- `title` = Truncated post text (first 100 chars)
- `summary` = Full post text
- `url` = Canonical URL (`https://bsky.app/profile/{handle}/post/{post_id}`)
- `author` = Post author handle
- `published_at` = Post `createdAt` timestamp

**Filtering reposts:**
Posts with a `reason` field (indicating repost or quote) are excluded to fetch only original content.
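
The filtering and mapping steps can be sketched in Python as below. This is illustrative only: the function names are hypothetical, the sample data is made up, and the response shape (`post.uri`, `post.author.handle`, `post.record.text`, `post.record.createdAt`) is assumed from the public `getAuthorFeed` output rather than taken from this project's code.

```python
def at_uri_to_url(at_uri, handle):
    """at://<did>/app.bsky.feed.post/<id> -> https://bsky.app/profile/<handle>/post/<id>"""
    post_id = at_uri.rsplit("/", 1)[-1]
    return f"https://bsky.app/profile/{handle}/post/{post_id}"


def map_feed(feed):
    """Drop entries carrying a `reason` and map the rest to item fields."""
    items = []
    for entry in feed:
        if "reason" in entry:  # skipped per the filtering rule above
            continue
        post = entry["post"]
        text = post["record"]["text"]
        handle = post["author"]["handle"]
        items.append({
            "id": post["uri"],
            "source_kind": "bluesky",
            "source_id": handle,
            "title": text[:100],  # assumption: "chars" means code points here
            "summary": text,
            "url": at_uri_to_url(post["uri"], handle),
            "author": handle,
            "published_at": post["record"]["createdAt"],
        })
    return items


SAMPLE_FEED = [
    {"post": {"uri": "at://did:plc:xyz/app.bsky.feed.post/abc123",
              "author": {"handle": "user.bsky.social"},
              "record": {"text": "hello world", "createdAt": "2024-01-01T12:00:00Z"}}},
    {"post": {"uri": "at://did:plc:other/app.bsky.feed.post/def456",
              "author": {"handle": "other.bsky.social"},
              "record": {"text": "reposted", "createdAt": "2024-01-02T12:00:00Z"}},
     "reason": {"$type": "app.bsky.feed.defs#reasonRepost"}},
]
print([item["url"] for item in map_feed(SAMPLE_FEED)])
```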

### Leaflet (RSS)

#### Overview

Leaflet publications provide RSS feeds at `{base_url}/rss`, making them straightforward to fetch using standard RSS parsing.

**Note:** While Leaflet is built on AT Protocol and uses custom `pub.leaflet.post` records, we use RSS feeds for simplicity and reliability. Leaflet's RSS implementation provides all necessary metadata without requiring AT Protocol PDS queries.

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Supports multiple publications via config array
- Uses entry ID from feed, falls back to link if missing
- Normalizes publication dates to ISO 8601 format

**Key mappings:**

- `id` = RSS entry ID or link
- `source_kind` = `leaflet`
- `source_id` = Publication ID from config (e.g., `desertthunder`, `stormlightlabs`)
- `title` = RSS entry title
- `summary` = RSS entry summary/description
- `url` = RSS entry link
- `content_html` = RSS content body (if available)
- `author` = RSS entry author
- `published_at` = RSS published date or updated date (normalized to ISO 8601)
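
The two fallbacks in this mapping (entry ID to link, published date to updated date) might look like the following in Python. The `Item` dataclass mirrors the field names listed above, but which fields are optional is an assumption, and the `entry` dictionary keys are hypothetical stand-ins for whatever the feed parser returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    id: str
    source_kind: str
    source_id: str
    title: Optional[str]
    summary: Optional[str]
    url: str
    content_html: Optional[str]
    author: Optional[str]
    published_at: Optional[str]


def entry_to_item(entry, source_id):
    """Apply the documented fallbacks: id -> link, published -> updated."""
    return Item(
        id=entry.get("id") or entry["link"],
        source_kind="leaflet",
        source_id=source_id,
        title=entry.get("title"),
        summary=entry.get("summary"),
        url=entry["link"],
        content_html=entry.get("content"),
        author=entry.get("author"),
        published_at=entry.get("published") or entry.get("updated"),
    )


item = entry_to_item(
    {"link": "https://desertthunder.leaflet.pub/3m6a7fuk7u22p",
     "title": "Dev Log: 2025-11-22",
     "updated": "2025-11-22T16:22:54+00:00"},
    source_id="desertthunder",
)
print(item.id, item.published_at)
```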

**Configuration:**

Leaflet supports multiple publications through array configuration:

```toml
[[sources.leaflet]]
enabled = true
id = "desertthunder"
base_url = "https://desertthunder.leaflet.pub"

[[sources.leaflet]]
enabled = true
id = "stormlightlabs"
base_url = "https://stormlightlabs.leaflet.pub"
```

**Example RSS structure:**

```xml
<item>
  <title>Dev Log: 2025-11-22</title>
  <link>https://desertthunder.leaflet.pub/3m6a7fuk7u22p</link>
  <guid>https://desertthunder.leaflet.pub/3m6a7fuk7u22p</guid>
  <pubDate>Sat, 22 Nov 2025 16:22:54 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```

### BearBlog (RSS)

#### Overview

BearBlog is a minimalist blogging platform that provides RSS feeds at `{slug}.bearblog.dev/feed/`, making them straightforward to fetch using standard RSS parsing.

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Supports multiple blogs via config array
- Uses entry ID from feed, falls back to link if missing
- Normalizes publication dates to ISO 8601 format

**Key mappings:**

- `id` = RSS entry ID or link
- `source_kind` = `bearblog`
- `source_id` = Blog ID from config (e.g., `desertthunder`)
- `title` = RSS entry title
- `summary` = RSS entry summary/description
- `url` = RSS entry link
- `content_html` = RSS content body (if available)
- `author` = RSS entry author
- `published_at` = RSS published date or updated date (normalized to ISO 8601)

**Configuration:**

BearBlog supports multiple blogs through array configuration:

```toml
[[sources.bearblog]]
enabled = true
id = "desertthunder"
base_url = "https://desertthunder.bearblog.dev"

[[sources.bearblog]]
enabled = true
id = "another-blog"
base_url = "https://another-blog.bearblog.dev"
```
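
As a sketch of how such a config could translate into feed URLs, the Python below joins each enabled source's `base_url` with the feed path stated for its platform in the sections above. The function names are hypothetical, the `sources` dictionary is a hand-written stand-in for the parsed TOML, and treating a missing `enabled` key as disabled is an assumption.

```python
# Feed path per source kind, as stated in the sections above.
FEED_PATHS = {"substack": "/feed", "leaflet": "/rss", "bearblog": "/feed/"}


def feed_url(source_kind, base_url):
    """Join a base URL with the feed path for its source kind."""
    return base_url.rstrip("/") + FEED_PATHS[source_kind]


def enabled_feed_urls(sources):
    """Collect feed URLs for every entry with enabled = true."""
    urls = []
    for kind, entries in sources.items():
        for entry in entries:
            if entry.get("enabled"):  # assumption: missing means disabled
                urls.append(feed_url(kind, entry["base_url"]))
    return urls


sources = {
    "bearblog": [
        {"enabled": True, "id": "desertthunder",
         "base_url": "https://desertthunder.bearblog.dev"},
        {"enabled": False, "id": "another-blog",
         "base_url": "https://another-blog.bearblog.dev"},
    ],
}
print(enabled_feed_urls(sources))
```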

**Example RSS structure:**

```xml
<item>
  <title>My Blog Post</title>
  <link>https://desertthunder.bearblog.dev/my-blog-post</link>
  <guid>https://desertthunder.bearblog.dev/my-blog-post</guid>
  <pubDate>Sat, 22 Nov 2025 16:22:54 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```

</details>

## References

- [AT Protocol Documentation](https://atproto.com)
- [Lexicon Guide](https://atproto.com/guides/lexicon) - Schema definition language
- [XRPC Specification](https://atproto.com/specs/xrpc) - HTTP API wrapper
- [Bluesky API Documentation](https://docs.bsky.app/)
- [Leaflet](https://tangled.org/leaflet.pub/leaflet) - Leaflet source code
- [Leaflet Manual](https://about.leaflet.pub/) - User-facing documentation

## License

See [LICENSE](./LICENSE).