Malachite is a tool to import your Last.fm and Spotify listening history to the AT Protocol network using the fm.teal.alpha.feed.play lexicon.

REWRITE

+1289 -712
+263 -46
README.md
···
  
  (Also [on Tangled!](https://tangled.org/@did:plc:ofrbh253gwicbkc5nktqepol/atproto-lastfm-importer))
  
+ ## Features
+
+ - ✅ **Rate Limiting**: Automatically limits imports to 900 records per day so you don't rate-limit your entire PDS
+ - ✅ **Multi-Day Imports**: Large imports (>900 records) automatically span multiple days with 24-hour pauses
+ - ✅ **Resume Support**: Safe to stop (Ctrl+C) and restart - continues from where it left off
+ - ✅ **Graceful Cancellation**: Press Ctrl+C to stop after the current batch completes
+ - ✅ **Identity Resolution**: Resolves ATProto handles/DIDs using Slingshot
+ - ✅ **PDS Auto-Discovery**: Automatically connects to your personal PDS
+ - ✅ **Dry Run Mode**: Preview records without publishing
+ - ✅ **Batch Processing**: Configurable batching with rate limit safety
+ - ✅ **Progress Tracking**: Real-time progress with time estimates
+ - ✅ **Error Handling**: Continues on errors with detailed reporting
+ - ✅ **MusicBrainz Support**: Preserves MusicBrainz IDs when available
+ - ✅ **Chronological Ordering**: Processes oldest first (or newest with `-r` flag)
+
+ ## Important: Rate Limits
+
+ ⚠️ **CRITICAL**: Bluesky's AppView has rate limits on PDS instances. Exceeding 10K records per day can rate-limit your **ENTIRE PDS**, affecting all users on your instance!
+
+ This importer automatically:
+ - Limits imports to **900 records per day** (90% of the 1K/day safe limit)
+ - Calculates optimal batch sizes and delays
+ - Pauses 24 hours between days for large imports
+ - Shows clear progress and time estimates
+
+ See: [Bluesky Rate Limits Documentation](https://docs.bsky.app/blog/rate-limits-pds-v3)
+
  ## Setup
  
  ```bash
  npm install
+ npm run build
  ```
  
  ## Usage
  
  ### Interactive Mode
+
+ The simplest way to use the importer - just run it and follow the prompts:
  
  ```bash
- node importer.js
+ npm start
  ```
  
- ### With Command Line Arguments
+ ### Command Line Mode
  
- **Full automation:**
+ For automation or scripting, provide all parameters via flags:
  
  ```bash
- node importer.js -f lastfm.csv -i alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
- ```
+ # Full automation
+ npm start -- -f lastfm.csv -i alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
  
- **Dry run (preview without publishing):**
+ # Preview without publishing
+ npm start -- -f lastfm.csv --dry-run
  
- ```bash
- node importer.js -f lastfm.csv --dry-run
+ # Custom batch settings (advanced users)
+ npm start -- -f lastfm.csv -i alice.bsky.social -b 20 -d 3000
+
+ # Process newest tracks first
+ npm start -- -f lastfm.csv -i alice.bsky.social -r -y
  ```
  
- **Custom batch settings:**
+ ## Command Line Options
+
+ | Option | Short | Description | Default |
+ |--------|-------|-------------|---------|
+ | `--help` | `-h` | Show help message | - |
+ | `--file <path>` | `-f` | Path to Last.fm CSV export file | (prompted) |
+ | `--identifier <id>` | `-i` | ATProto handle or DID | (prompted) |
+ | `--password <pass>` | `-p` | ATProto app password | (prompted) |
+ | `--batch-size <num>` | `-b` | Records per batch | Auto-calculated |
+ | `--batch-delay <ms>` | `-d` | Delay between batches in ms | 2000 (min: 1000) |
+ | `--yes` | `-y` | Skip confirmation prompt | false |
+ | `--dry-run` | `-n` | Preview without publishing | false |
+ | `--reverse-chronological` | `-r` | Process newest first | false (oldest first) |
  
- ```bash
- node importer.js -f lastfm.csv -i alice.bsky.social -b 20 -d 3000
- ```
+ ### Batch Settings
+
+ The importer automatically calculates optimal batch settings based on your total record count and rate limits. You generally **don't need** to specify batch settings unless you have specific requirements.
+
+ **Automatic behavior:**
+ - For imports under the daily budget: uses default settings (10 records/batch, 2s delay)
+ - For imports over the daily budget: automatically calculates settings to spread across multiple days
  
- ## Options
+ **Manual override** (advanced):
+ - `--batch-size`: Number of records processed per batch (1-50)
+ - `--batch-delay`: Milliseconds to wait between batches (min: 1000)
  
- - `-h, --help` - Show help message
- - `-f, --file <path>` - Path to Last.fm CSV export file
- - `-i, --identifier <id>` - ATProto handle or DID
- - `-p, --password <pass>` - ATProto app password
- - `-b, --batch-size <num>` - Records per batch (default: 10)
- - `-d, --batch-delay <ms>` - Delay between batches in ms (default: 2000)
- - `-y, --yes` - Skip confirmation prompt
- - `-n, --dry-run` - Preview records without publishing
+ ⚠️ Lower delays increase speed but risk hitting rate limits. The automatic calculation is recommended.
  
  ## Getting Your Last.fm Data
  
  1. Go to <https://lastfm.ghan.nl/export/>
- 2. Request your data export in CSV
+ 2. Request your data export in CSV format
  3. Download the CSV file when ready
  4. Use the CSV file path with this script
  
- ## Features
+ ## What Gets Imported
+
+ Each Last.fm scrobble becomes an `fm.teal.alpha.feed.play` record with:
  
- - ✅ Resolves ATProto handles/DIDs using Slingshot
- - ✅ Connects to your personal PDS
- - ✅ Converts Last.fm scrobbles to `fm.teal.alpha.feed.play` records
- - ✅ Follows the official lexicon schema
- - ✅ Batch publishing with configurable rate limiting
- - ✅ Dry run mode for previewing
- - ✅ Progress tracking and error reporting
- - ✅ Preserves MusicBrainz IDs when available
+ ### Required Fields
+ - **trackName**: The name of the track
+ - **artists**: Array of artist objects (requires `artistName`, optional `artistMbId`)
+ - **playedTime**: ISO 8601 timestamp of when you listened
+ - **submissionClientAgent**: Identifies this importer (`lastfm-importer/v0.0.2`)
+ - **musicServiceBaseDomain**: Always set to `last.fm`
  
- ## Record Format
+ ### Optional Fields (when available)
+ - **releaseName**: Album/release name
+ - **releaseMbId**: MusicBrainz release ID
+ - **recordingMbId**: MusicBrainz recording/track ID
+ - **originUrl**: Link to the track on Last.fm
  
- Each scrobble is converted according to the `fm.teal.alpha.feed.play` lexicon:
+ ### Example Record
  
  ```json
  {
···
  "recordingMbId": "3a390ad3-fe56-45f2-a073-bebc45d6bde1",
  "playedTime": "2025-11-13T23:49:36Z",
  "originUrl": "https://www.last.fm/music/Cjbeards/_/Paint+My+Masterpiece",
- "submissionClientAgent": "lastfm-importer/v1.0.0",
+ "submissionClientAgent": "lastfm-importer/v0.0.2",
  "musicServiceBaseDomain": "last.fm"
  }
  ```
  
- ### Required Fields
+ ## Processing Order
+
+ By default, records are processed **oldest first** (chronological order). This means your earliest scrobbles will appear first in your ATProto feed.
+
+ Use the `--reverse-chronological` or `-r` flag to process **newest first** instead.
+
+ ## Multi-Day Imports
+
+ For imports exceeding the daily budget of 900 records (the 1,000/day limit with a 90% safety margin), the importer automatically:
+
+ 1. **Calculates a schedule**: Splits your import across multiple days
+ 2. **Shows the plan**: Displays which records will be imported each day
+ 3. **Processes Day 1**: Imports the first batch of records
+ 4. **Pauses 24 hours**: Waits a full day before continuing
+ 5. **Repeats**: Continues until all records are imported
+
+ **Important notes:**
+ - You can safely stop (Ctrl+C) and restart the importer
+ - Progress is preserved - it continues where it left off
+ - Each day's progress is clearly displayed
+ - Time estimates account for multi-day duration
+
+ Example output for a 5,000 record import:
+ ```
+ 📊 Rate Limiting Information:
+    Total records: 5,000
+    Daily limit: 900 records/day
+    Estimated duration: 6 days
+    Batch size: 10 records
+    Batch delay: 960.0s
+ ```
+
+ ## Dry Run Mode
+
+ Preview what will be imported without actually publishing:
+
+ ```bash
+ npm start -- -f lastfm.csv --dry-run
+ ```
+
+ Dry run shows:
+ - Total record count
+ - Rate limiting schedule (if applicable)
+ - Multi-day import plan (if needed)
+ - Preview of first 5 records with full details
+ - MusicBrainz IDs when available
+
+ ## Error Handling
+
+ The importer is designed to be resilient:
+
+ - **Network errors**: Records that fail are logged but don't stop the import
+ - **Invalid data**: Skipped with error messages
+ - **Authentication issues**: Clear error messages with suggested fixes
+ - **Rate limit hits**: Automatic adjustment and retry logic
+ - **Ctrl+C handling**: Gracefully stops after current batch
+
+ Failed records are logged but don't prevent the rest of your import from completing.
+
+ ## Project Structure
+
+ ```
+ atproto-lastfm-importer/
+ ├── src/
+ │   ├── lib/
+ │   │   ├── auth.ts          # Authentication & identity resolution
+ │   │   ├── cli.ts           # Command line argument parsing
+ │   │   ├── csv.ts           # CSV parsing & record conversion
+ │   │   └── publisher.ts     # Batch publishing with rate limiting
+ │   ├── utils/
+ │   │   ├── helpers.ts       # Utility functions (timing, formatting)
+ │   │   ├── input.ts         # User input handling (prompts, passwords)
+ │   │   └── rate-limiter.ts  # Rate limiting calculations
+ │   ├── config.ts            # Configuration constants
+ │   └── types.ts             # TypeScript type definitions
+ ├── lexicons/                # fm.teal.alpha lexicon definitions
+ │   └── fm.teal.alpha/
+ │       └── feed/
+ │           └── play.json    # Play record schema
+ ├── package.json
+ ├── tsconfig.json
+ └── README.md
+ ```
  
- - `trackName` - The name of the track
- - `artists` - Array of artist objects with `artistName` (required) and optional `artistMbId`
+ ## Development
  
- ### Optional Fields
+ ```bash
+ # Type checking
+ npm run type-check
  
- - `releaseName` - Album name
- - `releaseMbId` - MusicBrainz release ID
- - `recordingMbId` - MusicBrainz recording ID
- - `playedTime` - ISO 8601 datetime
- - `originUrl` - Link to the track
- - `submissionClientAgent` - Client identifier
- - `musicServiceBaseDomain` - Service domain (e.g., "last.fm")
+ # Build
+ npm run build
+
+ # Development mode (rebuild + run)
+ npm run dev
+
+ # Clean build artifacts
+ npm run clean
+ ```
+
+ ## Technical Details
+
+ ### Authentication
+ - Uses Slingshot resolver to discover your PDS from your handle/DID
+ - Requires an ATProto app password (not your main password)
+ - Automatically configures the agent for your personal PDS
+
+ ### Rate Limiting Algorithm
+ 1. Calculates the safe daily limit (90% of 1K = 900 records/day)
+ 2. Determines how many days are needed for your import
+ 3. Calculates the optimal batch size and delay to spread records evenly
+ 4. Enforces a minimum 1 second delay between batches
+ 5. Shows a clear schedule before starting
+
+ ### Record Processing
+ 1. Parses CSV using the `csv-parse` library
+ 2. Sorts records chronologically (or reverse if `-r` flag)
+ 3. Converts Last.fm format to the `fm.teal.alpha.feed.play` schema
+ 4. Validates required fields
+ 5. Publishes in batches with configurable delays
+
+ ### Data Mapping
+ - **Track info**: Direct mapping from CSV columns
+ - **Timestamps**: Converts Unix timestamps to ISO 8601
+ - **MusicBrainz IDs**: Preserved when present in CSV
+ - **URLs**: Generated from artist/track names
+ - **Artists**: Wrapped in array format with optional MBID
  
  ## Lexicon Reference
  
- This importer follows the lexicon defined in `/lexicons/fm.teal.alpha/feed/play.json`.
+ This importer follows the official `fm.teal.alpha` lexicon defined in `/lexicons/fm.teal.alpha/feed/play.json`.
+
+ The lexicon defines:
+ - Required and optional field types
+ - String length constraints
+ - Array formats
+ - Timestamp formatting
+ - URL validation
+
+ ## Troubleshooting
+
+ ### "Handle not found"
+ - Verify your ATProto handle is correct (e.g., `alice.bsky.social`)
+ - Make sure you're using a valid DID or handle
+
+ ### "Invalid credentials"
+ - Use an **app password**, not your main account password
+ - Generate app passwords in your account settings
+
+ ### "Rate limit exceeded"
+ - The importer should prevent this automatically
+ - If you see this, wait 24 hours before retrying
+ - Consider reducing batch size or increasing delay
+
+ ### "Connection refused"
+ - Check your internet connection
+ - Verify your PDS is accessible
+ - Some PDSs may have firewall rules
+
+ ### Import seems stuck
+ - Check progress messages - large imports take time
+ - Multi-day imports pause for 24 hours between days
+ - You can safely stop (Ctrl+C) and resume later
+
+ ## Contributing
+
+ Contributions welcome! Please:
+ 1. Fork the repository
+ 2. Create a feature branch
+ 3. Make your changes with tests
+ 4. Submit a pull request
+
+ ## License
+
+ MIT License - See LICENSE file for details
+
+ ## Credits
+
+ - Uses [@atproto/api](https://www.npmjs.com/package/@atproto/api) for ATProto interactions
+ - CSV parsing via [csv-parse](https://www.npmjs.com/package/csv-parse)
+ - Identity resolution via [Slingshot](https://slingshot.danner.cloud)
+ - Follows the `fm.teal.alpha` lexicon standard
+
+ ---
+
+ **Note**: This tool is for personal use. Respect Last.fm's terms of service and rate limits when exporting your data.
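**Note on the numbers above.** The example output is consistent with the documented algorithm once the batch delay is derived from the daily budget: 1,000 records/day × 0.9 safety margin = 900 records/day; at 10 records/batch that is 90 batches/day, and spreading them evenly across 24 hours gives 86,400 s ÷ 90 = 960 s between batches; ⌈5,000 ÷ 900⌉ = 6 days.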
-126
STRUCTURE.md
···
- # Last.fm to ATProto Importer - Modular Structure
-
- ## Project Structure
-
- ```plaintext
- lastfm-importer/
- ├── src/
- │   ├── index.js          # Main entry point
- │   ├── config.js         # Configuration constants
- │   ├── lib/              # Core library modules
- │   │   ├── auth.js       # Authentication & login
- │   │   ├── cli.js        # CLI argument parsing & help
- │   │   ├── csv.js        # CSV parsing & conversion
- │   │   └── publisher.js  # Record publishing logic
- │   └── utils/            # Utility functions
- │       ├── helpers.js    # Helper functions (formatting, batch calculation)
- │       ├── input.js      # User input & password masking
- │       └── killswitch.js # Graceful shutdown handling
- ├── importer.js           # Wrapper for backwards compatibility
- └── importer.old.js       # Original monolithic version (backup)
- ```
-
- ## Module Responsibilities
-
- ### `/src/config.js`
-
- - Configuration constants
- - Batch size calculation parameters
- - API endpoints and client information
-
- ### `/src/lib/auth.js`
-
- - ATProto authentication
- - Identity resolution via Slingshot
- - Login error handling
-
- ### `/src/lib/cli.js`
-
- - Command-line argument parsing
- - Help text display
- - Input validation
-
- ### `/src/lib/csv.js`
-
- - CSV file parsing
- - Record conversion to ATProto format
- - Chronological sorting
-
- ### `/src/lib/publisher.js`
-
- - Batch publishing with rate limiting
- - Dry-run preview mode
- - Progress tracking and reporting
- - Killswitch integration
-
- ### `/src/utils/helpers.js`
-
- - Duration formatting
- - Optimal batch size calculation (logarithmic algorithm)
- - Generic utility functions
-
- ### `/src/utils/input.js`
-
- - Interactive prompts
- - Password masking with asterisks
- - Backspace support
-
- ### `/src/utils/killswitch.js`
-
- - SIGINT handler
- - Graceful shutdown state management
- - Force-quit on second Ctrl+C
-
- ## Benefits of Modular Structure
-
- 1. **Maintainability**: Each module has a single responsibility
- 2. **Testability**: Individual modules can be tested in isolation
- 3. **Reusability**: Modules can be imported and reused
- 4. **Readability**: Smaller files are easier to understand
- 5. **Collaboration**: Multiple developers can work on different modules
- 6. **Debugging**: Easier to locate and fix issues
-
- ## Usage
-
- The wrapper file (`importer.js`) maintains backwards compatibility:
-
- ```bash
- # Still works exactly as before
- node importer.js -f lastfm.csv -i handle.bsky.social
-
- # Or use the modular version directly
- node src/index.js -f lastfm.csv -i handle.bsky.social
- ```
-
- ## Algorithm Details
-
- ### Batch Size Calculation
-
- Located in `/src/utils/helpers.js`:
-
- ```javascript
- batchSize = BASE + (log2(records/MIN) * SCALING_FACTOR)
- ```
-
- - **Time Complexity**: O(n) - each record processed once
- - **Space Complexity**: O(b) where b is batch size
- - **Rate Limit Strategy**: Token bucket approach
- - **Adaptive**: Adjusts based on total records and delay settings
-
- ### Processing Order
-
- - Default: Chronological (oldest first)
- - Option: `--reverse-chronological` for newest first
- - Sorted by `playedTime` field
-
- ## Future Improvements
-
- With the modular structure, it's now easier to:
-
- - Add unit tests for each module
- - Implement different authentication methods
- - Support multiple export formats (JSON, XML)
- - Add progress persistence (resume interrupted imports)
- - Implement retry logic with exponential backoff
- - Add statistics and analytics
- - Create a web UI that imports these modules
-5
importer.js
···
- #!/usr/bin/env node
-
- // Wrapper file for backwards compatibility
- // This imports and runs the modular version
- import './src/index.js';
+40 -2
package-lock.json
···
  {
    "name": "lastfm-importer",
-   "version": "1.0.0",
+   "version": "0.0.2",
    "lockfileVersion": 3,
    "requires": true,
    "packages": {
      "": {
        "name": "lastfm-importer",
-       "version": "1.0.0",
+       "version": "0.0.2",
        "license": "MIT",
        "dependencies": {
          "@atproto/api": "^0.13.0",
          "csv-parse": "^5.5.0"
+       },
+       "bin": {
+         "lastfm-import": "dist/index.js"
+       },
+       "devDependencies": {
+         "@types/node": "^20.0.0",
+         "typescript": "^5.3.0"
        }
      },
      "node_modules/@atproto/api": {
···
        "dependencies": {
          "@atproto/lexicon": "^0.4.10",
          "zod": "^3.23.8"
+       }
+     },
+     "node_modules/@types/node": {
+       "version": "20.19.25",
+       "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.25.tgz",
+       "integrity": "sha512-ZsJzA5thDQMSQO788d7IocwwQbI8B5OPzmqNvpf3NY/+MHDAS759Wo0gd2WQeXYt5AAAQjzcrTVC6SKCuYgoCQ==",
+       "dev": true,
+       "license": "MIT",
+       "dependencies": {
+         "undici-types": "~6.21.0"
        }
      },
      "node_modules/await-lock": {
···
        "tlds": "bin.js"
      }
    },
+   "node_modules/typescript": {
+     "version": "5.9.3",
+     "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz",
+     "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
+     "dev": true,
+     "license": "Apache-2.0",
+     "bin": {
+       "tsc": "bin/tsc",
+       "tsserver": "bin/tsserver"
+     },
+     "engines": {
+       "node": ">=14.17"
+     }
+   },
    "node_modules/uint8arrays": {
      "version": "3.0.0",
      "resolved": "https://registry.npmjs.org/uint8arrays/-/uint8arrays-3.0.0.tgz",
···
      "dependencies": {
        "multiformats": "^9.4.2"
      }
+   },
+   "node_modules/undici-types": {
+     "version": "6.21.0",
+     "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
+     "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==",
+     "dev": true,
+     "license": "MIT"
    },
    "node_modules/zod": {
      "version": "3.25.76",
+19 -6
package.json
···
  {
    "name": "lastfm-importer",
-   "version": "1.0.0",
-   "description": "Import Last.fm scrobbles to ATProto",
+   "version": "0.0.2",
+   "description": "Import Last.fm scrobbles to ATProto with rate limiting",
    "type": "module",
-   "main": "importer.js",
+   "main": "./dist/index.js",
+   "types": "./dist/index.d.ts",
+   "bin": {
+     "lastfm-import": "./dist/index.js"
+   },
    "scripts": {
-     "start": "node importer.js",
-     "dry-run": "node importer.js --dry-run"
+     "build": "tsc",
+     "start": "npm run build && node dist/index.js",
+     "dev": "tsc && node dist/index.js",
+     "dry-run": "npm run build && node dist/index.js --dry-run",
+     "clean": "rm -rf dist",
+     "type-check": "tsc --noEmit"
    },
    "keywords": [
      "lastfm",
      "atproto",
      "bluesky",
-     "import"
+     "import",
+     "typescript"
    ],
    "author": "",
    "license": "MIT",
    "dependencies": {
      "@atproto/api": "^0.13.0",
      "csv-parse": "^5.5.0"
+   },
+   "devDependencies": {
+     "@types/node": "^20.0.0",
+     "typescript": "^5.3.0"
    }
  }
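With the new `bin` entry, the compiled CLI can also be put on your PATH. A typical flow for a local checkout (a sketch, assuming you build first, since only `dist/` is executable):

```bash
npm install
npm run build          # compiles src/*.ts to dist/ via tsc
node dist/index.js --help
# or expose the bin name globally:
npm link
lastfm-import -f lastfm.csv --dry-run
```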
-16
src/config.js
···
- /**
-  * Configuration constants for the Last.fm importer
-  */
-
- export const DEFAULT_BATCH_SIZE = 10;
- export const DEFAULT_BATCH_DELAY = 1500;
- export const MIN_BATCH_DELAY = 100;
- export const RECORD_TYPE = 'fm.teal.alpha.feed.play';
- export const SLINGSHOT_RESOLVER = 'https://slingshot.microcosm.blue/xrpc/com.bad-example.identity.resolveMiniDoc';
- export const CLIENT_AGENT = 'lastfm-importer/v0.0.1';
-
- // Batch size calculation constants
- export const MIN_RECORDS_FOR_SCALING = 100;
- export const BASE_BATCH_SIZE = 5;
- export const MAX_BATCH_SIZE = 50;
- export const SCALING_FACTOR = 1.5;
+49
src/config.ts
···
+ import type { Config } from './types.js';
+
+ // ⚠️ IMPORTANT: Rate Limit Warning
+ // Bluesky's AppView has rate limits on PDS instances:
+ // - Exceeding 10K records per day can rate limit your ENTIRE PDS
+ // - This affects all users on your PDS, not just your account
+ // - See: https://docs.bsky.app/blog/rate-limits-pds-v3
+ //
+ // Default limit: 1K records per day (automatically batched with pauses)
+ export const RECORDS_PER_DAY_LIMIT = 1000;
+
+ // Safety margin factor (0.9 = use 90% of limit to be safe)
+ export const SAFETY_MARGIN = 0.9;
+
+ // Record type
+ export const RECORD_TYPE = 'fm.teal.alpha.feed.play';
+
+ // Client agent
+ export const CLIENT_AGENT = 'lastfm-importer/v0.0.2';
+
+ // Default batch configuration (will be adjusted for rate limiting)
+ export const DEFAULT_BATCH_SIZE = 10;
+ export const DEFAULT_BATCH_DELAY = 2000; // 2 seconds
+
+ // Minimum safe delay between batches (1 second)
+ export const MIN_BATCH_DELAY = 1000;
+
+ // Maximum batch size
+ export const MAX_BATCH_SIZE = 50;
+
+ // Slingshot resolver URL
+ export const SLINGSHOT_RESOLVER = 'https://slingshot.danner.cloud';
+
+ const config: Config = {
+   RECORD_TYPE,
+   MIN_RECORDS_FOR_SCALING: 20,
+   BASE_BATCH_SIZE: 10,
+   SCALING_FACTOR: 1.5,
+   CLIENT_AGENT,
+   DEFAULT_BATCH_SIZE,
+   DEFAULT_BATCH_DELAY,
+   MIN_BATCH_DELAY,
+   MAX_BATCH_SIZE,
+   SLINGSHOT_RESOLVER,
+   RECORDS_PER_DAY_LIMIT,
+   SAFETY_MARGIN,
+ };
+
+ export default config;
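Taken together, these constants imply a daily budget of 900 records. A minimal sketch of the derivation behind the even-spread schedule the README describes (the real helper lives in `src/utils/rate-limiter.ts`, which is not part of this diff):

```ts
import { RECORDS_PER_DAY_LIMIT, SAFETY_MARGIN, DEFAULT_BATCH_SIZE, MIN_BATCH_DELAY } from './config.js';

// 1000 * 0.9 = 900 records actually scheduled per day
const dailyBudget = Math.floor(RECORDS_PER_DAY_LIMIT * SAFETY_MARGIN);

// Spread one day's batches evenly across 24 hours:
// 900 / 10 = 90 batches/day -> 86,400,000 ms / 90 = 960,000 ms (960 s)
const batchesPerDay = Math.ceil(dailyBudget / DEFAULT_BATCH_SIZE);
const batchDelayMs = Math.max(MIN_BATCH_DELAY, Math.floor(86_400_000 / batchesPerDay));
```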
-155
src/index.js
···
- #!/usr/bin/env node
-
- import * as fs from 'fs';
- import * as config from './config.js';
- import { parseCommandLineArgs, showHelp } from './lib/cli.js';
- import { login } from './lib/auth.js';
- import { parseLastFmCsv, convertToPlayRecord, sortRecords } from './lib/csv.js';
- import { publishRecords } from './lib/publisher.js';
- import { prompt } from './utils/input.js';
- import { formatDuration, calculateOptimalBatchSize } from './utils/helpers.js';
- import { setupKillswitch } from './utils/killswitch.js';
-
- /**
-  * Main execution
-  */
- async function main() {
-   const args = parseCommandLineArgs();
-
-   // Show help if requested
-   if (args.help) {
-     showHelp();
-     process.exit(0);
-   }
-
-   // Setup killswitch (unless in dry-run mode)
-   if (!args['dry-run']) {
-     setupKillswitch();
-   }
-
-   try {
-     console.log('=== Last.fm to ATProto Importer ===\n');
-
-     // Get CSV file path
-     let csvPath = args.file;
-     if (!csvPath) {
-       csvPath = await prompt('Enter path to Last.fm CSV export: ');
-     } else {
-       console.log(`CSV file: ${csvPath}`);
-     }
-
-     if (!fs.existsSync(csvPath)) {
-       console.error('✗ File not found!');
-       process.exit(1);
-     }
-
-     // Parse CSV
-     const csvRecords = parseLastFmCsv(csvPath);
-
-     if (csvRecords.length === 0) {
-       console.error('✗ No records found in CSV file!');
-       process.exit(1);
-     }
-
-     // Convert records
-     console.log('Converting records to ATProto format...');
-     const playRecords = csvRecords.map(record => convertToPlayRecord(record, config));
-     console.log('✓ Conversion complete\n');
-
-     // Sort records chronologically
-     const reverseChronological = args['reverse-chronological'];
-     sortRecords(playRecords, reverseChronological);
-
-     // Validate and set batch delay
-     let batchDelay = args['batch-delay'] ? parseInt(args['batch-delay']) : config.DEFAULT_BATCH_DELAY;
-     if (batchDelay < config.MIN_BATCH_DELAY) {
-       console.log(`⚠️  Batch delay ${batchDelay}ms is below minimum safe limit.`);
-       console.log(`   Enforcing minimum delay of ${config.MIN_BATCH_DELAY}ms to respect rate limits.\n`);
-       batchDelay = config.MIN_BATCH_DELAY;
-     }
-
-     // Calculate optimal batch size
-     let batchSize = args['batch-size'] ? parseInt(args['batch-size']) : null;
-     if (!batchSize) {
-       batchSize = calculateOptimalBatchSize(playRecords.length, batchDelay, config);
-       console.log(`Auto-calculated batch size: ${batchSize}`);
-       console.log(`  Algorithm: Logarithmic scaling with O(n) time complexity`);
-       console.log(`  Optimized for: ${playRecords.length} records at ${batchDelay}ms delay`);
-       console.log(`  Rate limit strategy: Token bucket with conservative limits\n`);
-     } else {
-       console.log(`Using specified batch size: ${batchSize}\n`);
-     }
-
-     // Check if dry run mode
-     const isDryRun = args['dry-run'];
-
-     if (isDryRun) {
-       console.log('🔍 Running in DRY RUN mode - no authentication required\n');
-
-       // Show preview without publishing
-       await publishRecords(null, playRecords, batchSize, batchDelay, config, true);
-       process.exit(0);
-     }
-
-     // Login to ATProto (only if not dry run)
-     const agent = await login(args.identifier, args.password, config.SLINGSHOT_RESOLVER);
-
-     // Confirm before publishing (unless --yes flag is set)
-     if (!args.yes) {
-       const confirm = await prompt(`\nReady to publish ${playRecords.length} records. Continue? (yes/no): `);
-       if (confirm.toLowerCase() !== 'yes' && confirm.toLowerCase() !== 'y') {
-         console.log('Aborted.');
-         process.exit(0);
-       }
-       console.log('');
-     } else {
-       console.log(`Auto-confirmed: Publishing ${playRecords.length} records...\n`);
-     }
-
-     // Publish records
-     const startTime = Date.now();
-     const { successCount, errorCount, cancelled } = await publishRecords(
-       agent,
-       playRecords,
-       batchSize,
-       batchDelay,
-       config,
-       false
-     );
-     const totalTime = formatDuration(Date.now() - startTime);
-
-     // Summary
-     console.log('=== Import Complete ===');
-     if (cancelled) {
-       console.log('Status: CANCELLED BY USER');
-     } else {
-       console.log('Status: COMPLETED');
-     }
-     console.log(`Total records: ${playRecords.length}`);
-     console.log(`Successfully published: ${successCount}`);
-     console.log(`Failed: ${errorCount}`);
-     if (cancelled) {
-       console.log(`Not processed: ${playRecords.length - successCount - errorCount}`);
-     }
-     console.log(`Total time: ${totalTime}`);
-
-     if (successCount > 0) {
-       const avgTime = (Date.now() - startTime) / successCount;
-       console.log(`Average time per record: ${avgTime.toFixed(0)}ms`);
-     }
-
-     console.log('\n✓ Logged out');
-
-     // Exit with appropriate code
-     process.exit(cancelled ? 130 : 0);
-
-   } catch (error) {
-     console.error('\n✗ Fatal error:', error.message);
-     if (error.stack && process.env.DEBUG) {
-       console.error('\nStack trace:', error.stack);
-     }
-     process.exit(1);
-   }
- }
-
- main();
+5
src/index.ts
···
+ #!/usr/bin/env node
+
+ import { runCLI } from './lib/cli.js';
+
+ runCLI();
+19 -9
src/lib/auth.js src/lib/auth.ts
···
  import { AtpAgent } from '@atproto/api';
  import { prompt } from '../utils/input.js';
  
+ interface ResolverResponse {
+   did: string;
+   pds: string;
+ }
+
  /**
   * Resolves an AT Protocol identifier (handle or DID) to get PDS information
   */
- async function resolveIdentifier(identifier, resolverUrl) {
+ async function resolveIdentifier(identifier: string, resolverUrl: string): Promise<ResolverResponse> {
    console.log(`Resolving identifier: ${identifier}`);
  
    const response = await fetch(
···
      throw new Error(`Failed to resolve identifier: ${response.status} ${response.statusText}`);
    }
  
-   const data = await response.json();
+   const data = await response.json() as ResolverResponse;
  
    if (!data.did || !data.pds) {
      throw new Error('Invalid response from identity resolver');
···
  /**
   * Login to ATProto using Slingshot resolver
   */
- export async function login(identifier, password, resolverUrl) {
+ export async function login(
+   identifier: string | undefined,
+   password: string | undefined,
+   resolverUrl: string
+ ): Promise<AtpAgent> {
    console.log('\n=== ATProto Login ===');
  
    // Prompt for missing credentials
···
    });
  
    console.log('✓ Logged in successfully!');
-   console.log(`  DID: ${pdsAgent.session.did}`);
-   console.log(`  Handle: ${pdsAgent.session.handle}\n`);
+   console.log(`  DID: ${pdsAgent.session?.did}`);
+   console.log(`  Handle: ${pdsAgent.session?.handle}\n`);
  
    return pdsAgent;
  } catch (error) {
-   console.error('✗ Login failed:', error.message);
+   const err = error as Error;
+   console.error('✗ Login failed:', err.message);
  
    // Provide more specific error messages
-   if (error.message.includes('Failed to resolve identifier')) {
+   if (err.message.includes('Failed to resolve identifier')) {
      throw new Error('Handle not found. Please check your AT Protocol handle.');
-   } else if (error.message.includes('AuthFactorTokenRequired')) {
+   } else if (err.message.includes('AuthFactorTokenRequired')) {
      throw new Error('Two-factor authentication required. Please use your app password.');
-   } else if (error.message.includes('InvalidCredentials')) {
+   } else if (err.message.includes('InvalidCredentials')) {
      throw new Error('Invalid credentials. Please check your handle and app password.');
    }
  
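For reference, the shape the code expects back from the resolver, per the `ResolverResponse` interface above (the deleted `config.js` earlier in this diff shows the concrete `resolveMiniDoc` endpoint it previously pointed at). Field values here are illustrative, with the DID taken from this repo's Tangled URL and a hypothetical PDS host:

```json
{
  "did": "did:plc:ofrbh253gwicbkc5nktqepol",
  "pds": "https://pds.example.com"
}
```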
-93
src/lib/cli.js
···
- import { parseArgs } from 'node:util';
-
- /**
-  * Parse command line arguments
-  */
- export function parseCommandLineArgs() {
-   const options = {
-     help: {
-       type: 'boolean',
-       short: 'h',
-       default: false,
-     },
-     file: {
-       type: 'string',
-       short: 'f',
-     },
-     identifier: {
-       type: 'string',
-       short: 'i',
-     },
-     password: {
-       type: 'string',
-       short: 'p',
-     },
-     'batch-size': {
-       type: 'string',
-       short: 'b',
-     },
-     'batch-delay': {
-       type: 'string',
-       short: 'd',
-     },
-     yes: {
-       type: 'boolean',
-       short: 'y',
-       default: false,
-     },
-     'dry-run': {
-       type: 'boolean',
-       short: 'n',
-       default: false,
-     },
-     'reverse-chronological': {
-       type: 'boolean',
-       short: 'r',
-       default: false,
-     },
-   };
-
-   try {
-     const { values } = parseArgs({ options, allowPositionals: false });
-     return values;
-   } catch (error) {
-     console.error('Error parsing arguments:', error.message);
-     showHelp();
-     process.exit(1);
-   }
- }
-
- /**
-  * Show help message
-  */
- export function showHelp() {
-   console.log(`
- Last.fm to ATProto Importer
-
- Usage: node importer.js [options]
-
- Options:
-   -h, --help                   Show this help message
-   -f, --file <path>            Path to Last.fm CSV export file
-   -i, --identifier <id>        ATProto handle or DID
-   -p, --password <pass>        ATProto app password
-   -b, --batch-size <num>       Number of records per batch (auto-calculated if not set)
-   -d, --batch-delay <ms>       Delay between batches in ms (default: 2000, min: 1000)
-   -y, --yes                    Skip confirmation prompt
-   -n, --dry-run                Preview records without publishing
-   -r, --reverse-chronological  Process newest first (default: oldest first)
-
- Examples:
-   node importer.js -f lastfm.csv -i alice.bsky.social -p xxxx-xxxx-xxxx-xxxx
-   node importer.js --file export.csv --identifier alice.bsky.social --yes
-   node importer.js -f lastfm.csv --dry-run
-   node importer.js                   (interactive mode - prompts for all values)
-
- Notes:
-   - Batch size uses logarithmic scaling algorithm (O(n) complexity) for optimal throughput
-   - Auto-calculated batch size considers both record count and delay settings
-   - Records are processed in chronological order (oldest first) by default
-   - Minimum batch delay of 1000ms enforced to respect rate limits
-   - Rate limiting follows token bucket strategy for safe API usage
- `);
- }
+173
src/lib/cli.ts
···
+ import { parseArgs } from 'node:util';
+ import { AtpAgent } from '@atproto/api';
+ import type { PlayRecord, Config, CommandLineArgs, PublishResult } from '../types.js';
+ import { login } from './auth.js';
+ import { parseLastFmCsv, convertToPlayRecord, sortRecords } from '../lib/csv.js';
+ import { publishRecords } from './publisher.js';
+ import { prompt } from '../utils/input.js';
+ import config from '../config.js';
+ import { calculateOptimalBatchSize, showRateLimitInfo } from '../utils/helpers.js';
+
+ /**
+  * Show help message
+  */
+ export function showHelp(): void {
+   console.log(`
+ Last.fm to ATProto Importer v0.0.2
+
+ Usage: npm start [options]
+
+ Options:
+   -h, --help                   Show this help message
+   -f, --file <path>            Path to Last.fm CSV export file
+   -i, --identifier <id>        ATProto handle or DID
+   -p, --password <pass>        ATProto app password
+   -b, --batch-size <num>       Number of records per batch (auto-calculated if not set)
+   -d, --batch-delay <ms>       Delay between batches in ms (default: 2000, min: 1000)
+   -y, --yes                    Skip confirmation prompt
+   -n, --dry-run                Preview records without publishing
+   -r, --reverse-chronological  Process newest first (default: oldest first)
+ `);
+ }
+
+ /**
+  * Parse command line arguments
+  */
+ export function parseCommandLineArgs(): CommandLineArgs {
+   // The options definition mirrors the CommandLineArgs keys
+   const options = {
+     help: { type: 'boolean', short: 'h', default: false },
+     file: { type: 'string', short: 'f' },
+     identifier: { type: 'string', short: 'i' },
+     password: { type: 'string', short: 'p' },
+     'batch-size': { type: 'string', short: 'b' },
+     'batch-delay': { type: 'string', short: 'd' },
+     yes: { type: 'boolean', short: 'y', default: false },
+     'dry-run': { type: 'boolean', short: 'n', default: false },
+     'reverse-chronological': { type: 'boolean', short: 'r', default: false },
+   } as const;
+
+   try {
+     const { values } = parseArgs({ options, allowPositionals: false });
+     return values as CommandLineArgs;
+   } catch (error) {
+     const err = error as Error;
+     console.error('Error parsing arguments:', err.message);
+     showHelp();
+     process.exit(1);
+   }
+ }
+
+ /**
+  * Main CLI entry point
+  */
+ export async function runCLI(): Promise<void> {
+   try {
+     const args = parseCommandLineArgs();
+     const cfg: Config = config;
+
+     if (args.help) {
+       showHelp();
+       return;
+     }
+
+     if (!args.file) {
+       throw new Error('Missing required argument: -f, --file <path>');
+     }
+
+     const dryRun = args['dry-run'] ?? false;
+     let agent: AtpAgent | null = null;
+
+     // 1. Authenticate (skipped entirely in dry-run mode)
+     if (!dryRun) {
+       if (!args.identifier || !args.password) {
+         throw new Error('Missing required arguments for login: -i (identifier) and -p (password)');
+       }
+       // login resolves the PDS via Slingshot and returns an authenticated AtpAgent
+       agent = await login(args.identifier, args.password, cfg.SLINGSHOT_RESOLVER);
+     }
+
+     // 2. Read the CSV export and prepare records
+     const csvRecords = parseLastFmCsv(args.file);
+
+     // Map the raw CSV rows to the standardized PlayRecord structure
+     const records: PlayRecord[] = csvRecords.map(record => convertToPlayRecord(record, cfg));
+     const totalRecords = records.length;
+
+     const reverseChronological = args['reverse-chronological'] ?? false;
+     const sortedRecords = sortRecords(records, reverseChronological);
+
+     // 3. Determine batching parameters
+     let batchDelay = cfg.DEFAULT_BATCH_DELAY;
+     if (args['batch-delay']) {
+       const delay = parseInt(args['batch-delay'], 10);
+       if (isNaN(delay)) {
+         throw new Error(`Invalid batch delay value: ${args['batch-delay']}`);
+       }
+       // Enforce minimum delay
+       batchDelay = Math.max(delay, cfg.MIN_BATCH_DELAY);
+     }
+
+     let batchSize: number;
+     if (args['batch-size']) {
+       batchSize = parseInt(args['batch-size'], 10);
+       if (isNaN(batchSize) || batchSize <= 0) {
+         throw new Error(`Invalid batch size value: ${args['batch-size']}`);
+       }
+     } else {
+       // Calculate optimal batch size if not provided
+       batchSize = calculateOptimalBatchSize(totalRecords, batchDelay, cfg);
+     }
+
+     // 4. Show rate limiting information up front
+     const recordsPerDay = cfg.RECORDS_PER_DAY_LIMIT * cfg.SAFETY_MARGIN;
+     const estimatedDays = Math.ceil(totalRecords / recordsPerDay);
+
+     showRateLimitInfo(
+       totalRecords,
+       batchSize,
+       batchDelay,
+       estimatedDays,
+       cfg.RECORDS_PER_DAY_LIMIT,
+     );
+
+     // 5. Confirmation prompt
+     if (!dryRun && !(args.yes ?? false)) {
+       console.log(`\nReady to publish ${totalRecords.toLocaleString()} records.`);
+       const answer = await prompt('Do you want to continue? (y/N) ');
+       if (answer.toLowerCase() !== 'y') {
+         console.log('Import cancelled by user.');
+         process.exit(0);
+       }
+     }
+
+     // 6. Publish records
+     const result: PublishResult = await publishRecords(
+       agent,
+       sortedRecords,
+       batchSize,
+       batchDelay,
+       cfg,
+       dryRun
+     );
+
+     // 7. Final output
+     if (result.cancelled) {
+       console.log(`\nImport stopped gracefully. ${result.successCount} records processed.`);
+     } else if (dryRun) {
+       console.log('\nDRY RUN COMPLETE. No records were published.');
+     } else {
+       console.log(`\n🎉 Import Complete!`);
+       console.log(`Total records processed: ${result.successCount.toLocaleString()} (${result.errorCount.toLocaleString()} failed)`);
+     }
+
+   } catch (error) {
+     // Handle fatal errors
+     const err = error as Error;
+     console.error('\n🛑 A fatal error occurred:');
+     console.error(err.message);
+     process.exit(1);
+   }
+ }
+9 -7
src/lib/csv.js src/lib/csv.ts
···
  import * as fs from 'fs';
  import { parse } from 'csv-parse/sync';
+ import type { LastFmCsvRecord, PlayRecord, Config } from '../types.js';
  
  /**
   * Parse Last.fm CSV export
   */
- export function parseLastFmCsv(filePath) {
+ export function parseLastFmCsv(filePath: string): LastFmCsvRecord[] {
    console.log(`Reading CSV file: ${filePath}`);
    const fileContent = fs.readFileSync(filePath, 'utf-8');
  
···
      columns: true,
      skip_empty_lines: true,
      trim: true,
-   });
+   }) as LastFmCsvRecord[];
  
    console.log(`✓ Parsed ${records.length} scrobbles\n`);
    return records;
···
  /**
   * Convert Last.fm CSV record to ATProto play record
   */
- export function convertToPlayRecord(csvRecord, config) {
+ export function convertToPlayRecord(csvRecord: LastFmCsvRecord, config: Config): PlayRecord {
    const { RECORD_TYPE, CLIENT_AGENT } = config;
  
    // Parse the timestamp
···
    const playedTime = new Date(timestamp * 1000).toISOString();
  
    // Build artists array
-   const artists = [];
+   const artists: PlayRecord['artists'] = [];
    if (csvRecord.artist) {
-     const artistData = {
+     const artistData: PlayRecord['artists'][0] = {
        artistName: csvRecord.artist,
      };
      if (csvRecord.artist_mbid && csvRecord.artist_mbid.trim()) {
···
    }
  
    // Build the play record
-   const playRecord = {
+   const playRecord: PlayRecord = {
      $type: RECORD_TYPE,
      trackName: csvRecord.track,
      artists,
      playedTime,
      submissionClientAgent: CLIENT_AGENT,
      musicServiceBaseDomain: 'last.fm',
+     originUrl: '',
    };
  
    // Add optional fields
···
  /**
   * Sort records chronologically
   */
- export function sortRecords(records, reverseChronological = false) {
+ export function sortRecords(records: PlayRecord[], reverseChronological = false): PlayRecord[] {
    console.log(`Sorting records ${reverseChronological ? 'newest' : 'oldest'} first...`);
  
    records.sort((a, b) => {
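A concrete trace of `convertToPlayRecord`, as a sketch: the row shape follows `LastFmCsvRecord` from `src/types.ts`, the values reproduce the README's example record, and it assumes the collapsed timestamp-parsing hunk reads the `uts` column:

```ts
// Input row from the lastfm.ghan.nl CSV export (values illustrative):
const row = {
  uts: '1763077776',   // Unix seconds; * 1000 -> 2025-11-13T23:49:36.000Z
  artist: 'Cjbeards',
  track: 'Paint My Masterpiece',
  album: 'Paint My Masterpiece',
  artist_mbid: '',
  album_mbid: '',
  track_mbid: '3a390ad3-fe56-45f2-a073-bebc45d6bde1',
};

// Expected output, per the README's field mapping (originUrl is filled in
// by the optional-fields section collapsed out of this diff):
// {
//   $type: 'fm.teal.alpha.feed.play',
//   trackName: 'Paint My Masterpiece',
//   artists: [{ artistName: 'Cjbeards' }],   // empty artist_mbid is skipped
//   playedTime: '2025-11-13T23:49:36.000Z',
//   submissionClientAgent: 'lastfm-importer/v0.0.2',
//   musicServiceBaseDomain: 'last.fm',
//   recordingMbId: '3a390ad3-fe56-45f2-a073-bebc45d6bde1',
//   originUrl: 'https://www.last.fm/music/Cjbeards/_/Paint+My+Masterpiece',
// }
```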
-137
src/lib/publisher.js
···
- import { formatDuration } from '../utils/helpers.js';
- import { isImportCancelled } from '../utils/killswitch.js';
-
- /**
-  * Publish records in batches with rate limiting and killswitch support
-  */
- export async function publishRecords(agent, records, batchSize, batchDelay, config, dryRun = false) {
-   const { RECORD_TYPE } = config;
-   const totalRecords = records.length;
-   let successCount = 0;
-   let errorCount = 0;
-   const startTime = Date.now();
-
-   if (dryRun) {
-     return handleDryRun(records, batchSize, batchDelay);
-   }
-
-   const totalBatches = Math.ceil(totalRecords / batchSize);
-   const estimatedTime = formatDuration(totalBatches * batchDelay);
-
-   console.log(`Publishing ${totalRecords} records in batches of ${batchSize}...`);
-   console.log(`Total batches: ${totalBatches}`);
-   console.log(`Estimated time: ${estimatedTime}`);
-   console.log(`\n🚨 Press Ctrl+C to stop gracefully after current batch\n`);
-
-   for (let i = 0; i < totalRecords; i += batchSize) {
-     // Check killswitch before processing batch
-     if (isImportCancelled()) {
-       return handleCancellation(successCount, errorCount, totalRecords);
-     }
-
-     const batch = records.slice(i, i + batchSize);
-     const batchNum = Math.floor(i / batchSize) + 1;
-     const progress = ((i / totalRecords) * 100).toFixed(1);
-
-     console.log(`[${progress}%] Batch ${batchNum}/${totalBatches} (records ${i + 1}-${Math.min(i + batchSize, totalRecords)})`);
-
-     // Process batch records
-     const batchStartTime = Date.now();
-     for (const record of batch) {
-       // Check killswitch during batch processing
-       if (isImportCancelled()) {
-         console.log(`  ⚠️  Stopping mid-batch...`);
-         break;
-       }
-
-       try {
-         await agent.com.atproto.repo.createRecord({
-           repo: agent.session.did,
-           collection: RECORD_TYPE,
-           record,
-         });
-         successCount++;
-       } catch (error) {
-         errorCount++;
-         console.error(`  ✗ Failed: ${record.trackName} - ${error.message}`);
-       }
-     }
-
-     const batchDuration = Date.now() - batchStartTime;
-     const elapsed = formatDuration(Date.now() - startTime);
-     const remaining = formatDuration(((totalRecords - i - batchSize) / batchSize) * batchDelay);
-
-     console.log(`  ✓ Complete in ${batchDuration}ms (${successCount} successful, ${errorCount} failed)`);
-
-     // Only show time estimates if not cancelled
-     if (!isImportCancelled()) {
-       console.log(`  ⏱  Elapsed: ${elapsed} | Remaining: ~${remaining}\n`);
-     }
-
-     // Check again before waiting
-     if (isImportCancelled()) {
-       return handleCancellation(successCount, errorCount, totalRecords);
-     }
-
-     // Wait before next batch (except for last batch)
-     if (i + batchSize < totalRecords) {
-       await new Promise(resolve => setTimeout(resolve, batchDelay));
-     }
-   }
-
-   return { successCount, errorCount, cancelled: false };
- }
-
- /**
-  * Handle dry run mode
-  */
- function handleDryRun(records, batchSize, batchDelay) {
-   const totalRecords = records.length;
-
-   console.log(`\n=== DRY RUN MODE ===`);
-   console.log(`Would publish ${totalRecords} records in batches of ${batchSize}`);
-   console.log(`Estimated time: ${formatDuration(Math.ceil(totalRecords / batchSize) * batchDelay)}\n`);
-
-   // Show first 5 records as preview
-   const previewCount = Math.min(5, totalRecords);
-   console.log(`Preview of first ${previewCount} records (in processing order):\n`);
-
-   for (let i = 0; i < previewCount; i++) {
-     const record = records[i];
-     console.log(`${i + 1}. ${record.artists[0]?.artistName} - ${record.trackName}`);
-     console.log(`   Album: ${record.releaseName || 'N/A'}`);
-     console.log(`   Played: ${record.playedTime}`);
-     console.log(`   URL: ${record.originUrl}`);
-
-     // Show MusicBrainz IDs if available
-     const mbids = [];
-     if (record.artists[0]?.artistMbId) mbids.push(`Artist: ${record.artists[0].artistMbId}`);
-     if (record.recordingMbId) mbids.push(`Recording: ${record.recordingMbId}`);
-     if (record.releaseMbId) mbids.push(`Release: ${record.releaseMbId}`);
-
-     if (mbids.length > 0) {
-       console.log(`   MBIDs: ${mbids.join(', ')}`);
-     }
-     console.log('');
-   }
-
-   if (totalRecords > previewCount) {
-     console.log(`... and ${totalRecords - previewCount} more records\n`);
-   }
-
-   console.log('=== DRY RUN COMPLETE ===');
-   console.log('No records were actually published.');
-   console.log('Remove --dry-run flag to publish for real.\n');
-
-   return { successCount: totalRecords, errorCount: 0, cancelled: false };
- }
-
- /**
-  * Handle cancellation
-  */
- function handleCancellation(successCount, errorCount, totalRecords) {
-   console.log(`\n🛑 Import cancelled by user`);
-   console.log(`   Processed: ${successCount}/${totalRecords} records`);
-   console.log(`   Remaining: ${totalRecords - successCount} records\n`);
-   return { successCount, errorCount, cancelled: true };
- }
+326
src/lib/publisher.ts
···
+ import type { AtpAgent } from '@atproto/api';
+ import { formatDuration } from '../utils/helpers.js';
+ import { isImportCancelled } from '../utils/killswitch.js';
+ import {
+   calculateDailySchedule,
+   displayRateLimitWarning,
+   displayRateLimitInfo,
+   calculateRateLimitedBatches,
+ } from '../utils/rate-limiter.js';
+ import type { PlayRecord, Config, PublishResult } from '../types.js';
+
+ /**
+  * Publish records in batches with rate limiting and multi-day support
+  */
+ export async function publishRecords(
+   agent: AtpAgent | null,
+   records: PlayRecord[],
+   batchSize: number,
+   batchDelay: number,
+   config: Config,
+   dryRun = false
+ ): Promise<PublishResult> {
+   const { RECORD_TYPE } = config;
+   const totalRecords = records.length;
+
+   if (dryRun) {
+     return handleDryRun(records, batchSize, batchDelay, config);
+   }
+
+   if (!agent) {
+     throw new Error('Agent is required for publishing');
+   }
+
+   // Calculate rate-limited batch parameters
+   const rateLimitParams = calculateRateLimitedBatches(totalRecords, config);
+
+   // Override with calculated parameters if rate limiting is needed
+   if (rateLimitParams.needsRateLimiting) {
+     displayRateLimitWarning();
+     batchSize = rateLimitParams.batchSize;
+     batchDelay = rateLimitParams.batchDelay;
+   }
+
+   displayRateLimitInfo(
+     totalRecords,
+     batchSize,
+     batchDelay,
+     rateLimitParams.estimatedDays,
+     rateLimitParams.recordsPerDay
+   );
+
+   // Calculate daily schedule if multi-day import
+   const dailySchedule =
+     rateLimitParams.estimatedDays > 1
+       ? calculateDailySchedule(
+           totalRecords,
+           batchSize,
+           batchDelay,
+           rateLimitParams.recordsPerDay
+         )
+       : null;
+
+   let successCount = 0;
+   let errorCount = 0;
+   const startTime = Date.now();
+
+   const totalBatches = Math.ceil(totalRecords / batchSize);
+   const estimatedTime = formatDuration(totalBatches * batchDelay);
+
+   console.log(`Publishing ${totalRecords} records in batches of ${batchSize}...`);
+   console.log(`Total batches: ${totalBatches}`);
+   if (!dailySchedule) {
+     console.log(`Estimated time: ${estimatedTime}`);
+   }
+   console.log(`\n🚨 Press Ctrl+C to stop gracefully after current batch\n`);
+
+   // If multi-day, process day by day
+   if (dailySchedule) {
+     for (const day of dailySchedule) {
+       console.log(`\n╔═══════════════════════════════════════════════════════════════╗`);
+       console.log(`║  DAY ${day.day} of ${rateLimitParams.estimatedDays}`);
+       console.log(`║  Records: ${day.recordsStart + 1}-${day.recordsEnd} (${day.recordsCount} total)`);
+       console.log(`╚═══════════════════════════════════════════════════════════════╝\n`);
+
+       const dayRecords = records.slice(day.recordsStart, day.recordsEnd);
+       const result = await processDayBatch(
+         agent,
+         dayRecords,
+         batchSize,
+         batchDelay,
+         RECORD_TYPE,
+         day.recordsStart,
+         totalRecords,
+         startTime
+       );
+
+       successCount += result.successCount;
+       errorCount += result.errorCount;
+
+       if (result.cancelled) {
+         return { successCount, errorCount, cancelled: true };
+       }
+
+       // Pause between days
+       if (day.pauseAfter) {
+         console.log(`\n⏸️  Pausing for 24 hours before continuing...`);
+         console.log(`   Next batch will start at: ${new Date(Date.now() + day.pauseDuration).toLocaleString()}`);
+         console.log(`   Progress: ${successCount}/${totalRecords} records completed\n`);
+         console.log(`   💡 You can safely stop (Ctrl+C) and restart later.\n`);
+
+         await new Promise((resolve) => setTimeout(resolve, day.pauseDuration));
+       }
+     }
+   } else {
+     // Single day import - process normally
+     const result = await processDayBatch(
+       agent,
+       records,
+       batchSize,
+       batchDelay,
+       RECORD_TYPE,
+       0,
+       totalRecords,
+       startTime
+     );
+
+     successCount = result.successCount;
+     errorCount = result.errorCount;
+
+     if (result.cancelled) {
+       return { successCount, errorCount, cancelled: true };
+     }
+   }
+
+   return { successCount, errorCount, cancelled: false };
+ }
+
+ /**
+  * Process a batch of records (for a single day or entire import)
+  */
+ async function processDayBatch(
+   agent: AtpAgent,
+   records: PlayRecord[],
+   batchSize: number,
+   batchDelay: number,
+   recordType: string,
+   globalOffset: number,
+   totalRecords: number,
+   startTime: number
+ ): Promise<PublishResult> {
+   let successCount = 0;
+   let errorCount = 0;
+
+   for (let i = 0; i < records.length; i += batchSize) {
+     // Check killswitch before processing batch
+     if (isImportCancelled()) {
+       return handleCancellation(successCount, errorCount, totalRecords);
+     }
+
+     const batch = records.slice(i, i + batchSize);
+     const globalIndex = globalOffset + i;
+     const batchNum = Math.floor(globalIndex / batchSize) + 1;
+     const progress = (((globalOffset + i) / totalRecords) * 100).toFixed(1);
+
+     console.log(
+       `[${progress}%] Batch ${batchNum} (records ${globalOffset + i + 1}-${Math.min(globalOffset + i + batchSize, globalOffset + records.length)})`
+     );
+
+     // Process batch records
+     const batchStartTime = Date.now();
+     for (const record of batch) {
+       // Check killswitch during batch processing
+       if (isImportCancelled()) {
+         console.log(`  ⚠️  Stopping mid-batch...`);
+         break;
+       }
+
+       try {
+         await agent.com.atproto.repo.createRecord({
+           repo: agent.session?.did || '',
+           collection: recordType,
+           record,
+         });
+         successCount++;
+       } catch (error) {
+         errorCount++;
+         const err = error as Error;
+         console.error(`  ✗ Failed: ${record.trackName} - ${err.message}`);
+       }
+     }
+
+     const batchDuration = Date.now() - batchStartTime;
+     const elapsed = formatDuration(Date.now() - startTime);
+     const remaining = formatDuration(
+       ((totalRecords - (globalOffset + i + batchSize)) / batchSize) * batchDelay
+     );
+
+     console.log(
+       `  ✓ Complete in ${batchDuration}ms (${successCount} successful, ${errorCount} failed)`
+     );
+
+     // Only show time estimates if not cancelled
+     if (!isImportCancelled()) {
+       console.log(`  ⏱  Elapsed: ${elapsed} | Remaining: ~${remaining}\n`);
+     }
+
+     // Check again before waiting
+     if (isImportCancelled()) {
+       return handleCancellation(successCount, errorCount, totalRecords);
+     }
+
+     // Wait before next batch (except for last batch)
+     if (i + batchSize < records.length) {
+       await new Promise((resolve) => setTimeout(resolve, batchDelay));
+     }
+   }
+
+   return { successCount, errorCount, cancelled: false };
+ }
+
+ /**
+  * Handle dry run mode
+  */
+ function handleDryRun(
+   records: PlayRecord[],
+   batchSize: number,
+   batchDelay: number,
+   config: Config
+ ): PublishResult {
+   const totalRecords = records.length;
+
+   // Calculate rate limiting info
+   const rateLimitParams = calculateRateLimitedBatches(totalRecords, config);
+
+   if (rateLimitParams.needsRateLimiting) {
+     displayRateLimitWarning();
+     batchSize = rateLimitParams.batchSize;
+     batchDelay = rateLimitParams.batchDelay;
+
+     displayRateLimitInfo(
+       totalRecords,
+       batchSize,
+       batchDelay,
+       rateLimitParams.estimatedDays,
+       rateLimitParams.recordsPerDay
+     );
+
+     if (rateLimitParams.estimatedDays > 1) {
+       const dailySchedule = calculateDailySchedule(
+         totalRecords,
+         batchSize,
+         batchDelay,
+         rateLimitParams.recordsPerDay
+       );
+
+       console.log('📅 Multi-Day Import Schedule:\n');
+       dailySchedule.forEach((day) => {
+         console.log(`  Day ${day.day}:`);
+         console.log(`    Records ${day.recordsStart + 1}-${day.recordsEnd} (${day.recordsCount} total)`);
+         if (day.pauseAfter) {
+           console.log(`    → Pause 24h after completion`);
+         }
+       });
+       console.log('');
+     }
+   }
+
+   console.log(`\n=== DRY RUN MODE ===`);
+   console.log(`Would publish ${totalRecords} records in batches of ${batchSize}`);
+
+   if (rateLimitParams.estimatedDays > 1) {
+     console.log(
+       `Import would span ${rateLimitParams.estimatedDays} days with automatic pauses\n`
+     );
+   } else {
+     console.log(`Estimated time: ${formatDuration(Math.ceil(totalRecords / batchSize) * batchDelay)}\n`);
+   }
+
+   // Show first 5 records as preview
+   const previewCount = Math.min(5, totalRecords);
+   console.log(`Preview of first ${previewCount} records (in processing order):\n`);
+
+   for (let i = 0; i < previewCount; i++) {
+     const record = records[i];
+     console.log(`${i + 1}. ${record.artists[0]?.artistName} - ${record.trackName}`);
+     console.log(`   Album: ${record.releaseName || 'N/A'}`);
+     console.log(`   Played: ${record.playedTime}`);
+     console.log(`   URL: ${record.originUrl}`);
+
+     // Show MusicBrainz IDs if available
+     const mbids = [];
+     if (record.artists[0]?.artistMbId)
+       mbids.push(`Artist: ${record.artists[0].artistMbId}`);
+     if (record.recordingMbId) mbids.push(`Recording: ${record.recordingMbId}`);
+     if (record.releaseMbId) mbids.push(`Release: ${record.releaseMbId}`);
+
+     if (mbids.length > 0) {
+       console.log(`   MBIDs: ${mbids.join(', ')}`);
+     }
+     console.log('');
+   }
+
+   if (totalRecords > previewCount) {
+     console.log(`... and ${totalRecords - previewCount} more records\n`);
+   }
+
+   console.log('=== DRY RUN COMPLETE ===');
+   console.log('No records were actually published.');
+   console.log('Remove --dry-run flag to publish for real.\n');
+
+   return { successCount: totalRecords, errorCount: 0, cancelled: false };
+ }
+
+ /**
+  * Handle cancellation
+  */
+ function handleCancellation(
+   successCount: number,
+   errorCount: number,
+   totalRecords: number
+ ): PublishResult {
+   console.log(`\n🛑 Import cancelled by user`);
+   console.log(`   Processed: ${successCount}/${totalRecords} records`);
+   console.log(`   Remaining: ${totalRecords - successCount} records\n`);
+   return { successCount, errorCount, cancelled: true };
+ }
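`src/utils/rate-limiter.ts` itself is not part of this diff, but the call sites above pin down its contract. A minimal sketch of what `calculateRateLimitedBatches` and `calculateDailySchedule` must return; the names come from the imports, the math from the README's even-spread description, and everything else is an assumption, not the actual file:

```ts
import type { Config } from '../types.js';

interface RateLimitParams {
  needsRateLimiting: boolean;
  batchSize: number;
  batchDelay: number;    // ms
  estimatedDays: number;
  recordsPerDay: number;
}

interface DayPlan {
  day: number;
  recordsStart: number;  // inclusive index into the sorted records
  recordsEnd: number;    // exclusive, used with Array.slice above
  recordsCount: number;
  pauseAfter: boolean;
  pauseDuration: number; // ms; 24h between days
}

export function calculateRateLimitedBatches(total: number, cfg: Config): RateLimitParams {
  // 1000 * 0.9 = 900 records/day
  const recordsPerDay = Math.floor(cfg.RECORDS_PER_DAY_LIMIT * cfg.SAFETY_MARGIN);
  const estimatedDays = Math.max(1, Math.ceil(total / recordsPerDay));
  const batchSize = cfg.DEFAULT_BATCH_SIZE;
  // Spread one day's batches evenly across 24 hours, never below the floor
  const batchesPerDay = Math.ceil(Math.min(total, recordsPerDay) / batchSize);
  const batchDelay = Math.max(cfg.MIN_BATCH_DELAY, Math.floor(86_400_000 / batchesPerDay));
  return { needsRateLimiting: total > recordsPerDay, batchSize, batchDelay, estimatedDays, recordsPerDay };
}

export function calculateDailySchedule(
  total: number, batchSize: number, batchDelay: number, recordsPerDay: number
): DayPlan[] {
  const days: DayPlan[] = [];
  for (let start = 0, day = 1; start < total; start += recordsPerDay, day++) {
    const end = Math.min(start + recordsPerDay, total);
    days.push({
      day,
      recordsStart: start,
      recordsEnd: end,
      recordsCount: end - start,
      pauseAfter: end < total,          // no pause after the final day
      pauseDuration: 24 * 60 * 60 * 1000,
    });
  }
  return days;
}
```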
+71
src/types.ts
··· 1 + import { AtpAgent as Agent } from '@atproto/api'; 2 + 3 + /** 4 + * Type alias for the ATProto Agent, used for clarity in the project. 5 + */ 6 + export type AtpAgent = Agent; 7 + 8 + export interface LastFmCsvRecord { 9 + artist: string; 10 + track: string; 11 + album: string; 12 + uts: string; 13 + artist_mbid?: string; 14 + album_mbid?: string; 15 + track_mbid?: string; 16 + } 17 + 18 + export interface PlayRecordArtist { 19 + artistName: string; 20 + artistMbId?: string; 21 + } 22 + 23 + export interface PlayRecord { 24 + $type: string; 25 + trackName: string; 26 + artists: PlayRecordArtist[]; 27 + playedTime: string; 28 + submissionClientAgent: string; 29 + musicServiceBaseDomain: string; 30 + releaseName?: string; 31 + releaseMbId?: string; 32 + recordingMbId?: string; 33 + originUrl: string; 34 + } 35 + 36 + export interface CommandLineArgs { 37 + help?: boolean; 38 + file?: string; 39 + identifier?: string; 40 + password?: string; 41 + 'batch-size'?: string; 42 + 'batch-delay'?: string; 43 + yes?: boolean; 44 + 'dry-run'?: boolean; 45 + 'reverse-chronological'?: boolean; 46 + } 47 + 48 + export interface PublishResult { 49 + successCount: number; 50 + errorCount: number; 51 + cancelled: boolean; 52 + } 53 + 54 + export interface Config { 55 + MIN_RECORDS_FOR_SCALING: number; 56 + BASE_BATCH_SIZE: number; 57 + MAX_BATCH_SIZE: number; 58 + SCALING_FACTOR: number; 59 + DEFAULT_BATCH_DELAY: number; 60 + 61 + CLIENT_AGENT: string; 62 + 63 + DEFAULT_BATCH_SIZE: number; // from rate limiter 64 + MIN_BATCH_DELAY: number; // from rate limiter 65 + RECORDS_PER_DAY_LIMIT: number; 66 + SAFETY_MARGIN: number; 67 + 68 + SLINGSHOT_RESOLVER: string; 69 + 70 + RECORD_TYPE: string; 71 + }
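As a concrete reference for the `PlayRecord` interface above, a record targeting the fm.teal.alpha.feed.play lexicon might look like the following. All field values are invented for illustration; in particular, the exact `submissionClientAgent` string the importer emits is an assumption here.

```ts
import type { PlayRecord } from './types.js';

// Illustrative values only; not drawn from a real scrobble.
const play: PlayRecord = {
  $type: 'fm.teal.alpha.feed.play',
  trackName: 'Example Track',
  artists: [{ artistName: 'Example Artist' }],
  playedTime: '2024-01-01T12:00:00.000Z',
  submissionClientAgent: 'malachite', // assumed agent string
  musicServiceBaseDomain: 'last.fm',
  releaseName: 'Example Album',
  originUrl: 'https://www.last.fm/music/Example+Artist/_/Example+Track',
};
```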
-63
src/utils/helpers.js
··· 1 - /** 2 - * Utility functions for the Last.fm importer 3 - */ 4 - 5 - /** 6 - * Format duration in human-readable format 7 - */ 8 - export function formatDuration(milliseconds) { 9 - const seconds = Math.floor(milliseconds / 1000); 10 - const minutes = Math.floor(seconds / 60); 11 - const hours = Math.floor(minutes / 60); 12 - 13 - if (hours > 0) { 14 - const mins = minutes % 60; 15 - return `${hours}h ${mins}m`; 16 - } else if (minutes > 0) { 17 - const secs = seconds % 60; 18 - return `${minutes}m ${secs}s`; 19 - } else { 20 - return `${seconds}s`; 21 - } 22 - } 23 - 24 - /** 25 - * Calculate optimal batch size based on total records and rate limits 26 - * Uses a logarithmic scaling approach to balance throughput with API safety 27 - */ 28 - export function calculateOptimalBatchSize(totalRecords, batchDelay, config) { 29 - const { 30 - MIN_RECORDS_FOR_SCALING, 31 - BASE_BATCH_SIZE, 32 - MAX_BATCH_SIZE, 33 - SCALING_FACTOR, 34 - DEFAULT_BATCH_DELAY 35 - } = config; 36 - 37 - const delay = batchDelay || DEFAULT_BATCH_DELAY; 38 - 39 - // For very small datasets, use minimal batches 40 - if (totalRecords <= 50) { 41 - return 3; 42 - } 43 - 44 - // For small to medium datasets, use conservative batching 45 - if (totalRecords <= MIN_RECORDS_FOR_SCALING) { 46 - return BASE_BATCH_SIZE; 47 - } 48 - 49 - // Logarithmic scaling 50 - const logScale = Math.log2(totalRecords / MIN_RECORDS_FOR_SCALING); 51 - const calculatedSize = Math.floor(BASE_BATCH_SIZE + (logScale * SCALING_FACTOR)); 52 - 53 - // Apply maximum cap 54 - let optimalSize = Math.min(calculatedSize, MAX_BATCH_SIZE); 55 - 56 - // Adjust based on batch delay 57 - if (delay < 1500 && optimalSize > 15) { 58 - optimalSize = Math.floor(optimalSize * 0.75); 59 - } 60 - 61 - // Ensure batch size is at least 3 62 - return Math.max(3, optimalSize); 63 - }
+88
src/utils/helpers.ts
··· 1 + /** 2 + * Utility functions for the Last.fm importer 3 + */ 4 + import type { Config } from '../types.js'; 5 + 6 + /** 7 + * Format duration in human-readable format 8 + */ 9 + export function formatDuration(milliseconds: number): string { 10 + const seconds = Math.floor(milliseconds / 1000); 11 + const minutes = Math.floor(seconds / 60); 12 + const hours = Math.floor(minutes / 60); 13 + 14 + if (hours > 0) { 15 + const mins = minutes % 60; 16 + return `${hours}h ${mins}m`; 17 + } else if (minutes > 0) { 18 + const secs = seconds % 60; 19 + return `${minutes}m ${secs}s`; 20 + } else { 21 + return `${seconds}s`; 22 + } 23 + } 24 + 25 + /** 26 + * Calculate optimal batch size based on total records and rate limits 27 + * Uses a logarithmic scaling approach to balance throughput with API safety 28 + */ 29 + export function calculateOptimalBatchSize(totalRecords: number, batchDelay: number, config: Config): number { 30 + const { 31 + MIN_RECORDS_FOR_SCALING, 32 + BASE_BATCH_SIZE, 33 + MAX_BATCH_SIZE, 34 + SCALING_FACTOR, 35 + DEFAULT_BATCH_DELAY 36 + } = config; 37 + 38 + const delay = batchDelay || DEFAULT_BATCH_DELAY; 39 + 40 + // For very small datasets, use minimal batches 41 + if (totalRecords <= 50) { 42 + return 3; 43 + } 44 + 45 + // For small to medium datasets, use conservative batching 46 + if (totalRecords <= MIN_RECORDS_FOR_SCALING) { 47 + return BASE_BATCH_SIZE; 48 + } 49 + 50 + // Logarithmic scaling 51 + const logScale = Math.log2(totalRecords / MIN_RECORDS_FOR_SCALING); 52 + const calculatedSize = Math.floor(BASE_BATCH_SIZE + (logScale * SCALING_FACTOR)); 53 + 54 + // Apply maximum cap 55 + let optimalSize = Math.min(calculatedSize, MAX_BATCH_SIZE); 56 + 57 + // Adjust based on batch delay 58 + if (delay < 1500 && optimalSize > 15) { 59 + optimalSize = Math.floor(optimalSize * 0.75); 60 + } 61 + 62 + // Ensure batch size is at least 3 63 + return Math.max(3, optimalSize); 64 + } 65 + 66 + /** 67 + * Logs rate limiting and batching information to the console. 68 + */ 69 + export function showRateLimitInfo( 70 + totalRecords: number, 71 + batchSize: number, 72 + batchDelay: number, 73 + estimatedDays: number, 74 + dailyLimit: number 75 + ): void { 76 + console.log('\n📊 Rate Limiting Information:'); 77 + console.log(` Total records: ${totalRecords.toLocaleString()}`); 78 + console.log(` Daily limit: ${dailyLimit.toLocaleString()} records/day`); 79 + console.log(` Estimated duration: ${estimatedDays} day${estimatedDays > 1 ? 's' : ''}`); 80 + console.log(` Batch size: ${batchSize} records`); 81 + console.log(` Batch delay: ${(batchDelay / 1000).toFixed(1)}s`); 82 + 83 + if (estimatedDays > 1) { 84 + console.log('\n The import will automatically pause between days.'); 85 + console.log(' You can safely close and restart the importer - it will resume from where it left off.'); 86 + } 87 + console.log(''); 88 + }
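To make the logarithmic scaling concrete, here is a worked example. The config constants below are placeholders chosen for round numbers; the project's real defaults live in its config module, which is not part of this diff.

```ts
import type { Config } from '../types.js';
import { calculateOptimalBatchSize } from './helpers.js';

// Placeholder constants; not the project's actual defaults.
const cfg = {
  MIN_RECORDS_FOR_SCALING: 1000,
  BASE_BATCH_SIZE: 10,
  MAX_BATCH_SIZE: 50,
  SCALING_FACTOR: 5,
  DEFAULT_BATCH_DELAY: 2000,
} as Config;

calculateOptimalBatchSize(40, 2000, cfg);   // => 3  (tiny-dataset floor)
calculateOptimalBatchSize(800, 2000, cfg);  // => 10 (at or below the scaling threshold)
// log2(8000 / 1000) = 3, so 10 + 3 * 5 = 25, under the cap of 50:
calculateOptimalBatchSize(8000, 2000, cfg); // => 25
```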
+12 -12
src/utils/input.js → src/utils/input.ts
··· 3 3 /** 4 4 * Read user input from command line with proper password masking 5 5 */ 6 - export function prompt(question, hideInput = false) { 6 + export function prompt(question: string, hideInput = false): Promise<string> { 7 7 return new Promise((resolve) => { 8 8 if (hideInput) { 9 9 // For password input, use raw mode 10 10 const stdin = process.stdin; 11 11 const wasRaw = stdin.isRaw; 12 - 12 + 13 13 // Set raw mode to capture individual keystrokes 14 14 if (stdin.isTTY) { 15 15 stdin.setRawMode(true); 16 16 } 17 - 17 + 18 18 stdin.resume(); 19 19 stdin.setEncoding('utf8'); 20 - 20 + 21 21 process.stdout.write(question); 22 - 22 + 23 23 let password = ''; 24 - const onData = (char) => { 25 - char = char.toString(); 26 - 27 - switch (char) { 24 + const onData = (char: Buffer | string) => { 25 + const charStr = char.toString(); 26 + 27 + switch (charStr) { 28 28 case '\n': 29 29 case '\r': 30 30 case '\u0004': // Ctrl-D ··· 49 49 } 50 50 break; 51 51 default: 52 - password += char; 52 + password += charStr; 53 53 process.stdout.write('*'); 54 54 break; 55 55 } 56 56 }; 57 - 57 + 58 58 stdin.on('data', onData); 59 59 } else { 60 60 const rl = readline.createInterface({ 61 61 input: process.stdin, 62 62 output: process.stdout, 63 63 }); 64 - 64 + 65 65 rl.question(question, (answer) => { 66 66 rl.close(); 67 67 resolve(answer);
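A minimal usage sketch of `prompt` as typed above; the surrounding credential-collection flow is an assumption, not code from this diff.

```ts
import { prompt } from './utils/input.js';

// Hypothetical wiring: collect credentials interactively. Passing
// hideInput = true masks each keystroke with '*' so the app password
// never echoes to the terminal.
async function collectCredentials(): Promise<{ handle: string; appPassword: string }> {
  const handle = await prompt('ATProto handle or DID: ');
  const appPassword = await prompt('ATProto app password: ', true);
  return { handle, appPassword };
}
```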
-35
src/utils/killswitch.js
··· 1 - // Global state for killswitch 2 - let importCancelled = false; 3 - let gracefulShutdown = false; 4 - 5 - /** 6 - * Setup killswitch handler for graceful shutdown 7 - */ 8 - export function setupKillswitch() { 9 - process.on('SIGINT', () => { 10 - if (gracefulShutdown) { 11 - console.log('\n\n⚠️ Force quit detected. Exiting immediately...'); 12 - process.exit(1); 13 - } 14 - 15 - gracefulShutdown = true; 16 - importCancelled = true; 17 - console.log('\n\n🛑 Killswitch activated! Stopping after current batch...'); 18 - console.log(' Press Ctrl+C again to force quit immediately.\n'); 19 - }); 20 - } 21 - 22 - /** 23 - * Check if import has been cancelled 24 - */ 25 - export function isImportCancelled() { 26 - return importCancelled; 27 - } 28 - 29 - /** 30 - * Reset killswitch state (useful for testing) 31 - */ 32 - export function resetKillswitch() { 33 - importCancelled = false; 34 - gracefulShutdown = false; 35 - }
+22
src/utils/killswitch.ts
··· 1 + let cancelled = false; 2 + 3 + // Flip the killswitch when the user hits CTRL-C 4 + process.on('SIGINT', () => { 5 + console.log('\nCaught CTRL-C — stopping import…'); 6 + cancelled = true; 7 + }); 8 + 9 + /** 10 + * Manually cancel the import if needed. 11 + */ 12 + export function cancelImport() { 13 + cancelled = true; 14 + } 15 + 16 + /** 17 + * Check whether the import should stop. 18 + * Call this inside loops, batch processors, etc. 19 + */ 20 + export function isImportCancelled(): boolean { 21 + return cancelled; 22 + }
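The killswitch is polled cooperatively, as the doc comment suggests. A sketch of the intended pattern, where the batch loop itself is a stand-in:

```ts
import { isImportCancelled } from './utils/killswitch.js';

// Stand-ins for the real batch loop; only the polling pattern matters here.
declare const batches: unknown[][];
declare function publishBatch(batch: unknown[]): Promise<void>;

async function run(): Promise<void> {
  for (const batch of batches) {
    // Poll between units of work so Ctrl+C takes effect at a safe point
    // rather than mid-request.
    if (isImportCancelled()) break;
    await publishBatch(batch);
  }
}
```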
+166
src/utils/rate-limiter.ts
··· 1 + import type { Config } from '../types.js'; 2 + 3 + /** 4 + * Calculate rate-limited batch parameters 5 + * Ensures we don't exceed daily limits while maintaining efficiency 6 + */ 7 + export function calculateRateLimitedBatches( 8 + totalRecords: number, 9 + config: Config 10 + ): { 11 + batchSize: number; 12 + batchDelay: number; 13 + estimatedDays: number; 14 + recordsPerDay: number; 15 + needsRateLimiting: boolean; 16 + } { 17 + const dailyLimit = Math.floor(config.RECORDS_PER_DAY_LIMIT * config.SAFETY_MARGIN); 18 + 19 + // Check if we need rate limiting 20 + const needsRateLimiting = totalRecords > dailyLimit; 21 + 22 + if (!needsRateLimiting) { 23 + // Can import everything in one go 24 + return { 25 + batchSize: config.DEFAULT_BATCH_SIZE, 26 + batchDelay: config.DEFAULT_BATCH_DELAY, 27 + estimatedDays: 1, 28 + recordsPerDay: totalRecords, 29 + needsRateLimiting: false, 30 + }; 31 + } 32 + 33 + // Calculate how many days needed 34 + const estimatedDays = Math.ceil(totalRecords / dailyLimit); 35 + const recordsPerDay = Math.floor(totalRecords / estimatedDays); 36 + 37 + // Calculate batch parameters 38 + // We want to spread records evenly throughout the day 39 + const minutesPerDay = 24 * 60; 40 + const batchesPerDay = Math.ceil(recordsPerDay / config.DEFAULT_BATCH_SIZE); 41 + const delayBetweenBatches = Math.floor((minutesPerDay * 60 * 1000) / batchesPerDay); 42 + 43 + // Ensure batch delay is at least minimum 44 + const batchDelay = Math.max(delayBetweenBatches, config.MIN_BATCH_DELAY); 45 + 46 + // Adjust batch size if needed to hit the target 47 + const adjustedBatchSize = Math.min( 48 + Math.ceil(recordsPerDay / Math.floor((minutesPerDay * 60 * 1000) / batchDelay)), 49 + config.MAX_BATCH_SIZE 50 + ); 51 + 52 + return { 53 + batchSize: adjustedBatchSize, 54 + batchDelay, 55 + estimatedDays, 56 + recordsPerDay, 57 + needsRateLimiting: true, 58 + }; 59 + } 60 + 61 + /** 62 + * Calculate daily batches and pause times 63 + */ 64 + export function calculateDailySchedule( 65 + totalRecords: number, 66 + batchSize: number, 67 + batchDelay: number, 68 + recordsPerDay: number 69 + ) { 70 + const schedule = []; 71 + 72 + // How many batches fit into a 24h window using the actual delay? 73 + const batchesPerDay = Math.floor((24 * 60 * 60 * 1000) / batchDelay); 74 + 75 + // Max records we could process in one day given the spacing 76 + const maxRecordsPerDay = batchesPerDay * batchSize; 77 + 78 + // Respect the external rate limit (recordsPerDay) 79 + const dailyCap = Math.min(maxRecordsPerDay, recordsPerDay); 80 + 81 + let processed = 0; 82 + let day = 1; 83 + 84 + while (processed < totalRecords) { 85 + const recordsStart = processed; 86 + const dailyCount = Math.min(dailyCap, totalRecords - processed); 87 + const recordsEnd = recordsStart + dailyCount; 88 + const isLastDay = recordsEnd >= totalRecords; 89 + 90 + schedule.push({ 91 + day, 92 + recordsStart, 93 + recordsEnd, 94 + recordsCount: dailyCount, 95 + pauseAfter: !isLastDay, 96 + pauseDuration: isLastDay ? 
0 : 24 * 60 * 60 * 1000 97 + }); 98 + 99 + processed = recordsEnd; 100 + day++; 101 + } 102 + 103 + return schedule; 104 + } 105 + 106 + 107 + /** 108 + * Format time duration in human-readable format 109 + */ 110 + export function formatTimeRemaining(ms: number): string { 111 + const days = Math.floor(ms / (24 * 60 * 60 * 1000)); 112 + const hours = Math.floor((ms % (24 * 60 * 60 * 1000)) / (60 * 60 * 1000)); 113 + const minutes = Math.floor((ms % (60 * 60 * 1000)) / (60 * 1000)); 114 + 115 + if (days > 0) { 116 + return `${days}d ${hours}h ${minutes}m`; 117 + } else if (hours > 0) { 118 + return `${hours}h ${minutes}m`; 119 + } else if (minutes > 0) { 120 + return `${minutes}m`; 121 + } else { 122 + return '< 1m'; 123 + } 124 + } 125 + 126 + /** 127 + * Display rate limit warning 128 + */ 129 + export function displayRateLimitWarning(): void { 130 + console.log('\n⚠️ ═══════════════════════════════════════════════════════════════════════════════'); 131 + console.log('⚠️ IMPORTANT: Bluesky AppView Rate Limits'); 132 + console.log('⚠️ ═══════════════════════════════════════════════════════════════════════════════'); 133 + console.log('⚠️'); 134 + console.log('⚠️ Exceeding 10K records per day can rate limit your ENTIRE PDS on Bluesky\'s'); 135 + console.log('⚠️ AppView. This affects ALL users on your PDS, not just your account!'); 136 + console.log('⚠️'); 137 + console.log('⚠️ This importer automatically limits imports to 1K records per day by default'); 138 + console.log('⚠️ with automatic batching and pauses to stay within safe limits.'); 139 + console.log('⚠️'); 140 + console.log('⚠️ See: https://docs.bsky.app/blog/rate-limits-pds-v3'); 141 + console.log('⚠️ ═══════════════════════════════════════════════════════════════════════════════\n'); 142 + } 143 + 144 + /** 145 + * Display rate limiting info 146 + */ 147 + export function displayRateLimitInfo( 148 + totalRecords: number, 149 + batchSize: number, 150 + batchDelay: number, 151 + estimatedDays: number, 152 + recordsPerDay: number 153 + ): void { 154 + console.log('\n📊 Rate Limiting Information:'); 155 + console.log(` Total records: ${totalRecords.toLocaleString()}`); 156 + console.log(` Daily limit: ${recordsPerDay.toLocaleString()} records/day`); 157 + console.log(` Estimated duration: ${estimatedDays} day${estimatedDays > 1 ? 's' : ''}`); 158 + console.log(` Batch size: ${batchSize} records`); 159 + console.log(` Batch delay: ${(batchDelay / 1000).toFixed(1)}s`); 160 + 161 + if (estimatedDays > 1) { 162 + console.log('\n The import will automatically pause between days.'); 163 + console.log(' You can safely close and restart the importer - it will resume from where it left off.'); 164 + } 165 + console.log(''); 166 + }
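A worked example of the two scheduling functions above. The config constants are placeholders chosen so the effective daily cap is exactly 1,000 records; the project's real defaults are not part of this diff.

```ts
import type { Config } from '../types.js';
import { calculateRateLimitedBatches, calculateDailySchedule } from './rate-limiter.js';

// Placeholder constants; picked for round numbers, not taken from the project.
const cfg = {
  RECORDS_PER_DAY_LIMIT: 1000,
  SAFETY_MARGIN: 1.0,
  DEFAULT_BATCH_SIZE: 10,
  MAX_BATCH_SIZE: 50,
  DEFAULT_BATCH_DELAY: 2000,
  MIN_BATCH_DELAY: 1000,
} as Config;

// 2,000 records against a 1,000/day cap:
//   estimatedDays = ceil(2000 / 1000) = 2
//   recordsPerDay = floor(2000 / 2)   = 1000
//   batchesPerDay = ceil(1000 / 10)   = 100
//   batchDelay    = floor(86,400,000 ms / 100) = 864,000 ms (14.4 min)
const params = calculateRateLimitedBatches(2000, cfg);
// => { batchSize: 10, batchDelay: 864000, estimatedDays: 2,
//      recordsPerDay: 1000, needsRateLimiting: true }

const schedule = calculateDailySchedule(
  2000, params.batchSize, params.batchDelay, params.recordsPerDay
);
// => Day 1: records 0-1000, 24h pause after; Day 2: records 1000-2000, done.
```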
+27
tsconfig.json
··· 1 + { 2 + "compilerOptions": { 3 + "target": "ES2022", 4 + "module": "node16", 5 + "moduleResolution": "node16", 6 + "lib": ["ES2022"], 7 + "outDir": "./dist", 8 + "rootDir": "./src", 9 + "strict": true, 10 + "esModuleInterop": true, 11 + "skipLibCheck": true, 12 + "forceConsistentCasingInFileNames": true, 13 + "resolveJsonModule": true, 14 + "declaration": true, 15 + "declarationMap": true, 16 + "sourceMap": true, 17 + "noImplicitAny": true, 18 + "strictNullChecks": true, 19 + "strictFunctionTypes": true, 20 + "noUnusedLocals": true, 21 + "noUnusedParameters": true, 22 + "noImplicitReturns": true, 23 + "noFallthroughCasesInSwitch": true 24 + }, 25 + "include": ["src/**/*"], 26 + "exclude": ["node_modules", "dist"] 27 + }
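One practical consequence of `"module": "node16"` and `"moduleResolution": "node16"` worth noting: relative imports in the TypeScript sources must name the emitted `.js` path, as the `import type { Config } from '../types.js'` lines in the diffs above already do, because Node's ESM resolver does not infer file extensions.

```ts
// Under node16 resolution, TS sources import the emitted .js path;
// the compiler maps it back to the .ts source at type-check time.
import type { Config } from '../types.js'; // resolves to ../types.ts
// import type { Config } from '../types'; // would fail at runtime: ESM needs extensions
```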