comparing main and old-main on anil.recoil.org/thicket

-5

.gitignore

···
 .streamlit/secrets.toml

 thicket.yaml
-
-# Bot configuration files with secrets
-bot-config/zuliprc
-bot-config/*.key
-bot-config/*.secret

-26

README.md

···
 - **Duplicate Management**: Manual curation of duplicate entries across feeds
 - **Modern CLI**: Built with Typer and Rich for beautiful terminal output
 - **Comprehensive Parsing**: Supports RSS 0.9x, RSS 1.0, RSS 2.0, and Atom feeds
-- **Zulip Bot Integration**: Automatically post new feed articles to Zulip chat
 - **Cron-Friendly**: Designed for scheduled execution

 ## Installation
···
 # Remove duplicate mapping
 thicket duplicates remove "https://example.com/dup"
 ```
-
-### Zulip Bot Integration
-```bash
-# Test bot functionality
-thicket bot test
-
-# Show bot status
-thicket bot status
-
-# Run bot (requires configuration)
-thicket bot run --config bot-config/zuliprc
-```
-
-**Bot Setup:**
-1. Create a Zulip bot in your organization
-2. Copy `bot-config/zuliprc.template` to `bot-config/zuliprc`
-3. Configure with your bot's credentials
-4. Run the bot and configure via Zulip chat:
-   ```
-   @thicket config path /path/to/thicket.yaml
-   @thicket config stream general
-   @thicket config topic "Feed Updates"
-   ```
-
-See [docs/ZULIP_BOT.md](docs/ZULIP_BOT.md) for detailed setup instructions.

 ## Configuration


-400

SPEC.md

··· 1 - # Thicket Git Store Specification 2 - 3 - This document comprehensively defines the JSON format and structure of the Thicket Git repository, enabling third-party clients to read and write to the store while leveraging Thicket's existing Python classes for data validation and business logic. 4 - 5 - ## Overview 6 - 7 - The Thicket Git store is a structured repository that persists Atom/RSS feed entries in JSON format. The store is designed to be both human-readable and machine-parseable, with a clear directory structure and standardized JSON schemas. 8 - 9 - ## Repository Structure 10 - 11 - ``` 12 - <git_store>/ 13 - ├── index.json # Main index of all users and metadata 14 - ├── duplicates.json # Maps duplicate entry IDs to canonical IDs 15 - ├── index.opml # OPML export of all feeds (generated) 16 - ├── <username1>/ # User directory (sanitized username) 17 - │ ├── <entry_id1>.json # Individual feed entry 18 - │ ├── <entry_id2>.json # Individual feed entry 19 - │ └── ... 20 - ├── <username2>/ 21 - │ ├── <entry_id3>.json 22 - │ └── ... 23 - └── ... 24 - ``` 25 - 26 - ## JSON Schemas 27 - 28 - ### 1. Index File (`index.json`) 29 - 30 - The main index tracks all users, their metadata, and repository statistics. 31 - 32 - **Schema:** 33 - ```json 34 - { 35 - "users": { 36 - "<username>": { 37 - "username": "string", 38 - "display_name": "string | null", 39 - "email": "string | null", 40 - "homepage": "string (URL) | null", 41 - "icon": "string (URL) | null", 42 - "feeds": ["string (URL)", ...], 43 - "zulip_associations": [ 44 - { 45 - "server": "string", 46 - "user_id": "string" 47 - }, 48 - ... 
49 - ], 50 - "directory": "string", 51 - "created": "string (ISO 8601 datetime)", 52 - "last_updated": "string (ISO 8601 datetime)", 53 - "entry_count": "integer" 54 - } 55 - }, 56 - "created": "string (ISO 8601 datetime)", 57 - "last_updated": "string (ISO 8601 datetime)", 58 - "total_entries": "integer" 59 - } 60 - ``` 61 - 62 - **Example:** 63 - ```json 64 - { 65 - "users": { 66 - "johndoe": { 67 - "username": "johndoe", 68 - "display_name": "John Doe", 69 - "email": "john@example.com", 70 - "homepage": "https://johndoe.blog", 71 - "icon": "https://johndoe.blog/avatar.png", 72 - "feeds": [ 73 - "https://johndoe.blog/feed.xml", 74 - "https://johndoe.blog/categories/tech/feed.xml" 75 - ], 76 - "zulip_associations": [ 77 - { 78 - "server": "myorg.zulipchat.com", 79 - "user_id": "john.doe" 80 - }, 81 - { 82 - "server": "community.zulipchat.com", 83 - "user_id": "johndoe@example.com" 84 - } 85 - ], 86 - "directory": "johndoe", 87 - "created": "2024-01-15T10:30:00", 88 - "last_updated": "2024-01-20T14:22:00", 89 - "entry_count": 42 90 - } 91 - }, 92 - "created": "2024-01-15T10:30:00", 93 - "last_updated": "2024-01-20T14:22:00", 94 - "total_entries": 42 95 - } 96 - ``` 97 - 98 - ### 2. Duplicates File (`duplicates.json`) 99 - 100 - Maps duplicate entry IDs to their canonical representations to handle feed entries that appear with different IDs but identical content. 101 - 102 - **Schema:** 103 - ```json 104 - { 105 - "duplicates": { 106 - "<duplicate_id>": "<canonical_id>" 107 - }, 108 - "comment": "Entry IDs that map to the same canonical content" 109 - } 110 - ``` 111 - 112 - **Example:** 113 - ```json 114 - { 115 - "duplicates": { 116 - "https://example.com/posts/123?utm_source=rss": "https://example.com/posts/123", 117 - "https://example.com/feed/item-duplicate": "https://example.com/feed/item-original" 118 - }, 119 - "comment": "Entry IDs that map to the same canonical content" 120 - } 121 - ``` 122 - 123 - ### 3. 
Feed Entry Files (`<username>/<entry_id>.json`) 124 - 125 - Individual feed entries are stored as normalized Atom entries, regardless of their original format (RSS/Atom). 126 - 127 - **Schema:** 128 - ```json 129 - { 130 - "id": "string", 131 - "title": "string", 132 - "link": "string (URL)", 133 - "updated": "string (ISO 8601 datetime)", 134 - "published": "string (ISO 8601 datetime) | null", 135 - "summary": "string | null", 136 - "content": "string | null", 137 - "content_type": "html | text | xhtml", 138 - "author": { 139 - "name": "string | null", 140 - "email": "string | null", 141 - "uri": "string (URL) | null" 142 - } | null, 143 - "categories": ["string", ...], 144 - "rights": "string | null", 145 - "source": "string (URL) | null" 146 - } 147 - ``` 148 - 149 - **Example:** 150 - ```json 151 - { 152 - "id": "https://johndoe.blog/posts/my-first-post", 153 - "title": "My First Blog Post", 154 - "link": "https://johndoe.blog/posts/my-first-post", 155 - "updated": "2024-01-20T14:22:00", 156 - "published": "2024-01-20T09:00:00", 157 - "summary": "This is a summary of my first blog post.", 158 - "content": "<p>This is the full content of my <strong>first</strong> blog post with HTML formatting.</p>", 159 - "content_type": "html", 160 - "author": { 161 - "name": "John Doe", 162 - "email": "john@example.com", 163 - "uri": "https://johndoe.blog" 164 - }, 165 - "categories": ["blogging", "personal"], 166 - "rights": "Copyright 2024 John Doe", 167 - "source": "https://johndoe.blog/feed.xml" 168 - } 169 - ``` 170 - 171 - ## Python Class Integration 172 - 173 - To leverage Thicket's existing validation and business logic, third-party clients should use the following Python classes from the `thicket.models` package: 174 - 175 - ### Core Data Models 176 - 177 - ```python 178 - from thicket.models import ( 179 - AtomEntry, # Feed entry representation 180 - GitStoreIndex, # Repository index 181 - UserMetadata, # User information 182 - DuplicateMap, # Duplicate ID mappings 
183 - FeedMetadata, # Feed-level metadata 184 - ThicketConfig, # Configuration 185 - UserConfig, # User configuration 186 - ZulipAssociation # Zulip server/user_id pairs 187 - ) 188 - ``` 189 - 190 - ### Repository Operations 191 - 192 - ```python 193 - from thicket.core.git_store import GitStore 194 - from thicket.core.feed_parser import FeedParser 195 - 196 - # Initialize git store 197 - store = GitStore(Path("/path/to/git/store")) 198 - 199 - # Read data 200 - index = store._load_index() # Load index.json 201 - user = store.get_user("username") # Get user metadata 202 - entries = store.list_entries("username", limit=10) 203 - entry = store.get_entry("username", "entry_id") 204 - duplicates = store.get_duplicates() # Load duplicates.json 205 - 206 - # Write data 207 - store.add_user("username", display_name="Display Name") 208 - store.store_entry("username", atom_entry) 209 - store.add_duplicate("duplicate_id", "canonical_id") 210 - store.commit_changes("Commit message") 211 - 212 - # Zulip associations 213 - store.add_zulip_association("username", "myorg.zulipchat.com", "user@example.com") 214 - store.remove_zulip_association("username", "myorg.zulipchat.com", "user@example.com") 215 - associations = store.get_zulip_associations("username") 216 - 217 - # Search and statistics 218 - results = store.search_entries("query", username="optional") 219 - stats = store.get_stats() 220 - ``` 221 - 222 - ### Feed Processing 223 - 224 - ```python 225 - from thicket.core.feed_parser import FeedParser 226 - from pydantic import HttpUrl 227 - 228 - parser = FeedParser() 229 - 230 - # Fetch and parse feeds 231 - content = await parser.fetch_feed(HttpUrl("https://example.com/feed.xml")) 232 - feed_metadata, entries = parser.parse_feed(content, source_url) 233 - 234 - # Entry ID sanitization for filenames 235 - safe_filename = parser.sanitize_entry_id(entry.id) 236 - ``` 237 - 238 - ## File Naming and ID Sanitization 239 - 240 - Entry IDs from feeds are sanitized to create safe 
filenames using `FeedParser.sanitize_entry_id()`: 241 - 242 - - URLs are parsed and the path component is used as the base 243 - - Characters are limited to alphanumeric, hyphens, underscores, and periods 244 - - Other characters are replaced with underscores 245 - - Maximum length is 200 characters 246 - - Empty results default to "entry" 247 - 248 - **Examples:** 249 - - `https://example.com/posts/my-post` → `posts_my-post.json` 250 - - `https://blog.com/2024/01/title?utm=source` → `2024_01_title.json` 251 - 252 - ## Data Validation 253 - 254 - All JSON data should be validated using Pydantic models before writing to the store: 255 - 256 - ```python 257 - from thicket.models import AtomEntry 258 - from pydantic import ValidationError 259 - 260 - try: 261 - entry = AtomEntry(**json_data) 262 - # Data is valid, safe to store 263 - store.store_entry(username, entry) 264 - except ValidationError as e: 265 - # Handle validation errors 266 - print(f"Invalid entry data: {e}") 267 - ``` 268 - 269 - ## Timestamps 270 - 271 - All timestamps use ISO 8601 format in UTC: 272 - - `created`: When the record was first created 273 - - `last_updated`: When the record was last modified 274 - - `updated`: When the feed entry was last updated (from feed) 275 - - `published`: When the feed entry was originally published (from feed) 276 - 277 - ## Content Sanitization 278 - 279 - HTML content in entries is sanitized using the `FeedParser._sanitize_html()` method to prevent XSS attacks. Allowed tags and attributes are strictly controlled. 
280 - 281 - **Allowed HTML tags:** 282 - `a`, `abbr`, `acronym`, `b`, `blockquote`, `br`, `code`, `em`, `i`, `li`, `ol`, `p`, `pre`, `strong`, `ul`, `h1`-`h6`, `img`, `div`, `span` 283 - 284 - **Allowed attributes:** 285 - - `a`: `href`, `title` 286 - - `img`: `src`, `alt`, `title`, `width`, `height` 287 - - `blockquote`: `cite` 288 - - `abbr`/`acronym`: `title` 289 - 290 - ## Error Handling and Robustness 291 - 292 - The store is designed to be fault-tolerant: 293 - 294 - - Invalid entries are skipped during processing with error logging 295 - - Malformed JSON files are ignored in listings 296 - - Missing files return `None` rather than raising exceptions 297 - - Git operations are atomic where possible 298 - 299 - ## Example Usage 300 - 301 - ### Reading the Store 302 - 303 - ```python 304 - from pathlib import Path 305 - from thicket.core.git_store import GitStore 306 - 307 - # Initialize 308 - store = GitStore(Path("/path/to/thicket/store")) 309 - 310 - # Get all users 311 - index = store._load_index() 312 - for username, user_metadata in index.users.items(): 313 - print(f"User: {user_metadata.display_name} ({username})") 314 - print(f" Feeds: {user_metadata.feeds}") 315 - print(f" Entries: {user_metadata.entry_count}") 316 - 317 - # Get recent entries for a user 318 - entries = store.list_entries("johndoe", limit=5) 319 - for entry in entries: 320 - print(f" - {entry.title} ({entry.updated})") 321 - ``` 322 - 323 - ### Adding Data 324 - 325 - ```python 326 - from thicket.models import AtomEntry 327 - from datetime import datetime 328 - from pydantic import HttpUrl 329 - 330 - # Create entry 331 - entry = AtomEntry( 332 - id="https://example.com/new-post", 333 - title="New Post", 334 - link=HttpUrl("https://example.com/new-post"), 335 - updated=datetime.now(), 336 - content="<p>Post content</p>", 337 - content_type="html" 338 - ) 339 - 340 - # Store entry 341 - store.store_entry("johndoe", entry) 342 - store.commit_changes("Add new blog post") 343 - ``` 344 - 
345 - ## Zulip Integration 346 - 347 - The Thicket Git store supports Zulip bot integration for automatic feed posting with user mentions. 348 - 349 - ### Zulip Associations 350 - 351 - Users can be associated with their Zulip identities to enable @mentions: 352 - 353 - ```python 354 - # UserMetadata includes zulip_associations field 355 - user.zulip_associations = [ 356 - ZulipAssociation(server="myorg.zulipchat.com", user_id="alice"), 357 - ZulipAssociation(server="other.zulipchat.com", user_id="alice@example.com") 358 - ] 359 - 360 - # Methods for managing associations 361 - user.add_zulip_association("myorg.zulipchat.com", "alice") 362 - user.get_zulip_mention("myorg.zulipchat.com") # Returns "alice" 363 - user.remove_zulip_association("myorg.zulipchat.com", "alice") 364 - ``` 365 - 366 - ### CLI Management 367 - 368 - ```bash 369 - # Add association 370 - thicket zulip-add alice myorg.zulipchat.com alice@example.com 371 - 372 - # Remove association 373 - thicket zulip-remove alice myorg.zulipchat.com alice@example.com 374 - 375 - # List associations 376 - thicket zulip-list # All users 377 - thicket zulip-list alice # Specific user 378 - 379 - # Bulk import from CSV 380 - thicket zulip-import associations.csv 381 - ``` 382 - 383 - ### Bot Behavior 384 - 385 - When the Thicket Zulip bot posts articles: 386 - 387 - 1. It checks for Zulip associations matching the current server 388 - 2. If found, adds @mention to the post: `@**alice** posted:` 389 - 3. The mentioned user receives a notification in Zulip 390 - 391 - This enables automatic notifications when someone's blog post is shared. 392 - 393 - ## Versioning and Compatibility 394 - 395 - This specification describes version 1.1 of the Thicket Git store format. 
Changes from 1.0: 396 - - Added `zulip_associations` field to UserMetadata (backwards compatible - defaults to empty list) 397 - 398 - Future versions will maintain backward compatibility where possible, with migration tools provided for breaking changes. 399 - 400 - To check the store format version, examine the repository structure and JSON schemas. Stores created by Thicket 0.1.0+ follow this specification.
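The ID sanitization rules listed in the removed SPEC.md can be sketched as follows. This is a minimal illustration written only from that bullet list; it is not Thicket's actual `FeedParser.sanitize_entry_id()`, whose source this diff does not show. The function body, the `max_length` parameter, and the ASCII-only reading of "alphanumeric" are assumptions.

```python
import re
from urllib.parse import urlparse


def sanitize_entry_id(entry_id: str, max_length: int = 200) -> str:
    """Hypothetical re-implementation of the sanitization rules in SPEC.md."""
    parsed = urlparse(entry_id)
    # For URL-shaped IDs, use only the path component (query and fragment are dropped).
    if parsed.scheme and parsed.netloc:
        base = parsed.path.strip("/")
    else:
        base = entry_id
    # Keep ASCII alphanumerics, hyphens, underscores, and periods; replace the rest.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", base)[:max_length]
    # Empty results fall back to "entry".
    return safe or "entry"


# The two examples given in SPEC.md:
print(sanitize_entry_id("https://example.com/posts/my-post") + ".json")
print(sanitize_entry_id("https://blog.com/2024/01/title?utm=source") + ".json")
```

Under these assumptions the two calls reproduce the filenames from the SPEC.md examples, `posts_my-post.json` and `2024_01_title.json`. Third-party clients that wrote to the store by following SPEC.md should still call the real method rather than a copy like this one, since two different sanitizers could map the same entry to different files.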

-97

bot-config/README.md

··· 1 - # Thicket Bot Configuration 2 - 3 - This directory contains configuration files for the Thicket Zulip bot. 4 - 5 - ## Setup Instructions 6 - 7 - ### 1. Zulip Bot Configuration 8 - 9 - 1. Copy `zuliprc.template` to `zuliprc`: 10 - ```bash 11 - cp bot-config/zuliprc.template bot-config/zuliprc 12 - ``` 13 - 14 - 2. Create a bot in your Zulip organization: 15 - - Go to Settings > Your bots > Add a new bot 16 - - Choose "Generic bot" type 17 - - Give it a name like "Thicket" and username like "thicket" 18 - - Copy the bot's email and API key 19 - 20 - 3. Edit `bot-config/zuliprc` with your bot's credentials: 21 - ```ini 22 - [api] 23 - email=thicket-bot@your-org.zulipchat.com 24 - key=your-actual-api-key-here 25 - site=https://your-org.zulipchat.com 26 - ``` 27 - 28 - ### 2. Bot Behavior Configuration (Optional) 29 - 30 - 1. Copy `botrc.template` to `botrc` to customize bot behavior: 31 - ```bash 32 - cp bot-config/botrc.template bot-config/botrc 33 - ``` 34 - 35 - 2. Edit `bot-config/botrc` to customize: 36 - - Sync intervals and batch sizes 37 - - Default stream/topic settings 38 - - Rate limiting parameters 39 - - Notification preferences 40 - 41 - **Note**: The bot will work with default settings if no `botrc` file exists. 42 - 43 - ## File Descriptions 44 - 45 - ### `zuliprc` (Required) 46 - Contains Zulip API credentials for the bot. This file should **never** be committed to version control. 47 - 48 - ### `botrc` (Optional) 49 - Contains bot behavior configuration and defaults. This file can be committed to version control as it contains no secrets. 
50 - 51 - ### Template Files 52 - - `zuliprc.template` - Template for Zulip credentials 53 - - `botrc.template` - Template for bot behavior settings 54 - 55 - ## Running the Bot 56 - 57 - Once configured, run the bot with: 58 - 59 - ```bash 60 - # Run in foreground 61 - thicket bot run 62 - 63 - # Run in background (daemon mode) 64 - thicket bot run --daemon 65 - 66 - # Debug mode (sends DMs instead of stream posts) 67 - thicket bot run --debug-user your-thicket-username 68 - 69 - # Custom config paths 70 - thicket bot run --config bot-config/zuliprc --botrc bot-config/botrc 71 - ``` 72 - 73 - ## Bot Commands 74 - 75 - Once running, interact with the bot in Zulip: 76 - 77 - - `@thicket help` - Show available commands 78 - - `@thicket status` - Show bot status and configuration 79 - - `@thicket sync now` - Force immediate sync 80 - - `@thicket schedule` - Show sync schedule 81 - - `@thicket claim <username>` - Claim a thicket username 82 - - `@thicket config <setting> <value>` - Change bot settings 83 - 84 - ## Security Notes 85 - 86 - - **Never commit `zuliprc` with real credentials** 87 - - Add `bot-config/zuliprc` to `.gitignore` 88 - - The `botrc` file contains no secrets and can be safely committed 89 - - Bot settings changed via chat are stored in Zulip's persistent storage 90 - 91 - ## Troubleshooting 92 - 93 - - Check bot status: `thicket bot status` 94 - - View bot logs when running in foreground mode 95 - - Verify Zulip credentials are correct 96 - - Ensure thicket.yaml configuration exists 97 - - Test bot functionality: `thicket bot test`

-28

bot-config/botrc

···
-[bot]
-# Default RSS feed polling interval in seconds (minimum 60)
-sync_interval = 300
-
-# Maximum number of entries to post per sync cycle
-max_entries_per_sync = 10
-
-# Default stream and topic for posting (can be overridden via chat commands)
-# Leave empty to require configuration via chat
-default_stream =
-default_topic =
-
-# Rate limiting: seconds to wait between batches of posts
-rate_limit_delay = 5
-
-# Number of posts per batch before applying rate limit
-posts_per_batch = 5
-
-[catchup]
-# Number of entries to post on first run (catchup mode)
-catchup_entries = 5
-
-[notifications]
-# Whether to send notifications when bot configuration changes
-config_change_notifications = true
-
-# Whether to send notifications when users claim usernames
-username_claim_notifications = true

-34

bot-config/botrc.template

···
-[bot]
-# Default RSS feed polling interval in seconds (minimum 60)
-sync_interval = 300
-
-# Maximum number of entries to post per sync cycle (1-50)
-max_entries_per_sync = 10
-
-# Default stream and topic for posting (can be overridden via chat commands)
-# Leave empty to require configuration via chat
-default_stream =
-default_topic =
-
-# Rate limiting: seconds to wait between batches of posts
-rate_limit_delay = 5
-
-# Number of posts per batch before applying rate limit
-posts_per_batch = 5
-
-[catchup]
-# Number of entries to post on first run (catchup mode)
-catchup_entries = 5
-
-[notifications]
-# Whether to send notifications when bot configuration changes
-config_change_notifications = true
-
-# Whether to send notifications when users claim usernames
-username_claim_notifications = true
-
-# Instructions:
-# 1. Copy this file to botrc (without .template extension) to customize bot behavior
-# 2. The bot will use these defaults if no botrc file is found
-# 3. All settings can be overridden via chat commands (e.g., @mention config interval 600)
-# 4. Settings changed via chat are persisted in Zulip storage and take precedence
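The removed `botrc` files are plain INI, so the limits documented in their comments (a 60-second minimum interval, 1-50 entries per sync, empty stream meaning "configure via chat") can be illustrated with the standard library's `configparser`. The sketch below is an assumption about how such a file could be read. The diff does not show how the removed bot parsed it, and `load_botrc` and the clamping behavior are hypothetical.

```python
import configparser

# A shortened botrc with a deliberately too-small sync_interval.
BOTRC_TEXT = """\
[bot]
sync_interval = 30
max_entries_per_sync = 10
default_stream =
default_topic =

[catchup]
catchup_entries = 5

[notifications]
config_change_notifications = true
"""


def load_botrc(text: str) -> dict:
    """Parse botrc-style INI text into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    bot = parser["bot"]
    return {
        # The template documents a 60-second minimum; clamping is an assumption.
        "sync_interval": max(60, bot.getint("sync_interval", fallback=300)),
        # The template documents a 1-50 range for this setting.
        "max_entries_per_sync": min(50, max(1, bot.getint("max_entries_per_sync", fallback=10))),
        # An empty value means "configure via chat", so map it to None.
        "default_stream": bot.get("default_stream", fallback="") or None,
        "catchup_entries": parser.getint("catchup", "catchup_entries", fallback=5),
        "config_change_notifications": parser.getboolean(
            "notifications", "config_change_notifications", fallback=True
        ),
    }


settings = load_botrc(BOTRC_TEXT)
print(settings)
```

Clamping out-of-range values is one possible design. Rejecting the file with an error would be equally consistent with the comments in the template.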

-16

bot-config/zuliprc.template

···
-[api]
-# Your bot's email address (create this in Zulip Settings > Bots)
-email=your-bot@your-organization.zulipchat.com
-
-# Your bot's API key (found in Zulip Settings > Bots)
-key=YOUR_BOT_API_KEY_HERE
-
-# Your Zulip server URL
-site=https://your-organization.zulipchat.com
-
-# Instructions:
-# 1. Copy this file to zuliprc (without .template extension)
-# 2. Replace the placeholder values with your actual bot credentials
-# 3. Create a bot in your Zulip organization at Settings > Bots
-# 4. Use the bot's email and API key from the Zulip interface
-# 5. Never commit the actual zuliprc file with real credentials to version control

+260

code_duplication_analysis.md

···
+# Code Duplication Analysis for Thicket
+
+## 1. Duplicate JSON Handling Code
+
+### Pattern: JSON file reading/writing
+**Locations:**
+- `src/thicket/cli/commands/generate.py:230` - Reading JSON with `json.load(f)`
+- `src/thicket/cli/commands/generate.py:249` - Reading links.json
+- `src/thicket/cli/commands/index.py:2305` - Reading JSON
+- `src/thicket/cli/commands/index.py:2320` - Writing JSON with `json.dump()`
+- `src/thicket/cli/commands/threads.py:2456` - Reading JSON
+- `src/thicket/cli/commands/info.py:2683` - Reading JSON
+- `src/thicket/core/git_store.py:5546` - Writing JSON with custom serializer
+- `src/thicket/core/git_store.py:5556` - Reading JSON
+- `src/thicket/core/git_store.py:5566` - Writing JSON
+- `src/thicket/core/git_store.py:5656` - Writing JSON with model dump
+
+**Recommendation:** Create a shared `json_utils.py` module:
+```python
+def read_json_file(path: Path) -> dict:
+    """Read JSON file with error handling."""
+    with open(path) as f:
+        return json.load(f)
+
+def write_json_file(path: Path, data: dict, indent: int = 2) -> None:
+    """Write JSON file with consistent formatting."""
+    with open(path, "w") as f:
+        json.dump(data, f, indent=indent, default=str)
+
+def write_model_json(path: Path, model: BaseModel, indent: int = 2) -> None:
+    """Write Pydantic model as JSON."""
+    with open(path, "w") as f:
+        json.dump(model.model_dump(mode="json", exclude_none=True), f, indent=indent, default=str)
+```
+
+## 2. Repeated Datetime Handling
+
+### Pattern: datetime formatting and fallback handling
+**Locations:**
+- `src/thicket/cli/commands/generate.py:241` - `key=lambda x: x[1].updated or x[1].published or datetime.min`
+- `src/thicket/cli/commands/generate.py:353` - Same pattern in thread sorting
+- `src/thicket/cli/commands/generate.py:359` - Same pattern for max date
+- `src/thicket/cli/commands/generate.py:625` - Same pattern
+- `src/thicket/cli/commands/generate.py:655` - `entry.updated or entry.published or datetime.min`
+- `src/thicket/cli/commands/generate.py:689` - Same pattern
+- `src/thicket/cli/commands/generate.py:702` - Same pattern
+- Multiple `.strftime('%Y-%m-%d')` calls throughout
+
+**Recommendation:** Create a shared `datetime_utils.py` module:
+```python
+def get_entry_date(entry: AtomEntry) -> datetime:
+    """Get the most relevant date for an entry with fallback."""
+    return entry.updated or entry.published or datetime.min
+
+def format_date_short(dt: datetime) -> str:
+    """Format datetime as YYYY-MM-DD."""
+    return dt.strftime('%Y-%m-%d')
+
+def format_date_full(dt: datetime) -> str:
+    """Format datetime as YYYY-MM-DD HH:MM."""
+    return dt.strftime('%Y-%m-%d %H:%M')
+
+def format_date_iso(dt: datetime) -> str:
+    """Format datetime as ISO string."""
+    return dt.isoformat()
+```
+
+## 3. Path Handling Patterns
+
+### Pattern: Directory creation and existence checks
+**Locations:**
+- `src/thicket/cli/commands/generate.py:225` - `if user_dir.exists()`
+- `src/thicket/cli/commands/generate.py:247` - `if links_file.exists()`
+- `src/thicket/cli/commands/generate.py:582` - `self.output_dir.mkdir(parents=True, exist_ok=True)`
+- `src/thicket/cli/commands/generate.py:585-586` - Multiple mkdir calls
+- `src/thicket/cli/commands/threads.py:2449` - `if not index_path.exists()`
+- `src/thicket/cli/commands/info.py:2681` - `if links_path.exists()`
+- `src/thicket/core/git_store.py:5515` - `if not self.repo_path.exists()`
+- `src/thicket/core/git_store.py:5586` - `user_dir.mkdir(exist_ok=True)`
+- Many more similar patterns
+
+**Recommendation:** Create a shared `path_utils.py` module:
+```python
+def ensure_directory(path: Path) -> Path:
+    """Ensure directory exists, creating if necessary."""
+    path.mkdir(parents=True, exist_ok=True)
+    return path
+
+def read_json_if_exists(path: Path, default: Any = None) -> Any:
+    """Read JSON file if it exists, otherwise return default."""
+    if path.exists():
+        with open(path) as f:
+            return json.load(f)
+    return default
+
+def safe_path_join(*parts: Union[str, Path]) -> Path:
+    """Safely join path components."""
+    return Path(*parts)
+```
+
+## 4. Progress Bar and Console Output
+
+### Pattern: Progress bar creation and updates
+**Locations:**
+- `src/thicket/cli/commands/generate.py:209` - Progress with SpinnerColumn
+- `src/thicket/cli/commands/index.py:2230` - Same Progress pattern
+- Multiple `console.print()` calls with similar formatting patterns
+- Progress update patterns repeated
+
+**Recommendation:** Create a shared `ui_utils.py` module:
+```python
+def create_progress_spinner(description: str) -> tuple[Progress, TaskID]:
+    """Create a standard progress spinner."""
+    progress = Progress(
+        SpinnerColumn(),
+        TextColumn("[progress.description]{task.description}"),
+        transient=True,
+    )
+    task = progress.add_task(description)
+    return progress, task
+
+def print_success(message: str) -> None:
+    """Print success message with consistent formatting."""
+    console.print(f"[green]✓[/green] {message}")
+
+def print_error(message: str) -> None:
+    """Print error message with consistent formatting."""
+    console.print(f"[red]Error: {message}[/red]")
+
+def print_warning(message: str) -> None:
+    """Print warning message with consistent formatting."""
+    console.print(f"[yellow]Warning: {message}[/yellow]")
+```
+
+## 5. Git Store Operations
+
+### Pattern: Entry file operations
+**Locations:**
+- Multiple patterns of loading entries from user directories
+- Repeated safe_id generation
+- Repeated user directory path construction
+
+**Recommendation:** Enhance GitStore with helper methods:
+```python
+def get_user_dir(self, username: str) -> Path:
+    """Get user directory path."""
+    return self.repo_path / username
+
+def iter_user_entries(self, username: str) -> Iterator[tuple[Path, AtomEntry]]:
+    """Iterate over all entries for a user."""
+    user_dir = self.get_user_dir(username)
+    if user_dir.exists():
+        for entry_file in user_dir.glob("*.json"):
+            if entry_file.name not in ["index.json", "duplicates.json"]:
+                try:
+                    entry = self.read_entry_file(entry_file)
+                    yield entry_file, entry
+                except Exception:
+                    continue
+```
+
+## 6. Error Handling Patterns
+
+### Pattern: Try-except with console error printing
+**Locations:**
+- Similar error handling patterns throughout CLI commands
+- Repeated `raise typer.Exit(1)` patterns
+- Similar exception message formatting
+
+**Recommendation:** Create error handling decorators:
+```python
+def handle_cli_errors(func):
+    """Decorator to handle CLI command errors consistently."""
+    @functools.wraps(func)
+    def wrapper(*args, **kwargs):
+        try:
+            return func(*args, **kwargs)
+        except ValidationError as e:
+            console.print(f"[red]Validation error: {e}[/red]")
+            raise typer.Exit(1)
+        except Exception as e:
+            console.print(f"[red]Error: {e}[/red]")
+            if kwargs.get('verbose'):
+                console.print_exception()
+            raise typer.Exit(1)
+    return wrapper
+```
+
+## 7. Configuration and Validation
+
+### Pattern: Config file loading and validation
+**Locations:**
+- Repeated config loading pattern in every CLI command
+- Similar validation patterns for URLs and paths
+
+**Recommendation:** Create a `config_utils.py` module:
+```python
+def load_config_with_defaults(config_path: Optional[Path] = None) -> ThicketConfig:
+    """Load config with standard defaults and error handling."""
+    if config_path is None:
+        config_path = Path("thicket.yaml")
+
+    if not config_path.exists():
+        raise ConfigError(f"Configuration file not found: {config_path}")
+
+    return load_config(config_path)
+
+def validate_url(url: str) -> HttpUrl:
+    """Validate and return URL with consistent error handling."""
+    try:
+        return HttpUrl(url)
+    except ValidationError:
+        raise ConfigError(f"Invalid URL: {url}")
+```
+
+## 8. Model Serialization
+
+### Pattern: Pydantic model JSON encoding
+**Locations:**
+- Repeated `json_encoders={datetime: lambda v: v.isoformat()}` in model configs
+- Similar model_dump patterns
+
+**Recommendation:** Create base model class:
+```python
+class ThicketBaseModel(BaseModel):
+    """Base model with common configuration."""
+    model_config = ConfigDict(
+        json_encoders={datetime: lambda v: v.isoformat()},
+        str_strip_whitespace=True,
+    )
+
+    def to_json_dict(self) -> dict:
+        """Convert to JSON-serializable dict."""
+        return self.model_dump(mode="json", exclude_none=True)
+```
+
+## Summary of Refactoring Benefits
+
+1. **Reduced Code Duplication**: Eliminate 30-40% of duplicate code
+2. **Consistent Error Handling**: Standardize error messages and handling
+3. **Easier Maintenance**: Central location for common patterns
+4. **Better Testing**: Easier to unit test shared utilities
+5. **Type Safety**: Shared type hints and validation
+6. **Performance**: Potential to optimize common operations in one place
+
+## Implementation Priority
+
+1. **High Priority**:
+   - JSON utilities (used everywhere)
+   - Datetime utilities (critical for sorting and display)
+   - Error handling decorators (improves UX consistency)
+
+2. **Medium Priority**:
+   - Path utilities
+   - UI/Console utilities
+   - Config utilities
+
+3. **Low Priority**:
+   - Base model classes (requires more refactoring)
+   - Git store enhancements (already well-structured)
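One detail matters if the `updated or published or datetime.min` fallback from section 2 is centralized: Python raises `TypeError` when a timezone-aware `datetime` is compared with a naive one, and `datetime.min` is naive. Whether Thicket's entries ever carry timezone information is not visible in this diff, so the following is only a sketch of a fallback that sorts safely in either case. The `Entry` dataclass is a hypothetical stand-in for `AtomEntry`, and treating naive values as UTC is an assumption based on the removed SPEC.md's statement that timestamps are in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Timezone-aware stand-in for datetime.min, so it compares cleanly with aware values.
MIN_AWARE = datetime.min.replace(tzinfo=timezone.utc)


@dataclass
class Entry:
    """Hypothetical stand-in for AtomEntry with only the fields used here."""
    title: str
    updated: Optional[datetime] = None
    published: Optional[datetime] = None


def get_entry_date(entry: Entry) -> datetime:
    """Return updated, else published, else a minimum date, always timezone-aware."""
    dt = entry.updated or entry.published or MIN_AWARE
    # Treat naive timestamps as UTC (an assumption, see the lead-in).
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=timezone.utc)


entries = [
    Entry("naive", published=datetime(2024, 1, 15, 10, 30)),
    Entry("undated"),
    Entry("aware", updated=datetime(2024, 1, 20, 14, 22, tzinfo=timezone.utc)),
]
# Newest first; undated entries sink to the end.
ordered = sorted(entries, key=get_entry_date, reverse=True)
print([e.title for e in ordered])
```

If every stored timestamp is known to be naive, the simpler `datetime.min` fallback in the analysis works as written and this normalization is unnecessary.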

+5 -11

pyproject.toml

···
     "platformdirs>=4.0.0",
     "pyyaml>=6.0.0",
     "email_validator",
-    "typesense>=1.1.1",
-    "zulip>=0.9.0",
-    "zulip-bots>=0.9.0",
-    "importlib-metadata>=8.7.0",
-    "markdownify>=1.2.0",
+    "jinja2>=3.1.6",
 ]

 [project.optional-dependencies]
···
     "-ra",
     "--strict-markers",
     "--strict-config",
+    "--cov=src/thicket",
+    "--cov-report=term-missing",
+    "--cov-report=html",
+    "--cov-report=xml",
 ]
 filterwarnings = [
     "error",
···
     "class .*\\bProtocol\\):",
     "@(abc\\.)?abstractmethod",
 ]
-
-[dependency-groups]
-dev = [
-    "mypy>=1.17.0",
-    "pytest>=8.4.1",
-]

+6617

repomix-output.xml

···
This file is a merged representation of the entire codebase, combined into a single document by Repomix.

<file_summary>
This section contains a summary of this file.

<purpose>
This file contains a packed representation of the entire repository's contents.
It is designed to be easily consumable by AI systems for analysis, code review,
or other automated processes.
</purpose>

<file_format>
The content is organized as follows:
1. This summary section
2. Repository information
3. Directory structure
4. Repository files (if enabled)
5. Multiple file entries, each consisting of:
  - File path as an attribute
  - Full contents of the file
</file_format>

<usage_guidelines>
- This file should be treated as read-only. Any changes should be made to the
  original repository files, not this packed version.
- When processing this file, use the file path to distinguish
  between different files in the repository.
- Be aware that this file may contain sensitive information. Handle it with
  the same level of security as you would the original repository.
</usage_guidelines>

<notes>
- Some files may have been excluded based on .gitignore rules and Repomix's configuration
- Binary files are not included in this packed representation. Please refer to the Repository Structure section for a complete list of file paths, including binary files
- Files matching patterns in .gitignore are excluded
- Files matching default ignore patterns are excluded
- Files are sorted by Git change count (files with more changes are at the bottom)
</notes>

</file_summary>

<directory_structure>
.claude/
  settings.local.json
src/
  thicket/
    cli/
      commands/
        __init__.py
        add.py
        duplicates.py
        generate.py
        index_cmd.py
        info_cmd.py
        init.py
        links_cmd.py
        list_cmd.py
        sync.py
      __init__.py
      main.py
      utils.py
    core/
      __init__.py
      feed_parser.py
      git_store.py
      reference_parser.py
    models/
      __init__.py
      config.py
      feed.py
      user.py
    templates/
      base.html
      index.html
      links.html
      script.js
      style.css
      timeline.html
      users.html
    utils/
      __init__.py
    __init__.py
    __main__.py
.gitignore
ARCH.md
CLAUDE.md
pyproject.toml
README.md
</directory_structure>

<files>
This section contains the contents of the repository's files.
93 + 94 + <file path=".claude/settings.local.json"> 95 + { 96 + "permissions": { 97 + "allow": [ 98 + "Bash(find:*)", 99 + "Bash(uv run:*)", 100 + "Bash(grep:*)", 101 + "Bash(jq:*)", 102 + "Bash(git add:*)", 103 + "Bash(ls:*)" 104 + ] 105 + }, 106 + "enableAllProjectMcpServers": false 107 + } 108 + </file> 109 + 110 + <file path="src/thicket/cli/commands/generate.py"> 111 + """Generate static HTML website from thicket data.""" 112 + 113 + import base64 114 + import json 115 + import re 116 + import shutil 117 + from datetime import datetime 118 + from pathlib import Path 119 + from typing import Any, Optional, TypedDict, Union 120 + 121 + import typer 122 + from jinja2 import Environment, FileSystemLoader, select_autoescape 123 + from rich.progress import Progress, SpinnerColumn, TextColumn 124 + 125 + from ...core.git_store import GitStore 126 + from ...models.feed import AtomEntry 127 + from ...models.user import GitStoreIndex, UserMetadata 128 + from ..main import app 129 + from ..utils import console, load_config 130 + 131 + 132 + class UserData(TypedDict): 133 + """Type definition for user data structure.""" 134 + 135 + metadata: UserMetadata 136 + recent_entries: list[tuple[str, AtomEntry]] 137 + 138 + 139 + def safe_anchor_id(atom_id: str) -> str: 140 + """Convert an Atom ID to a safe HTML anchor ID.""" 141 + # Use base64 URL-safe encoding without padding 142 + encoded = base64.urlsafe_b64encode(atom_id.encode('utf-8')).decode('ascii').rstrip('=') 143 + # Prefix with 'id' to ensure it starts with a letter (HTML requirement) 144 + return f"id{encoded}" 145 + 146 + 147 + class WebsiteGenerator: 148 + """Generate static HTML website from thicket data.""" 149 + 150 + def __init__(self, git_store: GitStore, output_dir: Path): 151 + self.git_store = git_store 152 + self.output_dir = output_dir 153 + self.template_dir = Path(__file__).parent.parent.parent / "templates" 154 + 155 + # Initialize Jinja2 environment 156 + self.env = Environment( 157 + 
loader=FileSystemLoader(self.template_dir), 158 + autoescape=select_autoescape(["html", "xml"]), 159 + ) 160 + 161 + # Data containers 162 + self.index: Optional[GitStoreIndex] = None 163 + self.entries: list[tuple[str, AtomEntry]] = [] # (username, entry) 164 + self.links_data: Optional[dict[str, Any]] = None 165 + self.threads: list[list[dict[str, Any]]] = [] # List of threads with metadata 166 + 167 + def get_display_name(self, username: str) -> str: 168 + """Get display name for a user, falling back to username.""" 169 + if self.index and username in self.index.users: 170 + user = self.index.users[username] 171 + return user.display_name or username 172 + return username 173 + 174 + def get_user_homepage(self, username: str) -> Optional[str]: 175 + """Get homepage URL for a user.""" 176 + if self.index and username in self.index.users: 177 + user = self.index.users[username] 178 + return str(user.homepage) if user.homepage else None 179 + return None 180 + 181 + def clean_html_summary(self, content: Optional[str], max_length: int = 200) -> str: 182 + """Clean HTML content and truncate for display in timeline.""" 183 + if not content: 184 + return "" 185 + 186 + # Remove HTML tags 187 + clean_text = re.sub(r"<[^>]+>", " ", content) 188 + # Replace multiple whitespace with single space 189 + clean_text = re.sub(r"\s+", " ", clean_text) 190 + # Strip leading/trailing whitespace 191 + clean_text = clean_text.strip() 192 + 193 + # Truncate with ellipsis if needed 194 + if len(clean_text) > max_length: 195 + # Try to break at word boundary 196 + truncated = clean_text[:max_length] 197 + last_space = truncated.rfind(" ") 198 + if ( 199 + last_space > max_length * 0.8 200 + ): # If we can break reasonably close to the limit 201 + clean_text = truncated[:last_space] + "..." 202 + else: 203 + clean_text = truncated + "..." 
204 + 205 + return clean_text 206 + 207 + def load_data(self) -> None: 208 + """Load all data from the git repository.""" 209 + with Progress( 210 + SpinnerColumn(), 211 + TextColumn("[progress.description]{task.description}"), 212 + console=console, 213 + ) as progress: 214 + # Load index 215 + task = progress.add_task("Loading repository index...", total=None) 216 + self.index = self.git_store._load_index() 217 + if not self.index: 218 + raise ValueError("No index found in repository") 219 + progress.update(task, completed=True) 220 + 221 + # Load all entries 222 + task = progress.add_task("Loading entries...", total=None) 223 + for username, user_metadata in self.index.users.items(): 224 + user_dir = self.git_store.repo_path / user_metadata.directory 225 + if user_dir.exists(): 226 + for entry_file in user_dir.glob("*.json"): 227 + if entry_file.name not in ["index.json", "duplicates.json"]: 228 + try: 229 + with open(entry_file) as f: 230 + entry_data = json.load(f) 231 + entry = AtomEntry(**entry_data) 232 + self.entries.append((username, entry)) 233 + except Exception as e: 234 + console.print( 235 + f"[yellow]Warning: Failed to load {entry_file}: {e}[/yellow]" 236 + ) 237 + progress.update(task, completed=True) 238 + 239 + # Sort entries by date (newest first) - prioritize updated over published 240 + self.entries.sort( 241 + key=lambda x: x[1].updated or x[1].published or datetime.min, reverse=True 242 + ) 243 + 244 + # Load links data 245 + task = progress.add_task("Loading links and references...", total=None) 246 + links_file = self.git_store.repo_path / "links.json" 247 + if links_file.exists(): 248 + with open(links_file) as f: 249 + self.links_data = json.load(f) 250 + progress.update(task, completed=True) 251 + 252 + def build_threads(self) -> None: 253 + """Build threaded conversations from references.""" 254 + if not self.links_data or "references" not in self.links_data: 255 + return 256 + 257 + # Map entry IDs to (username, entry) tuples 258 + 
entry_map: dict[str, tuple[str, AtomEntry]] = {} 259 + for username, entry in self.entries: 260 + entry_map[entry.id] = (username, entry) 261 + 262 + # Build adjacency lists for references 263 + self.outbound_refs: dict[str, set[str]] = {} 264 + self.inbound_refs: dict[str, set[str]] = {} 265 + self.reference_details: dict[ 266 + str, list[dict[str, Any]] 267 + ] = {} # Store full reference info 268 + 269 + for ref in self.links_data["references"]: 270 + source_id = ref["source_entry_id"] 271 + target_id = ref.get("target_entry_id") 272 + 273 + if target_id and source_id in entry_map and target_id in entry_map: 274 + self.outbound_refs.setdefault(source_id, set()).add(target_id) 275 + self.inbound_refs.setdefault(target_id, set()).add(source_id) 276 + 277 + # Store reference details for UI 278 + self.reference_details.setdefault(source_id, []).append( 279 + { 280 + "target_id": target_id, 281 + "target_username": ref.get("target_username"), 282 + "type": "outbound", 283 + } 284 + ) 285 + self.reference_details.setdefault(target_id, []).append( 286 + { 287 + "source_id": source_id, 288 + "source_username": ref.get("source_username"), 289 + "type": "inbound", 290 + } 291 + ) 292 + 293 + # Find conversation threads (multi-post discussions) 294 + processed = set() 295 + 296 + for entry_id, (_username, _entry) in entry_map.items(): 297 + if entry_id in processed: 298 + continue 299 + 300 + # Build thread starting from this entry 301 + thread = [] 302 + to_visit = [entry_id] 303 + thread_ids = set() 304 + level_map: dict[str, int] = {} # Track levels for this thread 305 + 306 + # First, traverse up to find the root 307 + current = entry_id 308 + while current in self.inbound_refs: 309 + parents = self.inbound_refs[current] - { 310 + current 311 + } # Exclude self-references 312 + if not parents: 313 + break 314 + # Take the first parent 315 + parent = next(iter(parents)) 316 + if parent in thread_ids: # Avoid cycles 317 + break 318 + current = parent 319 + 
to_visit.insert(0, current) 320 + 321 + # Now traverse down from the root 322 + while to_visit: 323 + current = to_visit.pop(0) 324 + if current in thread_ids or current not in entry_map: 325 + continue 326 + 327 + thread_ids.add(current) 328 + username, entry = entry_map[current] 329 + 330 + # Calculate thread level 331 + thread_level = self._calculate_thread_level(current, level_map) 332 + 333 + # Add threading metadata 334 + thread_entry = { 335 + "username": username, 336 + "display_name": self.get_display_name(username), 337 + "entry": entry, 338 + "entry_id": current, 339 + "references_to": list(self.outbound_refs.get(current, [])), 340 + "referenced_by": list(self.inbound_refs.get(current, [])), 341 + "thread_level": thread_level, 342 + } 343 + thread.append(thread_entry) 344 + processed.add(current) 345 + 346 + # Add children 347 + if current in self.outbound_refs: 348 + children = self.outbound_refs[current] - thread_ids # Avoid cycles 349 + to_visit.extend(sorted(children)) 350 + 351 + if len(thread) > 1: # Only keep actual threads 352 + # Sort thread by date (newest first) - prioritize updated over published 353 + thread.sort(key=lambda x: x["entry"].updated or x["entry"].published or datetime.min, reverse=True) # type: ignore 354 + self.threads.append(thread) 355 + 356 + # Sort threads by the date of their most recent entry - prioritize updated over published 357 + self.threads.sort( 358 + key=lambda t: max( 359 + item["entry"].updated or item["entry"].published or datetime.min for item in t 360 + ), 361 + reverse=True, 362 + ) 363 + 364 + def _calculate_thread_level( 365 + self, entry_id: str, processed_entries: dict[str, int] 366 + ) -> int: 367 + """Calculate indentation level for threaded display.""" 368 + if entry_id in processed_entries: 369 + return processed_entries[entry_id] 370 + 371 + if entry_id not in self.inbound_refs: 372 + processed_entries[entry_id] = 0 373 + return 0 374 + 375 + parents_in_thread = self.inbound_refs[entry_id] & 
set(processed_entries.keys()) 376 + if not parents_in_thread: 377 + processed_entries[entry_id] = 0 378 + return 0 379 + 380 + # Find the deepest parent level + 1 381 + max_parent_level = 0 382 + for parent_id in parents_in_thread: 383 + parent_level = self._calculate_thread_level(parent_id, processed_entries) 384 + max_parent_level = max(max_parent_level, parent_level) 385 + 386 + level = min(max_parent_level + 1, 4) # Cap at level 4 387 + processed_entries[entry_id] = level 388 + return level 389 + 390 + def get_standalone_references(self) -> list[dict[str, Any]]: 391 + """Get posts that have references but aren't part of multi-post threads.""" 392 + if not hasattr(self, "reference_details"): 393 + return [] 394 + 395 + threaded_entry_ids = set() 396 + for thread in self.threads: 397 + for item in thread: 398 + threaded_entry_ids.add(item["entry_id"]) 399 + 400 + standalone_refs = [] 401 + for username, entry in self.entries: 402 + if ( 403 + entry.id in self.reference_details 404 + and entry.id not in threaded_entry_ids 405 + ): 406 + refs = self.reference_details[entry.id] 407 + # Only include if it has meaningful references (not just self-references) 408 + meaningful_refs = [ 409 + r 410 + for r in refs 411 + if r.get("target_id") != entry.id and r.get("source_id") != entry.id 412 + ] 413 + if meaningful_refs: 414 + standalone_refs.append( 415 + { 416 + "username": username, 417 + "display_name": self.get_display_name(username), 418 + "entry": entry, 419 + "references": meaningful_refs, 420 + } 421 + ) 422 + 423 + return standalone_refs 424 + 425 + def _add_cross_thread_links(self, timeline_items: list[dict[str, Any]]) -> None: 426 + """Add cross-thread linking for entries that appear in multiple threads.""" 427 + # Map entry IDs to their positions in the timeline 428 + entry_positions: dict[str, list[int]] = {} 429 + # Map URLs referenced by entries to the entries that reference them 430 + url_references: dict[str, list[tuple[str, int]]] = {} # url -> 
[(entry_id, position)] 431 + 432 + # First pass: collect all entry IDs, their positions, and referenced URLs 433 + for i, item in enumerate(timeline_items): 434 + if item["type"] == "post": 435 + entry_id = item["content"]["entry"].id 436 + entry_positions.setdefault(entry_id, []).append(i) 437 + # Track URLs this entry references 438 + if entry_id in self.reference_details: 439 + for ref in self.reference_details[entry_id]: 440 + if ref["type"] == "outbound" and "target_id" in ref: 441 + # Find the target entry's URL if available 442 + target_entry = self._find_entry_by_id(ref["target_id"]) 443 + if target_entry and target_entry.link: 444 + url = str(target_entry.link) 445 + url_references.setdefault(url, []).append((entry_id, i)) 446 + elif item["type"] == "thread": 447 + for thread_item in item["content"]: 448 + entry_id = thread_item["entry"].id 449 + entry_positions.setdefault(entry_id, []).append(i) 450 + # Track URLs this entry references 451 + if entry_id in self.reference_details: 452 + for ref in self.reference_details[entry_id]: 453 + if ref["type"] == "outbound" and "target_id" in ref: 454 + target_entry = self._find_entry_by_id(ref["target_id"]) 455 + if target_entry and target_entry.link: 456 + url = str(target_entry.link) 457 + url_references.setdefault(url, []).append((entry_id, i)) 458 + 459 + # Build cross-thread connections - only for entries that actually appear multiple times 460 + cross_thread_connections: dict[str, set[int]] = {} # entry_id -> set of timeline positions 461 + 462 + # Add connections ONLY for entries that appear multiple times in the timeline 463 + for entry_id, positions in entry_positions.items(): 464 + if len(positions) > 1: 465 + cross_thread_connections[entry_id] = set(positions) 466 + # Debug: uncomment to see which entries have multiple appearances 467 + # print(f"Entry {entry_id[:50]}... 
appears at positions: {positions}") 468 + 469 + # Apply cross-thread links to timeline items 470 + for entry_id, positions_set in cross_thread_connections.items(): 471 + positions_list = list(positions_set) 472 + for pos in positions_list: 473 + item = timeline_items[pos] 474 + other_positions = sorted([p for p in positions_list if p != pos]) 475 + 476 + if item["type"] == "post": 477 + # Add cross-thread info to individual posts 478 + item["content"]["cross_thread_links"] = self._build_cross_thread_link_data(entry_id, other_positions, timeline_items) 479 + # Add info about shared references 480 + item["content"]["shared_references"] = self._get_shared_references(entry_id, positions_set, timeline_items) 481 + elif item["type"] == "thread": 482 + # Add cross-thread info to thread items 483 + for thread_item in item["content"]: 484 + if thread_item["entry"].id == entry_id: 485 + thread_item["cross_thread_links"] = self._build_cross_thread_link_data(entry_id, other_positions, timeline_items) 486 + thread_item["shared_references"] = self._get_shared_references(entry_id, positions_set, timeline_items) 487 + break 488 + 489 + def _build_cross_thread_link_data(self, entry_id: str, other_positions: list[int], timeline_items: list[dict[str, Any]]) -> list[dict[str, Any]]: 490 + """Build detailed cross-thread link data with anchor information.""" 491 + cross_thread_links = [] 492 + 493 + for pos in other_positions: 494 + item = timeline_items[pos] 495 + if item["type"] == "post": 496 + # For individual posts 497 + safe_id = safe_anchor_id(entry_id) 498 + cross_thread_links.append({ 499 + "position": pos, 500 + "anchor_id": f"post-{pos}-{safe_id}", 501 + "context": "individual post", 502 + "title": item["content"]["entry"].title 503 + }) 504 + elif item["type"] == "thread": 505 + # For thread items, find the specific thread item 506 + for thread_idx, thread_item in enumerate(item["content"]): 507 + if thread_item["entry"].id == entry_id: 508 + safe_id = 
safe_anchor_id(entry_id) 509 + cross_thread_links.append({ 510 + "position": pos, 511 + "anchor_id": f"post-{pos}-{thread_idx}-{safe_id}", 512 + "context": f"thread (level {thread_item.get('thread_level', 0)})", 513 + "title": thread_item["entry"].title 514 + }) 515 + break 516 + 517 + return cross_thread_links 518 + 519 + def _find_entry_by_id(self, entry_id: str) -> Optional[AtomEntry]: 520 + """Find an entry by its ID.""" 521 + for _username, entry in self.entries: 522 + if entry.id == entry_id: 523 + return entry 524 + return None 525 + 526 + def _get_shared_references(self, entry_id: str, positions: Union[set[int], list[int]], timeline_items: list[dict[str, Any]]) -> list[dict[str, Any]]: 527 + """Get information about shared references between cross-thread entries.""" 528 + shared_refs = [] 529 + 530 + # Collect all referenced URLs from entries at these positions 531 + url_counts: dict[str, int] = {} 532 + referencing_entries: dict[str, list[str]] = {} # url -> [entry_ids] 533 + 534 + for pos in positions: 535 + item = timeline_items[pos] 536 + entries_to_check = [] 537 + 538 + if item["type"] == "post": 539 + entries_to_check.append(item["content"]["entry"]) 540 + elif item["type"] == "thread": 541 + entries_to_check.extend([ti["entry"] for ti in item["content"]]) 542 + 543 + for entry in entries_to_check: 544 + if entry.id in self.reference_details: 545 + for ref in self.reference_details[entry.id]: 546 + if ref["type"] == "outbound" and "target_id" in ref: 547 + target_entry = self._find_entry_by_id(ref["target_id"]) 548 + if target_entry and target_entry.link: 549 + url = str(target_entry.link) 550 + url_counts[url] = url_counts.get(url, 0) + 1 551 + if url not in referencing_entries: 552 + referencing_entries[url] = [] 553 + if entry.id not in referencing_entries[url]: 554 + referencing_entries[url].append(entry.id) 555 + 556 + # Find URLs referenced by multiple entries 557 + for url, count in url_counts.items(): 558 + if count > 1 and 
len(referencing_entries[url]) > 1: 559 + # Get the target entry info 560 + target_entry = None 561 + target_username = None 562 + for ref in (self.links_data or {}).get("references", []): 563 + if ref.get("target_url") == url: 564 + target_username = ref.get("target_username") 565 + if ref.get("target_entry_id"): 566 + target_entry = self._find_entry_by_id(ref["target_entry_id"]) 567 + break 568 + 569 + shared_refs.append({ 570 + "url": url, 571 + "count": count, 572 + "referencing_entries": referencing_entries[url], 573 + "target_username": target_username, 574 + "target_title": target_entry.title if target_entry else None 575 + }) 576 + 577 + return sorted(shared_refs, key=lambda x: x["count"], reverse=True) 578 + 579 + def generate_site(self) -> None: 580 + """Generate the static website.""" 581 + # Create output directory 582 + self.output_dir.mkdir(parents=True, exist_ok=True) 583 + 584 + # Create static directories 585 + (self.output_dir / "css").mkdir(exist_ok=True) 586 + (self.output_dir / "js").mkdir(exist_ok=True) 587 + 588 + # Generate CSS 589 + css_template = self.env.get_template("style.css") 590 + css_content = css_template.render() 591 + with open(self.output_dir / "css" / "style.css", "w") as f: 592 + f.write(css_content) 593 + 594 + # Generate JavaScript 595 + js_template = self.env.get_template("script.js") 596 + js_content = js_template.render() 597 + with open(self.output_dir / "js" / "script.js", "w") as f: 598 + f.write(js_content) 599 + 600 + # Prepare common template data 601 + base_data = { 602 + "title": "Energy & Environment Group", 603 + "generated_at": datetime.now().isoformat(), 604 + "get_display_name": self.get_display_name, 605 + "get_user_homepage": self.get_user_homepage, 606 + "clean_html_summary": self.clean_html_summary, 607 + "safe_anchor_id": safe_anchor_id, 608 + } 609 + 610 + # Build unified timeline 611 + timeline_items = [] 612 + 613 + # Only consider the threads that will actually be displayed 614 + displayed_threads = 
self.threads[:20] # Limit to 20 threads 615 + 616 + # Track which entries are part of displayed threads 617 + threaded_entry_ids = set() 618 + for thread in displayed_threads: 619 + for item in thread: 620 + threaded_entry_ids.add(item["entry_id"]) 621 + 622 + # Add threads to timeline (using the date of the most recent post) 623 + for thread in displayed_threads: 624 + most_recent_date = max( 625 + item["entry"].updated or item["entry"].published or datetime.min 626 + for item in thread 627 + ) 628 + timeline_items.append({ 629 + "type": "thread", 630 + "date": most_recent_date, 631 + "content": thread 632 + }) 633 + 634 + # Add individual posts (not in threads) 635 + for username, entry in self.entries[:50]: 636 + if entry.id not in threaded_entry_ids: 637 + # Check if this entry has references 638 + has_refs = ( 639 + entry.id in self.reference_details 640 + if hasattr(self, "reference_details") 641 + else False 642 + ) 643 + 644 + refs = [] 645 + if has_refs: 646 + refs = self.reference_details.get(entry.id, []) 647 + refs = [ 648 + r for r in refs 649 + if r.get("target_id") != entry.id 650 + and r.get("source_id") != entry.id 651 + ] 652 + 653 + timeline_items.append({ 654 + "type": "post", 655 + "date": entry.updated or entry.published or datetime.min, 656 + "content": { 657 + "username": username, 658 + "display_name": self.get_display_name(username), 659 + "entry": entry, 660 + "references": refs if refs else None 661 + } 662 + }) 663 + 664 + # Sort unified timeline by date (newest first) 665 + timeline_items.sort(key=lambda x: x["date"], reverse=True) 666 + 667 + # Limit timeline to what will actually be rendered 668 + timeline_items = timeline_items[:50] # Limit to 50 items total 669 + 670 + # Add cross-thread linking for repeat blog references 671 + self._add_cross_thread_links(timeline_items) 672 + 673 + # Prepare outgoing links data 674 + outgoing_links = [] 675 + if self.links_data and "links" in self.links_data: 676 + for url, link_info in 
self.links_data["links"].items(): 677 + referencing_entries = [] 678 + for entry_id in link_info.get("referencing_entries", []): 679 + for username, entry in self.entries: 680 + if entry.id == entry_id: 681 + referencing_entries.append( 682 + (self.get_display_name(username), entry) 683 + ) 684 + break 685 + 686 + if referencing_entries: 687 + # Sort by date - prioritize updated over published 688 + referencing_entries.sort( 689 + key=lambda x: x[1].updated or x[1].published or datetime.min, reverse=True 690 + ) 691 + outgoing_links.append( 692 + { 693 + "url": url, 694 + "target_username": link_info.get("target_username"), 695 + "entries": referencing_entries, 696 + } 697 + ) 698 + 699 + # Sort links by most recent reference - prioritize updated over published 700 + outgoing_links.sort( 701 + key=lambda x: x["entries"][0][1].updated 702 + or x["entries"][0][1].published or datetime.min, 703 + reverse=True, 704 + ) 705 + 706 + # Prepare users data 707 + users: list[UserData] = [] 708 + if self.index: 709 + for username, user_metadata in self.index.users.items(): 710 + # Get recent entries for this user with display names 711 + user_entries = [ 712 + (self.get_display_name(u), e) 713 + for u, e in self.entries 714 + if u == username 715 + ][:5] 716 + users.append( 717 + {"metadata": user_metadata, "recent_entries": user_entries} 718 + ) 719 + # Sort by entry count 720 + users.sort(key=lambda x: x["metadata"].entry_count, reverse=True) 721 + 722 + # Generate timeline page 723 + timeline_template = self.env.get_template("timeline.html") 724 + timeline_content = timeline_template.render( 725 + **base_data, 726 + page="timeline", 727 + timeline_items=timeline_items, # Already limited above 728 + ) 729 + with open(self.output_dir / "timeline.html", "w") as f: 730 + f.write(timeline_content) 731 + 732 + # Generate links page 733 + links_template = self.env.get_template("links.html") 734 + links_content = links_template.render( 735 + **base_data, 736 + page="links", 737 + 
outgoing_links=outgoing_links[:100], 738 + ) 739 + with open(self.output_dir / "links.html", "w") as f: 740 + f.write(links_content) 741 + 742 + # Generate users page 743 + users_template = self.env.get_template("users.html") 744 + users_content = users_template.render( 745 + **base_data, 746 + page="users", 747 + users=users, 748 + ) 749 + with open(self.output_dir / "users.html", "w") as f: 750 + f.write(users_content) 751 + 752 + # Generate main index page (redirect to timeline) 753 + index_template = self.env.get_template("index.html") 754 + index_content = index_template.render(**base_data) 755 + with open(self.output_dir / "index.html", "w") as f: 756 + f.write(index_content) 757 + 758 + console.print(f"[green]✓[/green] Generated website at {self.output_dir}") 759 + console.print(f" - {len(self.entries)} entries") 760 + console.print(f" - {len(self.threads)} conversation threads") 761 + console.print(f" - {len(outgoing_links)} outgoing links") 762 + console.print(f" - {len(users)} users") 763 + console.print( 764 + " - Generated pages: index.html, timeline.html, links.html, users.html" 765 + ) 766 + 767 + 768 + @app.command() 769 + def generate( 770 + output: Path = typer.Option( 771 + Path("./thicket-site"), 772 + "--output", 773 + "-o", 774 + help="Output directory for the generated website", 775 + ), 776 + force: bool = typer.Option( 777 + False, "--force", "-f", help="Overwrite existing output directory" 778 + ), 779 + config_file: Path = typer.Option( 780 + Path("thicket.yaml"), "--config", help="Configuration file path" 781 + ), 782 + ) -> None: 783 + """Generate a static HTML website from thicket data.""" 784 + config = load_config(config_file) 785 + 786 + if not config.git_store: 787 + console.print("[red]No git store path configured[/red]") 788 + raise typer.Exit(1) 789 + 790 + git_store = GitStore(config.git_store) 791 + 792 + # Check if output directory exists 793 + if output.exists() and not force: 794 + console.print( 795 + f"[red]Output 
directory {output} already exists. Use --force to overwrite.[/red]" 796 + ) 797 + raise typer.Exit(1) 798 + 799 + # Clean output directory if forcing 800 + if output.exists() and force: 801 + shutil.rmtree(output) 802 + 803 + try: 804 + generator = WebsiteGenerator(git_store, output) 805 + 806 + console.print("[bold]Generating static website...[/bold]") 807 + generator.load_data() 808 + generator.build_threads() 809 + generator.generate_site() 810 + 811 + except Exception as e: 812 + console.print(f"[red]Error generating website: {e}[/red]") 813 + raise typer.Exit(1) from e 814 + </file> 815 + 816 + <file path="src/thicket/templates/base.html"> 817 + <!DOCTYPE html> 818 + <html lang="en"> 819 + <head> 820 + <meta charset="UTF-8"> 821 + <meta name="viewport" content="width=device-width, initial-scale=1.0"> 822 + <title>{% block page_title %}{{ title }}{% endblock %}</title> 823 + <link rel="stylesheet" href="css/style.css"> 824 + </head> 825 + <body> 826 + <header class="site-header"> 827 + <div class="header-content"> 828 + <h1 class="site-title">{{ title }}</h1> 829 + <nav class="site-nav"> 830 + <a href="timeline.html" class="nav-link {% if page == 'timeline' %}active{% endif %}">Timeline</a> 831 + <a href="links.html" class="nav-link {% if page == 'links' %}active{% endif %}">Links</a> 832 + <a href="users.html" class="nav-link {% if page == 'users' %}active{% endif %}">Users</a> 833 + </nav> 834 + </div> 835 + </header> 836 + 837 + <main class="main-content"> 838 + {% block content %}{% endblock %} 839 + </main> 840 + 841 + <footer class="site-footer"> 842 + <p>Generated on {{ generated_at }} by <a href="https://github.com/avsm/thicket">Thicket</a></p> 843 + </footer> 844 + 845 + <script src="js/script.js"></script> 846 + </body> 847 + </html> 848 + </file> 849 + 850 + <file path="src/thicket/templates/index.html"> 851 + <!DOCTYPE html> 852 + <html lang="en"> 853 + <head> 854 + <meta charset="UTF-8"> 855 + <meta name="viewport" content="width=device-width, 
initial-scale=1.0"> 856 + <title>{{ title }}</title> 857 + <meta http-equiv="refresh" content="0; url=timeline.html"> 858 + <link rel="canonical" href="timeline.html"> 859 + </head> 860 + <body> 861 + <p>Redirecting to <a href="timeline.html">Timeline</a>...</p> 862 + </body> 863 + </html> 864 + </file> 865 + 866 + <file path="src/thicket/templates/links.html"> 867 + {% extends "base.html" %} 868 + 869 + {% block page_title %}Outgoing Links - {{ title }}{% endblock %} 870 + 871 + {% block content %} 872 + <div class="page-content"> 873 + <h2>Outgoing Links</h2> 874 + <p class="page-description">External links referenced in blog posts, ordered by most recent reference.</p> 875 + 876 + {% for link in outgoing_links %} 877 + <article class="link-group"> 878 + <h3 class="link-url"> 879 + <a href="{{ link.url }}" target="_blank">{{ link.url|truncate(80) }}</a> 880 + {% if link.target_username %} 881 + <span class="target-user">({{ link.target_username }})</span> 882 + {% endif %} 883 + </h3> 884 + <div class="referencing-entries"> 885 + <span class="ref-count">Referenced in {{ link.entries|length }} post(s):</span> 886 + <ul> 887 + {% for display_name, entry in link.entries[:5] %} 888 + <li> 889 + <span class="author">{{ display_name }}</span> - 890 + <a href="{{ entry.link }}" target="_blank">{{ entry.title }}</a> 891 + <time datetime="{{ entry.updated or entry.published }}"> 892 + ({{ (entry.updated or entry.published).strftime('%Y-%m-%d') }}) 893 + </time> 894 + </li> 895 + {% endfor %} 896 + {% if link.entries|length > 5 %} 897 + <li class="more">... 
and {{ link.entries|length - 5 }} more</li> 898 + {% endif %} 899 + </ul> 900 + </div> 901 + </article> 902 + {% endfor %} 903 + </div> 904 + {% endblock %} 905 + </file> 906 + 907 + <file path="src/thicket/templates/script.js"> 908 + // Enhanced functionality for thicket website 909 + document.addEventListener('DOMContentLoaded', function() { 910 + 911 + // Enhance thread collapsing (optional feature) 912 + const threadHeaders = document.querySelectorAll('.thread-header'); 913 + threadHeaders.forEach(header => { 914 + header.style.cursor = 'pointer'; 915 + header.addEventListener('click', function() { 916 + const thread = this.parentElement; 917 + const entries = thread.querySelectorAll('.thread-entry'); 918 + 919 + // Toggle visibility of all but the first entry 920 + for (let i = 1; i < entries.length; i++) { 921 + entries[i].style.display = entries[i].style.display === 'none' ? 'block' : 'none'; 922 + } 923 + 924 + // Update thread count text 925 + const count = this.querySelector('.thread-count'); 926 + if (entries[1] && entries[1].style.display === 'none') { 927 + count.textContent = count.textContent.replace('posts', 'posts (collapsed)'); 928 + } else { 929 + count.textContent = count.textContent.replace(' (collapsed)', ''); 930 + } 931 + }); 932 + }); 933 + 934 + // Add relative time display 935 + const timeElements = document.querySelectorAll('time'); 936 + timeElements.forEach(timeEl => { 937 + const datetime = new Date(timeEl.getAttribute('datetime')); 938 + const now = new Date(); 939 + const diffMs = now - datetime; 940 + const diffDays = Math.floor(diffMs / (1000 * 60 * 60 * 24)); 941 + 942 + let relativeTime; 943 + if (diffDays === 0) { 944 + const diffHours = Math.floor(diffMs / (1000 * 60 * 60)); 945 + if (diffHours === 0) { 946 + const diffMinutes = Math.floor(diffMs / (1000 * 60)); 947 + relativeTime = diffMinutes === 0 ? 
'just now' : `${diffMinutes}m ago`; 948 + } else { 949 + relativeTime = `${diffHours}h ago`; 950 + } 951 + } else if (diffDays === 1) { 952 + relativeTime = 'yesterday'; 953 + } else if (diffDays < 7) { 954 + relativeTime = `${diffDays}d ago`; 955 + } else if (diffDays < 30) { 956 + const weeks = Math.floor(diffDays / 7); 957 + relativeTime = weeks === 1 ? '1w ago' : `${weeks}w ago`; 958 + } else if (diffDays < 365) { 959 + const months = Math.floor(diffDays / 30); 960 + relativeTime = months === 1 ? '1mo ago' : `${months}mo ago`; 961 + } else { 962 + const years = Math.floor(diffDays / 365); 963 + relativeTime = years === 1 ? '1y ago' : `${years}y ago`; 964 + } 965 + 966 + // Add relative time as title attribute 967 + timeEl.setAttribute('title', timeEl.textContent); 968 + timeEl.textContent = relativeTime; 969 + }); 970 + 971 + // Enhanced anchor link scrolling for shared references 972 + document.querySelectorAll('a[href^="#"]').forEach(anchor => { 973 + anchor.addEventListener('click', function (e) { 974 + e.preventDefault(); 975 + const target = document.querySelector(this.getAttribute('href')); 976 + if (target) { 977 + target.scrollIntoView({ 978 + behavior: 'smooth', 979 + block: 'center' 980 + }); 981 + 982 + // Highlight the target briefly 983 + const timelineEntry = target.closest('.timeline-entry'); 984 + if (timelineEntry) { 985 + timelineEntry.style.outline = '2px solid var(--primary-color)'; 986 + timelineEntry.style.borderRadius = '8px'; 987 + setTimeout(() => { 988 + timelineEntry.style.outline = ''; 989 + timelineEntry.style.borderRadius = ''; 990 + }, 2000); 991 + } 992 + } 993 + }); 994 + }); 995 + }); 996 + </file> 997 + 998 + <file path="src/thicket/templates/style.css"> 999 + /* Modern, clean design with high-density text and readable theme */ 1000 + 1001 + :root { 1002 + --primary-color: #2c3e50; 1003 + --secondary-color: #3498db; 1004 + --accent-color: #e74c3c; 1005 + --background: #ffffff; 1006 + --surface: #f8f9fa; 1007 + --text-primary: 
#2c3e50; 1008 + --text-secondary: #7f8c8d; 1009 + --border-color: #e0e0e0; 1010 + --thread-indent: 20px; 1011 + --max-width: 1200px; 1012 + } 1013 + 1014 + * { 1015 + margin: 0; 1016 + padding: 0; 1017 + box-sizing: border-box; 1018 + } 1019 + 1020 + body { 1021 + font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Helvetica Neue', Arial, sans-serif; 1022 + font-size: 14px; 1023 + line-height: 1.6; 1024 + color: var(--text-primary); 1025 + background-color: var(--background); 1026 + } 1027 + 1028 + /* Header */ 1029 + .site-header { 1030 + background-color: var(--surface); 1031 + border-bottom: 1px solid var(--border-color); 1032 + padding: 0.75rem 0; 1033 + position: sticky; 1034 + top: 0; 1035 + z-index: 100; 1036 + } 1037 + 1038 + .header-content { 1039 + max-width: var(--max-width); 1040 + margin: 0 auto; 1041 + padding: 0 2rem; 1042 + display: flex; 1043 + justify-content: space-between; 1044 + align-items: center; 1045 + } 1046 + 1047 + .site-title { 1048 + font-size: 1.5rem; 1049 + font-weight: 600; 1050 + color: var(--primary-color); 1051 + margin: 0; 1052 + } 1053 + 1054 + /* Navigation */ 1055 + .site-nav { 1056 + display: flex; 1057 + gap: 1.5rem; 1058 + } 1059 + 1060 + .nav-link { 1061 + text-decoration: none; 1062 + color: var(--text-secondary); 1063 + font-weight: 500; 1064 + font-size: 0.95rem; 1065 + padding: 0.5rem 0.75rem; 1066 + border-radius: 4px; 1067 + transition: all 0.2s ease; 1068 + } 1069 + 1070 + .nav-link:hover { 1071 + color: var(--primary-color); 1072 + background-color: var(--background); 1073 + } 1074 + 1075 + .nav-link.active { 1076 + color: var(--secondary-color); 1077 + background-color: var(--background); 1078 + font-weight: 600; 1079 + } 1080 + 1081 + /* Main Content */ 1082 + .main-content { 1083 + max-width: var(--max-width); 1084 + margin: 2rem auto; 1085 + padding: 0 2rem; 1086 + } 1087 + 1088 + .page-content { 1089 + margin: 0; 1090 + } 1091 + 1092 + .page-description { 1093 + color: 
var(--text-secondary); 1094 + margin-bottom: 1.5rem; 1095 + font-style: italic; 1096 + } 1097 + 1098 + /* Sections */ 1099 + section { 1100 + margin-bottom: 2rem; 1101 + } 1102 + 1103 + h2 { 1104 + font-size: 1.3rem; 1105 + font-weight: 600; 1106 + margin-bottom: 0.75rem; 1107 + color: var(--primary-color); 1108 + } 1109 + 1110 + h3 { 1111 + font-size: 1.1rem; 1112 + font-weight: 600; 1113 + margin-bottom: 0.75rem; 1114 + color: var(--primary-color); 1115 + } 1116 + 1117 + /* Entries and Threads */ 1118 + article { 1119 + margin-bottom: 1.5rem; 1120 + padding: 1rem; 1121 + background-color: var(--surface); 1122 + border-radius: 4px; 1123 + border: 1px solid var(--border-color); 1124 + } 1125 + 1126 + /* Timeline-style entries */ 1127 + .timeline-entry { 1128 + margin-bottom: 0.5rem; 1129 + padding: 0.5rem 0.75rem; 1130 + border: none; 1131 + background: transparent; 1132 + transition: background-color 0.2s ease; 1133 + } 1134 + 1135 + .timeline-entry:hover { 1136 + background-color: var(--surface); 1137 + } 1138 + 1139 + .timeline-meta { 1140 + display: inline-flex; 1141 + gap: 0.5rem; 1142 + align-items: center; 1143 + font-size: 0.75rem; 1144 + color: var(--text-secondary); 1145 + margin-bottom: 0.25rem; 1146 + } 1147 + 1148 + .timeline-time { 1149 + font-family: 'SF Mono', Monaco, Consolas, 'Courier New', monospace; 1150 + font-size: 0.75rem; 1151 + color: var(--text-secondary); 1152 + } 1153 + 1154 + .timeline-author { 1155 + font-weight: 600; 1156 + color: var(--primary-color); 1157 + font-size: 0.8rem; 1158 + text-decoration: none; 1159 + } 1160 + 1161 + .timeline-author:hover { 1162 + color: var(--secondary-color); 1163 + text-decoration: underline; 1164 + } 1165 + 1166 + .timeline-content { 1167 + line-height: 1.4; 1168 + } 1169 + 1170 + .timeline-title { 1171 + font-size: 0.95rem; 1172 + font-weight: 600; 1173 + } 1174 + 1175 + .timeline-title a { 1176 + color: var(--primary-color); 1177 + text-decoration: none; 1178 + } 1179 + 1180 + .timeline-title 
a:hover { 1181 + color: var(--secondary-color); 1182 + text-decoration: underline; 1183 + } 1184 + 1185 + .timeline-summary { 1186 + color: var(--text-secondary); 1187 + font-size: 0.9rem; 1188 + line-height: 1.4; 1189 + } 1190 + 1191 + /* Legacy styles for other sections */ 1192 + .entry-meta, .thread-header { 1193 + display: flex; 1194 + gap: 1rem; 1195 + align-items: center; 1196 + margin-bottom: 0.5rem; 1197 + font-size: 0.85rem; 1198 + color: var(--text-secondary); 1199 + } 1200 + 1201 + .author { 1202 + font-weight: 600; 1203 + color: var(--primary-color); 1204 + } 1205 + 1206 + time { 1207 + font-size: 0.85rem; 1208 + } 1209 + 1210 + h4 { 1211 + font-size: 1.1rem; 1212 + font-weight: 600; 1213 + margin-bottom: 0.5rem; 1214 + } 1215 + 1216 + h4 a { 1217 + color: var(--primary-color); 1218 + text-decoration: none; 1219 + } 1220 + 1221 + h4 a:hover { 1222 + color: var(--secondary-color); 1223 + text-decoration: underline; 1224 + } 1225 + 1226 + .entry-summary { 1227 + color: var(--text-primary); 1228 + line-height: 1.5; 1229 + margin-top: 0.5rem; 1230 + } 1231 + 1232 + /* Enhanced Threading Styles */ 1233 + 1234 + /* Conversation Clusters */ 1235 + .conversation-cluster { 1236 + background-color: var(--background); 1237 + border: 2px solid var(--border-color); 1238 + border-radius: 8px; 1239 + margin-bottom: 2rem; 1240 + overflow: hidden; 1241 + box-shadow: 0 2px 4px rgba(0, 0, 0, 0.05); 1242 + } 1243 + 1244 + .conversation-header { 1245 + background: linear-gradient(135deg, var(--surface) 0%, #f1f3f4 100%); 1246 + padding: 0.75rem 1rem; 1247 + border-bottom: 1px solid var(--border-color); 1248 + } 1249 + 1250 + .conversation-meta { 1251 + display: flex; 1252 + justify-content: space-between; 1253 + align-items: center; 1254 + flex-wrap: wrap; 1255 + gap: 0.5rem; 1256 + } 1257 + 1258 + .conversation-count { 1259 + font-weight: 600; 1260 + color: var(--secondary-color); 1261 + font-size: 0.9rem; 1262 + } 1263 + 1264 + .conversation-participants { 1265 + 
font-size: 0.8rem; 1266 + color: var(--text-secondary); 1267 + flex: 1; 1268 + text-align: right; 1269 + } 1270 + 1271 + .conversation-flow { 1272 + padding: 0.5rem; 1273 + } 1274 + 1275 + /* Threaded Conversation Entries */ 1276 + .conversation-entry { 1277 + position: relative; 1278 + margin-bottom: 0.75rem; 1279 + display: flex; 1280 + align-items: flex-start; 1281 + } 1282 + 1283 + .conversation-entry.level-0 { 1284 + margin-left: 0; 1285 + } 1286 + 1287 + .conversation-entry.level-1 { 1288 + margin-left: 1.5rem; 1289 + } 1290 + 1291 + .conversation-entry.level-2 { 1292 + margin-left: 3rem; 1293 + } 1294 + 1295 + .conversation-entry.level-3 { 1296 + margin-left: 4.5rem; 1297 + } 1298 + 1299 + .conversation-entry.level-4 { 1300 + margin-left: 6rem; 1301 + } 1302 + 1303 + .entry-connector { 1304 + width: 3px; 1305 + background-color: var(--secondary-color); 1306 + margin-right: 0.75rem; 1307 + margin-top: 0.25rem; 1308 + min-height: 2rem; 1309 + border-radius: 2px; 1310 + opacity: 0.6; 1311 + } 1312 + 1313 + .conversation-entry.level-0 .entry-connector { 1314 + background-color: var(--accent-color); 1315 + opacity: 0.8; 1316 + } 1317 + 1318 + .entry-content { 1319 + flex: 1; 1320 + background-color: var(--surface); 1321 + padding: 0.75rem; 1322 + border-radius: 6px; 1323 + border: 1px solid var(--border-color); 1324 + transition: all 0.2s ease; 1325 + } 1326 + 1327 + .entry-content:hover { 1328 + border-color: var(--secondary-color); 1329 + box-shadow: 0 2px 8px rgba(52, 152, 219, 0.1); 1330 + } 1331 + 1332 + /* Reference Indicators */ 1333 + .reference-indicators { 1334 + display: inline-flex; 1335 + gap: 0.25rem; 1336 + margin-left: 0.5rem; 1337 + } 1338 + 1339 + .ref-out, .ref-in { 1340 + display: inline-block; 1341 + width: 1rem; 1342 + height: 1rem; 1343 + border-radius: 50%; 1344 + text-align: center; 1345 + line-height: 1rem; 1346 + font-size: 0.7rem; 1347 + font-weight: bold; 1348 + } 1349 + 1350 + .ref-out { 1351 + background-color: #e8f5e8; 1352 + 
color: #2d8f2d; 1353 + } 1354 + 1355 + .ref-in { 1356 + background-color: #e8f0ff; 1357 + color: #1f5fbf; 1358 + } 1359 + 1360 + /* Reference Badges for Individual Posts */ 1361 + .timeline-entry.with-references { 1362 + background-color: var(--surface); 1363 + } 1364 + 1365 + /* Conversation posts in unified timeline */ 1366 + .timeline-entry.conversation-post { 1367 + background: transparent; 1368 + border: none; 1369 + margin-bottom: 0.5rem; 1370 + padding: 0.5rem 0.75rem; 1371 + } 1372 + 1373 + .timeline-entry.conversation-post.level-0 { 1374 + margin-left: 0; 1375 + border-left: 2px solid var(--accent-color); 1376 + padding-left: 0.75rem; 1377 + } 1378 + 1379 + .timeline-entry.conversation-post.level-1 { 1380 + margin-left: 1.5rem; 1381 + border-left: 2px solid var(--secondary-color); 1382 + padding-left: 0.75rem; 1383 + } 1384 + 1385 + .timeline-entry.conversation-post.level-2 { 1386 + margin-left: 3rem; 1387 + border-left: 2px solid var(--text-secondary); 1388 + padding-left: 0.75rem; 1389 + } 1390 + 1391 + .timeline-entry.conversation-post.level-3 { 1392 + margin-left: 4.5rem; 1393 + border-left: 2px solid var(--text-secondary); 1394 + padding-left: 0.75rem; 1395 + } 1396 + 1397 + .timeline-entry.conversation-post.level-4 { 1398 + margin-left: 6rem; 1399 + border-left: 2px solid var(--text-secondary); 1400 + padding-left: 0.75rem; 1401 + } 1402 + 1403 + /* Cross-thread linking */ 1404 + .cross-thread-links { 1405 + margin-top: 0.5rem; 1406 + padding-top: 0.5rem; 1407 + border-top: 1px solid var(--border-color); 1408 + } 1409 + 1410 + .cross-thread-indicator { 1411 + font-size: 0.75rem; 1412 + color: var(--text-secondary); 1413 + background-color: var(--surface); 1414 + padding: 0.25rem 0.5rem; 1415 + border-radius: 12px; 1416 + border: 1px solid var(--border-color); 1417 + display: inline-block; 1418 + } 1419 + 1420 + /* Inline shared references styling */ 1421 + .inline-shared-refs { 1422 + margin-left: 0.5rem; 1423 + font-size: 0.85rem; 1424 + color: 
var(--text-secondary); 1425 + } 1426 + 1427 + .shared-ref-link { 1428 + color: var(--primary-color); 1429 + text-decoration: none; 1430 + font-weight: 500; 1431 + transition: color 0.2s ease; 1432 + } 1433 + 1434 + .shared-ref-link:hover { 1435 + color: var(--secondary-color); 1436 + text-decoration: underline; 1437 + } 1438 + 1439 + .shared-ref-more { 1440 + font-style: italic; 1441 + color: var(--text-secondary); 1442 + font-size: 0.8rem; 1443 + margin-left: 0.25rem; 1444 + } 1445 + 1446 + .user-anchor, .post-anchor { 1447 + position: absolute; 1448 + margin-top: -60px; /* Offset for fixed header */ 1449 + pointer-events: none; 1450 + } 1451 + 1452 + .cross-thread-link { 1453 + color: var(--primary-color); 1454 + text-decoration: none; 1455 + font-weight: 500; 1456 + transition: color 0.2s ease; 1457 + } 1458 + 1459 + .cross-thread-link:hover { 1460 + color: var(--secondary-color); 1461 + text-decoration: underline; 1462 + } 1463 + 1464 + .reference-badges { 1465 + display: flex; 1466 + gap: 0.25rem; 1467 + margin-left: 0.5rem; 1468 + flex-wrap: wrap; 1469 + } 1470 + 1471 + .ref-badge { 1472 + display: inline-block; 1473 + padding: 0.1rem 0.4rem; 1474 + border-radius: 12px; 1475 + font-size: 0.7rem; 1476 + font-weight: 600; 1477 + text-transform: uppercase; 1478 + letter-spacing: 0.05em; 1479 + } 1480 + 1481 + .ref-badge.ref-outbound { 1482 + background-color: #e8f5e8; 1483 + color: #2d8f2d; 1484 + border: 1px solid #c3e6c3; 1485 + } 1486 + 1487 + .ref-badge.ref-inbound { 1488 + background-color: #e8f0ff; 1489 + color: #1f5fbf; 1490 + border: 1px solid #b3d9ff; 1491 + } 1492 + 1493 + /* Author Color Coding */ 1494 + .timeline-author { 1495 + position: relative; 1496 + } 1497 + 1498 + .timeline-author::before { 1499 + content: ''; 1500 + display: inline-block; 1501 + width: 8px; 1502 + height: 8px; 1503 + border-radius: 50%; 1504 + margin-right: 0.5rem; 1505 + background-color: var(--secondary-color); 1506 + } 1507 + 1508 + /* Generate consistent colors for 
authors */ 1509 + .author-avsm::before { background-color: #e74c3c; } 1510 + .author-mort::before { background-color: #3498db; } 1511 + .author-mte::before { background-color: #2ecc71; } 1512 + .author-ryan::before { background-color: #f39c12; } 1513 + .author-mwd::before { background-color: #9b59b6; } 1514 + .author-dra::before { background-color: #1abc9c; } 1515 + .author-pf341::before { background-color: #34495e; } 1516 + .author-sadiqj::before { background-color: #e67e22; } 1517 + .author-martinkl::before { background-color: #8e44ad; } 1518 + .author-jonsterling::before { background-color: #27ae60; } 1519 + .author-jon::before { background-color: #f1c40f; } 1520 + .author-onkar::before { background-color: #e91e63; } 1521 + .author-gabriel::before { background-color: #00bcd4; } 1522 + .author-jess::before { background-color: #ff5722; } 1523 + .author-ibrahim::before { background-color: #607d8b; } 1524 + .author-andres::before { background-color: #795548; } 1525 + .author-eeg::before { background-color: #ff9800; } 1526 + 1527 + /* Section Headers */ 1528 + .conversations-section h3, 1529 + .referenced-posts-section h3, 1530 + .individual-posts-section h3 { 1531 + border-bottom: 2px solid var(--border-color); 1532 + padding-bottom: 0.5rem; 1533 + margin-bottom: 1.5rem; 1534 + position: relative; 1535 + } 1536 + 1537 + .conversations-section h3::before { 1538 + content: "💬"; 1539 + margin-right: 0.5rem; 1540 + } 1541 + 1542 + .referenced-posts-section h3::before { 1543 + content: "🔗"; 1544 + margin-right: 0.5rem; 1545 + } 1546 + 1547 + .individual-posts-section h3::before { 1548 + content: "📝"; 1549 + margin-right: 0.5rem; 1550 + } 1551 + 1552 + /* Legacy thread styles (for backward compatibility) */ 1553 + .thread { 1554 + background-color: var(--background); 1555 + border: 1px solid var(--border-color); 1556 + padding: 0; 1557 + overflow: hidden; 1558 + margin-bottom: 1rem; 1559 + } 1560 + 1561 + .thread-header { 1562 + background-color: var(--surface); 1563 + 
padding: 0.5rem 0.75rem; 1564 + border-bottom: 1px solid var(--border-color); 1565 + } 1566 + 1567 + .thread-count { 1568 + font-weight: 600; 1569 + color: var(--secondary-color); 1570 + } 1571 + 1572 + .thread-entry { 1573 + padding: 0.5rem 0.75rem; 1574 + border-bottom: 1px solid var(--border-color); 1575 + } 1576 + 1577 + .thread-entry:last-child { 1578 + border-bottom: none; 1579 + } 1580 + 1581 + .thread-entry.reply { 1582 + margin-left: var(--thread-indent); 1583 + border-left: 3px solid var(--secondary-color); 1584 + background-color: var(--surface); 1585 + } 1586 + 1587 + /* Links Section */ 1588 + .link-group { 1589 + background-color: var(--background); 1590 + } 1591 + 1592 + .link-url { 1593 + font-size: 1rem; 1594 + word-break: break-word; 1595 + } 1596 + 1597 + .link-url a { 1598 + color: var(--secondary-color); 1599 + text-decoration: none; 1600 + } 1601 + 1602 + .link-url a:hover { 1603 + text-decoration: underline; 1604 + } 1605 + 1606 + .target-user { 1607 + font-size: 0.9rem; 1608 + color: var(--text-secondary); 1609 + font-weight: normal; 1610 + } 1611 + 1612 + .referencing-entries { 1613 + margin-top: 0.75rem; 1614 + } 1615 + 1616 + .ref-count { 1617 + font-weight: 600; 1618 + color: var(--text-secondary); 1619 + font-size: 0.9rem; 1620 + } 1621 + 1622 + .referencing-entries ul { 1623 + list-style: none; 1624 + margin-top: 0.5rem; 1625 + padding-left: 1rem; 1626 + } 1627 + 1628 + .referencing-entries li { 1629 + margin-bottom: 0.25rem; 1630 + font-size: 0.9rem; 1631 + } 1632 + 1633 + .referencing-entries .more { 1634 + font-style: italic; 1635 + color: var(--text-secondary); 1636 + } 1637 + 1638 + /* Users Section */ 1639 + .user-card { 1640 + background-color: var(--background); 1641 + } 1642 + 1643 + .user-header { 1644 + display: flex; 1645 + gap: 1rem; 1646 + align-items: start; 1647 + margin-bottom: 1rem; 1648 + } 1649 + 1650 + .user-icon { 1651 + width: 48px; 1652 + height: 48px; 1653 + border-radius: 50%; 1654 + object-fit: cover; 1655 + 
} 1656 + 1657 + .user-info h3 { 1658 + margin-bottom: 0.25rem; 1659 + } 1660 + 1661 + .username { 1662 + font-size: 0.9rem; 1663 + color: var(--text-secondary); 1664 + font-weight: normal; 1665 + } 1666 + 1667 + .user-meta { 1668 + font-size: 0.9rem; 1669 + color: var(--text-secondary); 1670 + } 1671 + 1672 + .user-meta a { 1673 + color: var(--secondary-color); 1674 + text-decoration: none; 1675 + } 1676 + 1677 + .user-meta a:hover { 1678 + text-decoration: underline; 1679 + } 1680 + 1681 + .separator { 1682 + margin: 0 0.5rem; 1683 + } 1684 + 1685 + .post-count { 1686 + font-weight: 600; 1687 + } 1688 + 1689 + .user-recent h4 { 1690 + font-size: 0.95rem; 1691 + margin-bottom: 0.5rem; 1692 + color: var(--text-secondary); 1693 + } 1694 + 1695 + .user-recent ul { 1696 + list-style: none; 1697 + padding-left: 0; 1698 + } 1699 + 1700 + .user-recent li { 1701 + margin-bottom: 0.25rem; 1702 + font-size: 0.9rem; 1703 + } 1704 + 1705 + /* Footer */ 1706 + .site-footer { 1707 + max-width: var(--max-width); 1708 + margin: 3rem auto 2rem; 1709 + padding: 1rem 2rem; 1710 + text-align: center; 1711 + color: var(--text-secondary); 1712 + font-size: 0.85rem; 1713 + border-top: 1px solid var(--border-color); 1714 + } 1715 + 1716 + .site-footer a { 1717 + color: var(--secondary-color); 1718 + text-decoration: none; 1719 + } 1720 + 1721 + .site-footer a:hover { 1722 + text-decoration: underline; 1723 + } 1724 + 1725 + /* Responsive */ 1726 + @media (max-width: 768px) { 1727 + .site-title { 1728 + font-size: 1.3rem; 1729 + } 1730 + 1731 + .header-content { 1732 + flex-direction: column; 1733 + gap: 0.75rem; 1734 + align-items: flex-start; 1735 + } 1736 + 1737 + .site-nav { 1738 + gap: 1rem; 1739 + } 1740 + 1741 + .main-content { 1742 + padding: 0 1rem; 1743 + } 1744 + 1745 + .thread-entry.reply { 1746 + margin-left: calc(var(--thread-indent) / 2); 1747 + } 1748 + 1749 + .user-header { 1750 + flex-direction: column; 1751 + } 1752 + } 1753 + </file> 1754 + 1755 + <file 
path="src/thicket/templates/timeline.html"> 1756 + {% extends "base.html" %} 1757 + 1758 + {% block page_title %}Timeline - {{ title }}{% endblock %} 1759 + 1760 + {% block content %} 1761 + {% set seen_users = [] %} 1762 + <div class="page-content"> 1763 + <h2>Recent Posts & Conversations</h2> 1764 + 1765 + <section class="unified-timeline"> 1766 + {% for item in timeline_items %} 1767 + {% if item.type == "post" %} 1768 +  1769 + <article class="timeline-entry {% if item.content.references %}with-references{% endif %}"> 1770 + <div class="timeline-meta"> 1771 + <time datetime="{{ item.content.entry.updated or item.content.entry.published }}" class="timeline-time"> 1772 + {{ (item.content.entry.updated or item.content.entry.published).strftime('%Y-%m-%d %H:%M') }} 1773 + </time> 1774 + {% set homepage = get_user_homepage(item.content.username) %} 1775 + {% if item.content.username not in seen_users %} 1776 + <a id="{{ item.content.username }}" class="user-anchor"></a> 1777 + {% set _ = seen_users.append(item.content.username) %} 1778 + {% endif %} 1779 + <a id="post-{{ loop.index0 }}-{{ safe_anchor_id(item.content.entry.id) }}" class="post-anchor"></a> 1780 + {% if homepage %} 1781 + <a href="{{ homepage }}" target="_blank" class="timeline-author">{{ item.content.display_name }}</a> 1782 + {% else %} 1783 + <span class="timeline-author">{{ item.content.display_name }}</span> 1784 + {% endif %} 1785 + {% if item.content.references %} 1786 + <div class="reference-badges"> 1787 + {% for ref in item.content.references %} 1788 + {% if ref.type == 'outbound' %} 1789 + <span class="ref-badge ref-outbound" title="References {{ ref.target_username or 'external post' }}"> 1790 + → {{ ref.target_username or 'ext' }} 1791 + </span> 1792 + {% elif ref.type == 'inbound' %} 1793 + <span class="ref-badge ref-inbound" title="Referenced by {{ ref.source_username or 'external post' }}"> 1794 + ← {{ ref.source_username or 'ext' }} 1795 + </span> 1796 + {% endif %} 1797 + {% endfor %} 
1798 + </div> 1799 + {% endif %} 1800 + </div> 1801 + <div class="timeline-content"> 1802 + <strong class="timeline-title"> 1803 + <a href="{{ item.content.entry.link }}" target="_blank">{{ item.content.entry.title }}</a> 1804 + </strong> 1805 + {% if item.content.entry.summary %} 1806 + <span class="timeline-summary">— {{ clean_html_summary(item.content.entry.summary, 250) }}</span> 1807 + {% endif %} 1808 + {% if item.content.shared_references %} 1809 + <span class="inline-shared-refs"> 1810 + {% for ref in item.content.shared_references[:3] %} 1811 + {% if ref.target_username %} 1812 + <a href="#{{ ref.target_username }}" class="shared-ref-link" title="Referenced by {{ ref.count }} entries">@{{ ref.target_username }}</a>{% if not loop.last %}, {% endif %} 1813 + {% endif %} 1814 + {% endfor %} 1815 + {% if item.content.shared_references|length > 3 %} 1816 + <span class="shared-ref-more">+{{ item.content.shared_references|length - 3 }} more</span> 1817 + {% endif %} 1818 + </span> 1819 + {% endif %} 1820 + {% if item.content.cross_thread_links %} 1821 + <div class="cross-thread-links"> 1822 + <span class="cross-thread-indicator">🔗 Also appears: </span> 1823 + {% for link in item.content.cross_thread_links %} 1824 + <a href="#{{ link.anchor_id }}" class="cross-thread-link" title="{{ link.title }}">{{ link.context }}</a>{% if not loop.last %}, {% endif %} 1825 + {% endfor %} 1826 + </div> 1827 + {% endif %} 1828 + </div> 1829 + </article> 1830 + 1831 + {% elif item.type == "thread" %} 1832 +  1833 + {% set outer_loop_index = loop.index0 %} 1834 + {% for thread_item in item.content %} 1835 + <article class="timeline-entry conversation-post level-{{ thread_item.thread_level }}"> 1836 + <div class="timeline-meta"> 1837 + <time datetime="{{ thread_item.entry.updated or thread_item.entry.published }}" class="timeline-time"> 1838 + {{ (thread_item.entry.updated or thread_item.entry.published).strftime('%Y-%m-%d %H:%M') }} 1839 + </time> 1840 + {% set homepage = 
get_user_homepage(thread_item.username) %} 1841 + {% if thread_item.username not in seen_users %} 1842 + <a id="{{ thread_item.username }}" class="user-anchor"></a> 1843 + {% set _ = seen_users.append(thread_item.username) %} 1844 + {% endif %} 1845 + <a id="post-{{ outer_loop_index }}-{{ loop.index0 }}-{{ safe_anchor_id(thread_item.entry.id) }}" class="post-anchor"></a> 1846 + {% if homepage %} 1847 + <a href="{{ homepage }}" target="_blank" class="timeline-author author-{{ thread_item.username }}">{{ thread_item.display_name }}</a> 1848 + {% else %} 1849 + <span class="timeline-author author-{{ thread_item.username }}">{{ thread_item.display_name }}</span> 1850 + {% endif %} 1851 + {% if thread_item.references_to or thread_item.referenced_by %} 1852 + <span class="reference-indicators"> 1853 + {% if thread_item.references_to %} 1854 + <span class="ref-out" title="References other posts">→</span> 1855 + {% endif %} 1856 + {% if thread_item.referenced_by %} 1857 + <span class="ref-in" title="Referenced by other posts">←</span> 1858 + {% endif %} 1859 + </span> 1860 + {% endif %} 1861 + </div> 1862 + <div class="timeline-content"> 1863 + <strong class="timeline-title"> 1864 + <a href="{{ thread_item.entry.link }}" target="_blank">{{ thread_item.entry.title }}</a> 1865 + </strong> 1866 + {% if thread_item.entry.summary %} 1867 + <span class="timeline-summary">— {{ clean_html_summary(thread_item.entry.summary, 300) }}</span> 1868 + {% endif %} 1869 + {% if thread_item.shared_references %} 1870 + <span class="inline-shared-refs"> 1871 + {% for ref in thread_item.shared_references[:3] %} 1872 + {% if ref.target_username %} 1873 + <a href="#{{ ref.target_username }}" class="shared-ref-link" title="Referenced by {{ ref.count }} entries">@{{ ref.target_username }}</a>{% if not loop.last %}, {% endif %} 1874 + {% endif %} 1875 + {% endfor %} 1876 + {% if thread_item.shared_references|length > 3 %} 1877 + <span class="shared-ref-more">+{{ thread_item.shared_references|length 
- 3 }} more</span> 1878 + {% endif %} 1879 + </span> 1880 + {% endif %} 1881 + {% if thread_item.cross_thread_links %} 1882 + <div class="cross-thread-links"> 1883 + <span class="cross-thread-indicator">🔗 Also appears: </span> 1884 + {% for link in thread_item.cross_thread_links %} 1885 + <a href="#{{ link.anchor_id }}" class="cross-thread-link" title="{{ link.title }}">{{ link.context }}</a>{% if not loop.last %}, {% endif %} 1886 + {% endfor %} 1887 + </div> 1888 + {% endif %} 1889 + </div> 1890 + </article> 1891 + {% endfor %} 1892 + {% endif %} 1893 + {% endfor %} 1894 + </section> 1895 + </div> 1896 + {% endblock %} 1897 + </file> 1898 + 1899 + <file path="src/thicket/templates/users.html"> 1900 + {% extends "base.html" %} 1901 + 1902 + {% block page_title %}Users - {{ title }}{% endblock %} 1903 + 1904 + {% block content %} 1905 + <div class="page-content"> 1906 + <h2>Users</h2> 1907 + <p class="page-description">All users contributing to this thicket, ordered by post count.</p> 1908 + 1909 + {% for user_info in users %} 1910 + <article class="user-card"> 1911 + <div class="user-header"> 1912 + {% if user_info.metadata.icon and user_info.metadata.icon != "None" %} 1913 + <img src="{{ user_info.metadata.icon }}" alt="{{ user_info.metadata.username }}" class="user-icon"> 1914 + {% endif %} 1915 + <div class="user-info"> 1916 + <h3> 1917 + {% if user_info.metadata.display_name %} 1918 + {{ user_info.metadata.display_name }} 1919 + <span class="username">({{ user_info.metadata.username }})</span> 1920 + {% else %} 1921 + {{ user_info.metadata.username }} 1922 + {% endif %} 1923 + </h3> 1924 + <div class="user-meta"> 1925 + {% if user_info.metadata.homepage %} 1926 + <a href="{{ user_info.metadata.homepage }}" target="_blank">{{ user_info.metadata.homepage }}</a> 1927 + {% endif %} 1928 + {% if user_info.metadata.email %} 1929 + <span class="separator">•</span> 1930 + <a href="mailto:{{ user_info.metadata.email }}">{{ user_info.metadata.email }}</a> 1931 + {% 
endif %} 1932 + <span class="separator">•</span> 1933 + <span class="post-count">{{ user_info.metadata.entry_count }} posts</span> 1934 + </div> 1935 + </div> 1936 + </div> 1937 + 1938 + {% if user_info.recent_entries %} 1939 + <div class="user-recent"> 1940 + <h4>Recent posts:</h4> 1941 + <ul> 1942 + {% for display_name, entry in user_info.recent_entries %} 1943 + <li> 1944 + <a href="{{ entry.link }}" target="_blank">{{ entry.title }}</a> 1945 + <time datetime="{{ entry.updated or entry.published }}"> 1946 + ({{ (entry.updated or entry.published).strftime('%Y-%m-%d') }}) 1947 + </time> 1948 + </li> 1949 + {% endfor %} 1950 + </ul> 1951 + </div> 1952 + {% endif %} 1953 + </article> 1954 + {% endfor %} 1955 + </div> 1956 + {% endblock %} 1957 + </file> 1958 + 1959 + <file path="README.md"> 1960 + # Thicket 1961 + 1962 + A modern CLI tool for persisting Atom/RSS feeds in Git repositories, designed to enable distributed weblog comment structures. 1963 + 1964 + ## Features 1965 + 1966 + - **Feed Auto-Discovery**: Automatically extracts user metadata from Atom/RSS feeds 1967 + - **Git Storage**: Stores feed entries in a Git repository with full history 1968 + - **Duplicate Management**: Manual curation of duplicate entries across feeds 1969 + - **Modern CLI**: Built with Typer and Rich for beautiful terminal output 1970 + - **Comprehensive Parsing**: Supports RSS 0.9x, RSS 1.0, RSS 2.0, and Atom feeds 1971 + - **Cron-Friendly**: Designed for scheduled execution 1972 + 1973 + ## Installation 1974 + 1975 + ```bash 1976 + # Install from source 1977 + pip install -e . 1978 + 1979 + # Or install with dev dependencies 1980 + pip install -e .[dev] 1981 + ``` 1982 + 1983 + ## Quick Start 1984 + 1985 + 1. **Initialize a new thicket repository:** 1986 + ```bash 1987 + thicket init ./my-feeds 1988 + ``` 1989 + 1990 + 2. **Add a user with their feed:** 1991 + ```bash 1992 + thicket add user "alice" --feed "https://alice.example.com/feed.xml" 1993 + ``` 1994 + 1995 + 3.
**Sync feeds to download entries:** 1996 + ```bash 1997 + thicket sync --all 1998 + ``` 1999 + 2000 + 4. **List users and feeds:** 2001 + ```bash 2002 + thicket list users 2003 + thicket list feeds 2004 + thicket list entries 2005 + ``` 2006 + 2007 + ## Commands 2008 + 2009 + ### Initialize 2010 + ```bash 2011 + thicket init <git-store-path> [--cache-dir <path>] [--config <config-file>] 2012 + ``` 2013 + 2014 + ### Add Users and Feeds 2015 + ```bash 2016 + # Add user with auto-discovery 2017 + thicket add user "username" --feed "https://example.com/feed.xml" 2018 + 2019 + # Add user with manual metadata 2020 + thicket add user "username" \ 2021 + --feed "https://example.com/feed.xml" \ 2022 + --email "user@example.com" \ 2023 + --homepage "https://example.com" \ 2024 + --display-name "User Name" 2025 + 2026 + # Add additional feed to existing user 2027 + thicket add feed "username" "https://example.com/other-feed.xml" 2028 + ``` 2029 + 2030 + ### Sync Feeds 2031 + ```bash 2032 + # Sync all users 2033 + thicket sync --all 2034 + 2035 + # Sync specific user 2036 + thicket sync --user "username" 2037 + 2038 + # Dry run (preview changes) 2039 + thicket sync --all --dry-run 2040 + ``` 2041 + 2042 + ### List Information 2043 + ```bash 2044 + # List all users 2045 + thicket list users 2046 + 2047 + # List all feeds 2048 + thicket list feeds 2049 + 2050 + # List feeds for specific user 2051 + thicket list feeds --user "username" 2052 + 2053 + # List recent entries 2054 + thicket list entries --limit 20 2055 + 2056 + # List entries for specific user 2057 + thicket list entries --user "username" 2058 + ``` 2059 + 2060 + ### Manage Duplicates 2061 + ```bash 2062 + # List duplicate mappings 2063 + thicket duplicates list 2064 + 2065 + # Mark entries as duplicates 2066 + thicket duplicates add "https://example.com/dup" "https://example.com/canonical" 2067 + 2068 + # Remove duplicate mapping 2069 + thicket duplicates remove "https://example.com/dup" 2070 + ``` 2071 + 2072 + ## 
Configuration 2073 + 2074 + Thicket uses a YAML configuration file (default: `thicket.yaml`): 2075 + 2076 + ```yaml 2077 + git_store: ./feeds-repo 2078 + cache_dir: ~/.cache/thicket 2079 + users: 2080 + - username: alice 2081 + feeds: 2082 + - https://alice.example.com/feed.xml 2083 + email: alice@example.com 2084 + homepage: https://alice.example.com 2085 + display_name: Alice 2086 + ``` 2087 + 2088 + ## Git Repository Structure 2089 + 2090 + ``` 2091 + feeds-repo/ 2092 + ├── index.json # User directory index 2093 + ├── duplicates.json # Duplicate entry mappings 2094 + ├── alice/ 2095 + │ ├── metadata.json # User metadata 2096 + │ ├── entry_id_1.json # Feed entries 2097 + │ └── entry_id_2.json 2098 + └── bob/ 2099 + └── ... 2100 + ``` 2101 + 2102 + ## Development 2103 + 2104 + ### Setup 2105 + ```bash 2106 + # Install in development mode 2107 + pip install -e .[dev] 2108 + 2109 + # Run tests 2110 + pytest 2111 + 2112 + # Run linting 2113 + ruff check src/ 2114 + black --check src/ 2115 + 2116 + # Run type checking 2117 + mypy src/ 2118 + ``` 2119 + 2120 + ### Architecture 2121 + 2122 + - **CLI**: Modern interface with Typer and Rich 2123 + - **Feed Processing**: Universal parsing with feedparser 2124 + - **Git Storage**: Structured storage with GitPython 2125 + - **Data Models**: Pydantic for validation and serialization 2126 + - **Async HTTP**: httpx for efficient feed fetching 2127 + 2128 + ## Use Cases 2129 + 2130 + - **Blog Aggregation**: Collect and archive blog posts from multiple sources 2131 + - **Comment Networks**: Enable distributed commenting systems 2132 + - **Feed Archival**: Preserve feed history beyond typical feed depth limits 2133 + - **Content Curation**: Manage and deduplicate content across feeds 2134 + 2135 + ## License 2136 + 2137 + MIT License - see LICENSE file for details. 
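
## Example: Reading the Store Layout

A minimal sketch of how a third-party script might walk the layout shown under "Git Repository Structure". It assumes only what that section states: per-user directories containing `metadata.json` plus one JSON file per entry. `list_store_entries` is a hypothetical helper written for illustration and is not part of Thicket's API.

```python
from pathlib import Path


def list_store_entries(store: Path) -> dict[str, list[str]]:
    """Map each user directory in the store to its sorted entry file names.

    Assumes the README layout: <store>/<username>/<entry_id>.json, with an
    optional metadata.json per user that is not itself a feed entry.
    """
    result: dict[str, list[str]] = {}
    for user_dir in sorted(p for p in store.iterdir() if p.is_dir()):
        if user_dir.name.startswith("."):
            continue  # skip .git and other hidden directories
        result[user_dir.name] = sorted(
            f.name for f in user_dir.glob("*.json") if f.name != "metadata.json"
        )
    return result
```

Calling `list_store_entries(Path("./feeds-repo"))` would return a dictionary keyed by username. Top-level files such as `index.json` and `duplicates.json` are ignored because only directories are visited.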
2138 + </file> 2139 + 2140 + <file path="src/thicket/cli/commands/index_cmd.py"> 2141 + """CLI command for building reference index from blog entries.""" 2142 + 2143 + import json 2144 + from pathlib import Path 2145 + from typing import Optional 2146 + 2147 + import typer 2148 + from rich.console import Console 2149 + from rich.progress import ( 2150 + BarColumn, 2151 + Progress, 2152 + SpinnerColumn, 2153 + TaskProgressColumn, 2154 + TextColumn, 2155 + ) 2156 + from rich.table import Table 2157 + 2158 + from ...core.git_store import GitStore 2159 + from ...core.reference_parser import ReferenceIndex, ReferenceParser 2160 + from ..main import app 2161 + from ..utils import get_tsv_mode, load_config 2162 + 2163 + console = Console() 2164 + 2165 + 2166 + @app.command() 2167 + def index( 2168 + config_file: Optional[Path] = typer.Option( 2169 + None, 2170 + "--config", 2171 + "-c", 2172 + help="Path to configuration file", 2173 + ), 2174 + output_file: Optional[Path] = typer.Option( 2175 + None, 2176 + "--output", 2177 + "-o", 2178 + help="Path to output index file (default: updates links.json in git store)", 2179 + ), 2180 + verbose: bool = typer.Option( 2181 + False, 2182 + "--verbose", 2183 + "-v", 2184 + help="Show detailed progress information", 2185 + ), 2186 + ) -> None: 2187 + """Build a reference index showing which blog entries reference others. 2188 + 2189 + This command analyzes all blog entries to detect cross-references between 2190 + different blogs, creating an index that can be used to build threaded 2191 + views of related content. 2192 + 2193 + Updates the unified links.json file with reference data. 
2194 + """ 2195 + try: 2196 + # Load configuration 2197 + config = load_config(config_file) 2198 + 2199 + # Initialize Git store 2200 + git_store = GitStore(config.git_store) 2201 + 2202 + # Initialize reference parser 2203 + parser = ReferenceParser() 2204 + 2205 + # Build user domain mapping 2206 + if verbose: 2207 + console.print("Building user domain mapping...") 2208 + user_domains = parser.build_user_domain_mapping(git_store) 2209 + 2210 + if verbose: 2211 + console.print(f"Found {len(user_domains)} users with {sum(len(d) for d in user_domains.values())} total domains") 2212 + 2213 + # Initialize reference index 2214 + ref_index = ReferenceIndex() 2215 + ref_index.user_domains = user_domains 2216 + 2217 + # Get all users 2218 + index = git_store._load_index() 2219 + users = list(index.users.keys()) 2220 + 2221 + if not users: 2222 + console.print("[yellow]No users found in Git store[/yellow]") 2223 + return # typer.Exit(0) is an Exception subclass; the broad except below would catch it and exit 1 2224 + 2225 + # Process all entries 2226 + total_entries = 0 2227 + total_references = 0 2228 + all_references = [] 2229 + 2230 + with Progress( 2231 + SpinnerColumn(), 2232 + TextColumn("[progress.description]{task.description}"), 2233 + BarColumn(), 2234 + TaskProgressColumn(), 2235 + console=console, 2236 + ) as progress: 2237 + 2238 + # Count total entries first 2239 + counting_task = progress.add_task("Counting entries...", total=len(users)) 2240 + entry_counts = {} 2241 + for username in users: 2242 + entries = git_store.list_entries(username) 2243 + entry_counts[username] = len(entries) 2244 + total_entries += len(entries) 2245 + progress.advance(counting_task) 2246 + 2247 + progress.remove_task(counting_task) 2248 + 2249 + # Process entries - extract references 2250 + processing_task = progress.add_task( 2251 + f"Extracting references from {total_entries} entries...", 2252 + total=total_entries 2253 + ) 2254 + 2255 + for username in users: 2256 + entries = git_store.list_entries(username) 2257 + 2258 + for entry in entries: 2259 + #
Extract references from this entry 2260 + references = parser.extract_references(entry, username, user_domains) 2261 + all_references.extend(references) 2262 + 2263 + progress.advance(processing_task) 2264 + 2265 + if verbose and references: 2266 + console.print(f" Found {len(references)} references in {username}:{entry.title[:50]}...") 2267 + 2268 + progress.remove_task(processing_task) 2269 + 2270 + # Resolve target_entry_ids for references 2271 + if all_references: 2272 + resolve_task = progress.add_task( 2273 + f"Resolving {len(all_references)} references...", 2274 + total=len(all_references) 2275 + ) 2276 + 2277 + if verbose: 2278 + console.print(f"Resolving target entry IDs for {len(all_references)} references...") 2279 + 2280 + resolved_references = parser.resolve_target_entry_ids(all_references, git_store) 2281 + 2282 + # Count resolved references 2283 + resolved_count = sum(1 for ref in resolved_references if ref.target_entry_id is not None) 2284 + if verbose: 2285 + console.print(f"Resolved {resolved_count} out of {len(all_references)} references") 2286 + 2287 + # Add resolved references to index 2288 + for ref in resolved_references: 2289 + ref_index.add_reference(ref) 2290 + total_references += 1 2291 + progress.advance(resolve_task) 2292 + 2293 + progress.remove_task(resolve_task) 2294 + 2295 + # Determine output path 2296 + if output_file: 2297 + output_path = output_file 2298 + else: 2299 + output_path = config.git_store / "links.json" 2300 + 2301 + # Load existing links data or create new structure 2302 + if output_path.exists() and not output_file: 2303 + # Load existing unified structure 2304 + with open(output_path) as f: 2305 + existing_data = json.load(f) 2306 + else: 2307 + # Create new structure 2308 + existing_data = { 2309 + "links": {}, 2310 + "reverse_mapping": {}, 2311 + "user_domains": {} 2312 + } 2313 + 2314 + # Update with reference data 2315 + existing_data["references"] = ref_index.to_dict()["references"] 2316 + 
existing_data["user_domains"] = {k: list(v) for k, v in user_domains.items()} 2317 + 2318 + # Save updated structure 2319 + with open(output_path, "w") as f: 2320 + json.dump(existing_data, f, indent=2, default=str) 2321 + 2322 + # Show summary 2323 + if not get_tsv_mode(): 2324 + console.print("\n[green]✓ Reference index built successfully[/green]") 2325 + 2326 + # Create summary table or TSV output 2327 + if get_tsv_mode(): 2328 + print("Metric\tCount") 2329 + print(f"Total Users\t{len(users)}") 2330 + print(f"Total Entries\t{total_entries}") 2331 + print(f"Total References\t{total_references}") 2332 + print(f"Outbound Refs\t{len(ref_index.outbound_refs)}") 2333 + print(f"Inbound Refs\t{len(ref_index.inbound_refs)}") 2334 + print(f"Output File\t{output_path}") 2335 + else: 2336 + table = Table(title="Reference Index Summary") 2337 + table.add_column("Metric", style="cyan") 2338 + table.add_column("Count", style="green") 2339 + 2340 + table.add_row("Total Users", str(len(users))) 2341 + table.add_row("Total Entries", str(total_entries)) 2342 + table.add_row("Total References", str(total_references)) 2343 + table.add_row("Outbound Refs", str(len(ref_index.outbound_refs))) 2344 + table.add_row("Inbound Refs", str(len(ref_index.inbound_refs))) 2345 + table.add_row("Output File", str(output_path)) 2346 + 2347 + console.print(table) 2348 + 2349 + # Show some interesting statistics 2350 + if total_references > 0: 2351 + if not get_tsv_mode(): 2352 + console.print("\n[bold]Reference Statistics:[/bold]") 2353 + 2354 + # Most referenced users 2355 + target_counts = {} 2356 + unresolved_domains = set() 2357 + 2358 + for ref in ref_index.references: 2359 + if ref.target_username: 2360 + target_counts[ref.target_username] = target_counts.get(ref.target_username, 0) + 1 2361 + else: 2362 + # Track unresolved domains 2363 + from urllib.parse import urlparse 2364 + domain = urlparse(ref.target_url).netloc.lower() 2365 + unresolved_domains.add(domain) 2366 + 2367 + if 
target_counts: 2368 + if get_tsv_mode(): 2369 + print("Referenced User\tReference Count") 2370 + for username, count in sorted(target_counts.items(), key=lambda x: x[1], reverse=True)[:5]: 2371 + print(f"{username}\t{count}") 2372 + else: 2373 + console.print("\nMost referenced users:") 2374 + for username, count in sorted(target_counts.items(), key=lambda x: x[1], reverse=True)[:5]: 2375 + console.print(f" {username}: {count} references") 2376 + 2377 + if unresolved_domains and verbose: 2378 + if get_tsv_mode(): 2379 + print("Unresolved Domain\tCount") 2380 + for domain in sorted(unresolved_domains)[:10]: # sort before slicing so the ten shown are deterministic 2381 + print(f"{domain}\t1") 2382 + if len(unresolved_domains) > 10: 2383 + print(f"... and {len(unresolved_domains) - 10} more\t...") 2384 + else: 2385 + console.print(f"\nUnresolved domains: {len(unresolved_domains)}") 2386 + for domain in sorted(unresolved_domains)[:10]: 2387 + console.print(f" {domain}") 2388 + if len(unresolved_domains) > 10: 2389 + console.print(f" ... 
and {len(unresolved_domains) - 10} more") 2390 + 2391 + except Exception as e: 2392 + console.print(f"[red]Error building reference index: {e}[/red]") 2393 + if verbose: 2394 + console.print_exception() 2395 + raise typer.Exit(1) 2396 + 2397 + 2398 + @app.command() 2399 + def threads( 2400 + config_file: Optional[Path] = typer.Option( 2401 + None, 2402 + "--config", 2403 + "-c", 2404 + help="Path to configuration file", 2405 + ), 2406 + index_file: Optional[Path] = typer.Option( 2407 + None, 2408 + "--index", 2409 + "-i", 2410 + help="Path to reference index file (default: links.json in git store)", 2411 + ), 2412 + username: Optional[str] = typer.Option( 2413 + None, 2414 + "--username", 2415 + "-u", 2416 + help="Show threads for specific username only", 2417 + ), 2418 + entry_id: Optional[str] = typer.Option( 2419 + None, 2420 + "--entry", 2421 + "-e", 2422 + help="Show thread for specific entry ID", 2423 + ), 2424 + min_size: int = typer.Option( 2425 + 2, 2426 + "--min-size", 2427 + "-m", 2428 + help="Minimum thread size to display", 2429 + ), 2430 + ) -> None: 2431 + """Show threaded view of related blog entries. 2432 + 2433 + This command uses the reference index to show which blog entries 2434 + are connected through cross-references, creating an email-style 2435 + threaded view of the conversation. 2436 + 2437 + Reads reference data from the unified links.json file. 
2438 + """ 2439 + try: 2440 + # Load configuration 2441 + config = load_config(config_file) 2442 + 2443 + # Determine index file path 2444 + if index_file: 2445 + index_path = index_file 2446 + else: 2447 + index_path = config.git_store / "links.json" 2448 + 2449 + if not index_path.exists(): 2450 + console.print(f"[red]Links file not found: {index_path}[/red]") 2451 + console.print("Run 'thicket links' and 'thicket index' first to build the reference index") 2452 + raise typer.Exit(1) 2453 + 2454 + # Load unified data 2455 + with open(index_path) as f: 2456 + unified_data = json.load(f) 2457 + 2458 + # Check if references exist in the unified structure 2459 + if "references" not in unified_data: 2460 + console.print(f"[red]No references found in {index_path}[/red]") 2461 + console.print("Run 'thicket index' first to build the reference index") 2462 + raise typer.Exit(1) 2463 + 2464 + # Extract reference data and reconstruct ReferenceIndex 2465 + ref_index = ReferenceIndex.from_dict({ 2466 + "references": unified_data["references"], 2467 + "user_domains": unified_data.get("user_domains", {}) 2468 + }) 2469 + 2470 + # Initialize Git store to get entry details 2471 + git_store = GitStore(config.git_store) 2472 + 2473 + if entry_id and username: 2474 + # Show specific thread 2475 + thread_members = ref_index.get_thread_members(username, entry_id) 2476 + _display_thread(thread_members, ref_index, git_store, f"Thread for {username}:{entry_id}") 2477 + 2478 + elif username: 2479 + # Show all threads involving this user 2480 + user_index = git_store._load_index() 2481 + user = user_index.get_user(username) 2482 + if not user: 2483 + console.print(f"[red]User not found: {username}[/red]") 2484 + raise typer.Exit(1) 2485 + 2486 + entries = git_store.list_entries(username) 2487 + threads_found = set() 2488 + 2489 + console.print(f"[bold]Threads involving {username}:[/bold]\n") 2490 + 2491 + for entry in entries: 2492 + thread_members = ref_index.get_thread_members(username, 
entry.id) 2493 + if len(thread_members) >= min_size: 2494 + thread_key = tuple(sorted(thread_members)) 2495 + if thread_key not in threads_found: 2496 + threads_found.add(thread_key) 2497 + _display_thread(thread_members, ref_index, git_store, f"Thread #{len(threads_found)}") 2498 + 2499 + else: 2500 + # Show all threads 2501 + console.print("[bold]All conversation threads:[/bold]\n") 2502 + 2503 + all_threads = set() 2504 + processed_entries = set() 2505 + 2506 + # Get all entries 2507 + user_index = git_store._load_index() 2508 + for username in user_index.users.keys(): 2509 + entries = git_store.list_entries(username) 2510 + for entry in entries: 2511 + entry_key = (username, entry.id) 2512 + if entry_key in processed_entries: 2513 + continue 2514 + 2515 + thread_members = ref_index.get_thread_members(username, entry.id) 2516 + if len(thread_members) >= min_size: 2517 + thread_key = tuple(sorted(thread_members)) 2518 + if thread_key not in all_threads: 2519 + all_threads.add(thread_key) 2520 + _display_thread(thread_members, ref_index, git_store, f"Thread #{len(all_threads)}") 2521 + 2522 + # Mark all members as processed 2523 + for member in thread_members: 2524 + processed_entries.add(member) 2525 + 2526 + if not all_threads: 2527 + console.print("[yellow]No conversation threads found[/yellow]") 2528 + console.print(f"(minimum thread size: {min_size})") 2529 + 2530 + except Exception as e: 2531 + console.print(f"[red]Error showing threads: {e}[/red]") 2532 + raise typer.Exit(1) 2533 + 2534 + 2535 + def _display_thread(thread_members, ref_index, git_store, title): 2536 + """Display a single conversation thread.""" 2537 + console.print(f"[bold cyan]{title}[/bold cyan]") 2538 + console.print(f"Thread size: {len(thread_members)} entries") 2539 + 2540 + # Get entry details for each member 2541 + thread_entries = [] 2542 + for username, entry_id in thread_members: 2543 + entry = git_store.get_entry(username, entry_id) 2544 + if entry: 2545 + 
thread_entries.append((username, entry)) 2546 + 2547 + # Sort by publication date 2548 + thread_entries.sort(key=lambda x: x[1].published or x[1].updated) 2549 + 2550 + # Display entries 2551 + for i, (username, entry) in enumerate(thread_entries): 2552 + prefix = "├─" if i < len(thread_entries) - 1 else "└─" 2553 + 2554 + # Get references for this entry 2555 + outbound = ref_index.get_outbound_refs(username, entry.id) 2556 + inbound = ref_index.get_inbound_refs(username, entry.id) 2557 + 2558 + ref_info = "" 2559 + if outbound or inbound: 2560 + ref_info = f" ({len(outbound)} out, {len(inbound)} in)" 2561 + 2562 + console.print(f" {prefix} [{username}] {entry.title[:60]}...{ref_info}") 2563 + 2564 + if entry.published: 2565 + console.print(f" Published: {entry.published.strftime('%Y-%m-%d')}") 2566 + 2567 + console.print() # Empty line after each thread 2568 + </file> 2569 + 2570 + <file path="src/thicket/cli/commands/info_cmd.py"> 2571 + """CLI command for displaying detailed information about a specific atom entry.""" 2572 + 2573 + import json 2574 + from pathlib import Path 2575 + from typing import Optional 2576 + 2577 + import typer 2578 + from rich.console import Console 2579 + from rich.panel import Panel 2580 + from rich.table import Table 2581 + from rich.text import Text 2582 + 2583 + from ...core.git_store import GitStore 2584 + from ...core.reference_parser import ReferenceIndex 2585 + from ..main import app 2586 + from ..utils import load_config, get_tsv_mode 2587 + 2588 + console = Console() 2589 + 2590 + 2591 + @app.command() 2592 + def info( 2593 + identifier: str = typer.Argument( 2594 + ..., 2595 + help="The atom ID or URL of the entry to display information about" 2596 + ), 2597 + username: Optional[str] = typer.Option( 2598 + None, 2599 + "--username", 2600 + "-u", 2601 + help="Username to search for the entry (if not provided, searches all users)" 2602 + ), 2603 + config_file: Optional[Path] = typer.Option( 2604 + Path("thicket.yaml"), 2605 + 
"--config", 2606 + "-c", 2607 + help="Path to configuration file", 2608 + ), 2609 + show_content: bool = typer.Option( 2610 + False, 2611 + "--content", 2612 + help="Include the full content of the entry in the output" 2613 + ), 2614 + ) -> None: 2615 + """Display detailed information about a specific atom entry. 2616 + 2617 + You can specify the entry using either its atom ID or URL. 2618 + Shows all metadata for the given entry, including title, dates, categories, 2619 + and summarizes all inbound and outbound links to/from other posts. 2620 + """ 2621 + try: 2622 + # Load configuration 2623 + config = load_config(config_file) 2624 + 2625 + # Initialize Git store 2626 + git_store = GitStore(config.git_store) 2627 + 2628 + # Find the entry 2629 + entry = None 2630 + found_username = None 2631 + 2632 + # Check if identifier looks like a URL 2633 + is_url = identifier.startswith(('http://', 'https://')) 2634 + 2635 + if username: 2636 + # Search specific username 2637 + if is_url: 2638 + # Search by URL 2639 + entries = git_store.list_entries(username) 2640 + for e in entries: 2641 + if str(e.link) == identifier: 2642 + entry = e 2643 + found_username = username 2644 + break 2645 + else: 2646 + # Search by atom ID 2647 + entry = git_store.get_entry(username, identifier) 2648 + if entry: 2649 + found_username = username 2650 + else: 2651 + # Search all users 2652 + index = git_store._load_index() 2653 + for user in index.users.keys(): 2654 + if is_url: 2655 + # Search by URL 2656 + entries = git_store.list_entries(user) 2657 + for e in entries: 2658 + if str(e.link) == identifier: 2659 + entry = e 2660 + found_username = user 2661 + break 2662 + if entry: 2663 + break 2664 + else: 2665 + # Search by atom ID 2666 + entry = git_store.get_entry(user, identifier) 2667 + if entry: 2668 + found_username = user 2669 + break 2670 + 2671 + if not entry or not found_username: 2672 + if username: 2673 + console.print(f"[red]Entry with {'URL' if is_url else 'atom ID'} 
'{identifier}' not found for user '{username}'[/red]") 2674 + else: 2675 + console.print(f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found in any user's entries[/red]") 2676 + raise typer.Exit(1) 2677 + 2678 + # Load reference index if available 2679 + links_path = config.git_store / "links.json" 2680 + ref_index = None 2681 + if links_path.exists(): 2682 + with open(links_path) as f: 2683 + unified_data = json.load(f) 2684 + 2685 + # Check if references exist in the unified structure 2686 + if "references" in unified_data: 2687 + ref_index = ReferenceIndex.from_dict({ 2688 + "references": unified_data["references"], 2689 + "user_domains": unified_data.get("user_domains", {}) 2690 + }) 2691 + 2692 + # Display information 2693 + if get_tsv_mode(): 2694 + _display_entry_info_tsv(entry, found_username, ref_index, show_content) 2695 + else: 2696 + _display_entry_info(entry, found_username) 2697 + 2698 + if ref_index: 2699 + _display_link_info(entry, found_username, ref_index) 2700 + else: 2701 + console.print("\n[yellow]No reference index found. 
Run 'thicket links' and 'thicket index' to build cross-reference data.[/yellow]") 2702 + 2703 + # Optionally display content 2704 + if show_content and entry.content: 2705 + _display_content(entry.content) 2706 + 2707 + except Exception as e: 2708 + console.print(f"[red]Error displaying entry info: {e}[/red]") 2709 + raise typer.Exit(1) 2710 + 2711 + 2712 + def _display_entry_info(entry, username: str) -> None: 2713 + """Display basic entry information in a structured format.""" 2714 + 2715 + # Create main info panel 2716 + info_table = Table.grid(padding=(0, 2)) 2717 + info_table.add_column("Field", style="cyan bold", width=15) 2718 + info_table.add_column("Value", style="white") 2719 + 2720 + info_table.add_row("User", f"[green]{username}[/green]") 2721 + info_table.add_row("Atom ID", f"[blue]{entry.id}[/blue]") 2722 + info_table.add_row("Title", entry.title) 2723 + info_table.add_row("Link", str(entry.link)) 2724 + 2725 + if entry.published: 2726 + info_table.add_row("Published", entry.published.strftime("%Y-%m-%d %H:%M:%S UTC")) 2727 + 2728 + info_table.add_row("Updated", entry.updated.strftime("%Y-%m-%d %H:%M:%S UTC")) 2729 + 2730 + if entry.summary: 2731 + # Truncate long summaries 2732 + summary = entry.summary[:200] + "..." 
if len(entry.summary) > 200 else entry.summary 2733 + info_table.add_row("Summary", summary) 2734 + 2735 + if entry.categories: 2736 + categories_text = ", ".join(entry.categories) 2737 + info_table.add_row("Categories", categories_text) 2738 + 2739 + if entry.author: 2740 + author_info = [] 2741 + if "name" in entry.author: 2742 + author_info.append(entry.author["name"]) 2743 + if "email" in entry.author: 2744 + author_info.append(f"<{entry.author['email']}>") 2745 + if author_info: 2746 + info_table.add_row("Author", " ".join(author_info)) 2747 + 2748 + if entry.content_type: 2749 + info_table.add_row("Content Type", entry.content_type) 2750 + 2751 + if entry.rights: 2752 + info_table.add_row("Rights", entry.rights) 2753 + 2754 + if entry.source: 2755 + info_table.add_row("Source Feed", entry.source) 2756 + 2757 + panel = Panel( 2758 + info_table, 2759 + title=f"[bold]Entry Information[/bold]", 2760 + border_style="blue" 2761 + ) 2762 + 2763 + console.print(panel) 2764 + 2765 + 2766 + def _display_link_info(entry, username: str, ref_index: ReferenceIndex) -> None: 2767 + """Display inbound and outbound link information.""" 2768 + 2769 + # Get links 2770 + outbound_refs = ref_index.get_outbound_refs(username, entry.id) 2771 + inbound_refs = ref_index.get_inbound_refs(username, entry.id) 2772 + 2773 + if not outbound_refs and not inbound_refs: 2774 + console.print("\n[dim]No cross-references found for this entry.[/dim]") 2775 + return 2776 + 2777 + # Create links table 2778 + links_table = Table(title="Cross-References") 2779 + links_table.add_column("Direction", style="cyan", width=10) 2780 + links_table.add_column("Target/Source", style="green", width=20) 2781 + links_table.add_column("URL", style="blue", width=50) 2782 + 2783 + # Add outbound references 2784 + for ref in outbound_refs: 2785 + target_info = f"{ref.target_username}:{ref.target_entry_id}" if ref.target_username and ref.target_entry_id else "External" 2786 + links_table.add_row("→ Out", target_info, 
ref.target_url) 2787 + 2788 + # Add inbound references 2789 + for ref in inbound_refs: 2790 + source_info = f"{ref.source_username}:{ref.source_entry_id}" 2791 + links_table.add_row("← In", source_info, ref.target_url) 2792 + 2793 + console.print() 2794 + console.print(links_table) 2795 + 2796 + # Summary 2797 + console.print(f"\n[bold]Summary:[/bold] {len(outbound_refs)} outbound, {len(inbound_refs)} inbound references") 2798 + 2799 + 2800 + def _display_content(content: str) -> None: 2801 + """Display the full content of the entry.""" 2802 + 2803 + # Truncate very long content 2804 + display_content = content 2805 + if len(content) > 5000: 2806 + display_content = content[:5000] + "\n\n[... content truncated ...]" 2807 + 2808 + panel = Panel( 2809 + display_content, 2810 + title="[bold]Entry Content[/bold]", 2811 + border_style="green", 2812 + expand=False 2813 + ) 2814 + 2815 + console.print() 2816 + console.print(panel) 2817 + 2818 + 2819 + def _display_entry_info_tsv(entry, username: str, ref_index: Optional[ReferenceIndex], show_content: bool) -> None: 2820 + """Display entry information in TSV format.""" 2821 + 2822 + # Basic info 2823 + print("Field\tValue") 2824 + print(f"User\t{username}") 2825 + print(f"Atom ID\t{entry.id}") 2826 + print(f"Title\t{entry.title.replace(chr(9), ' ').replace(chr(10), ' ').replace(chr(13), ' ')}") 2827 + print(f"Link\t{entry.link}") 2828 + 2829 + if entry.published: 2830 + print(f"Published\t{entry.published.strftime('%Y-%m-%d %H:%M:%S UTC')}") 2831 + 2832 + print(f"Updated\t{entry.updated.strftime('%Y-%m-%d %H:%M:%S UTC')}") 2833 + 2834 + if entry.summary: 2835 + # Escape tabs and newlines in summary 2836 + summary = entry.summary.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ') 2837 + print(f"Summary\t{summary}") 2838 + 2839 + if entry.categories: 2840 + print(f"Categories\t{', '.join(entry.categories)}") 2841 + 2842 + if entry.author: 2843 + author_info = [] 2844 + if "name" in entry.author: 2845 + 
author_info.append(entry.author["name"]) 2846 + if "email" in entry.author: 2847 + author_info.append(f"<{entry.author['email']}>") 2848 + if author_info: 2849 + print(f"Author\t{' '.join(author_info)}") 2850 + 2851 + if entry.content_type: 2852 + print(f"Content Type\t{entry.content_type}") 2853 + 2854 + if entry.rights: 2855 + print(f"Rights\t{entry.rights}") 2856 + 2857 + if entry.source: 2858 + print(f"Source Feed\t{entry.source}") 2859 + 2860 + # Add reference info if available 2861 + if ref_index: 2862 + outbound_refs = ref_index.get_outbound_refs(username, entry.id) 2863 + inbound_refs = ref_index.get_inbound_refs(username, entry.id) 2864 + 2865 + print(f"Outbound References\t{len(outbound_refs)}") 2866 + print(f"Inbound References\t{len(inbound_refs)}") 2867 + 2868 + # Show each reference 2869 + for ref in outbound_refs: 2870 + target_info = f"{ref.target_username}:{ref.target_entry_id}" if ref.target_username and ref.target_entry_id else "External" 2871 + print(f"Outbound Reference\t{target_info}\t{ref.target_url}") 2872 + 2873 + for ref in inbound_refs: 2874 + source_info = f"{ref.source_username}:{ref.source_entry_id}" 2875 + print(f"Inbound Reference\t{source_info}\t{ref.target_url}") 2876 + 2877 + # Show content if requested 2878 + if show_content and entry.content: 2879 + # Escape tabs and newlines in content 2880 + content = entry.content.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ') 2881 + print(f"Content\t{content}") 2882 + </file> 2883 + 2884 + <file path="src/thicket/cli/commands/init.py"> 2885 + """Initialize command for thicket.""" 2886 + 2887 + from pathlib import Path 2888 + from typing import Optional 2889 + 2890 + import typer 2891 + from pydantic import ValidationError 2892 + 2893 + from ...core.git_store import GitStore 2894 + from ...models import ThicketConfig 2895 + from ..main import app 2896 + from ..utils import print_error, print_success, save_config 2897 + 2898 + 2899 + @app.command() 2900 + def init( 2901 + git_store: 
Path = typer.Argument(..., help="Path to Git repository for storing feeds"), 2902 + cache_dir: Optional[Path] = typer.Option( 2903 + None, "--cache-dir", "-c", help="Cache directory (default: ~/.cache/thicket)" 2904 + ), 2905 + config_file: Optional[Path] = typer.Option( 2906 + None, "--config", help="Configuration file path (default: thicket.yaml)" 2907 + ), 2908 + force: bool = typer.Option( 2909 + False, "--force", "-f", help="Overwrite existing configuration" 2910 + ), 2911 + ) -> None: 2912 + """Initialize a new thicket configuration and Git store.""" 2913 + 2914 + # Set default paths 2915 + if cache_dir is None: 2916 + from platformdirs import user_cache_dir 2917 + cache_dir = Path(user_cache_dir("thicket")) 2918 + 2919 + if config_file is None: 2920 + config_file = Path("thicket.yaml") 2921 + 2922 + # Check if config already exists 2923 + if config_file.exists() and not force: 2924 + print_error(f"Configuration file already exists: {config_file}") 2925 + print_error("Use --force to overwrite") 2926 + raise typer.Exit(1) 2927 + 2928 + # Create cache directory 2929 + cache_dir.mkdir(parents=True, exist_ok=True) 2930 + 2931 + # Create Git store 2932 + try: 2933 + GitStore(git_store) 2934 + print_success(f"Initialized Git store at: {git_store}") 2935 + except Exception as e: 2936 + print_error(f"Failed to initialize Git store: {e}") 2937 + raise typer.Exit(1) from e 2938 + 2939 + # Create configuration 2940 + try: 2941 + config = ThicketConfig( 2942 + git_store=git_store, 2943 + cache_dir=cache_dir, 2944 + users=[] 2945 + ) 2946 + 2947 + save_config(config, config_file) 2948 + print_success(f"Created configuration file: {config_file}") 2949 + 2950 + except ValidationError as e: 2951 + print_error(f"Invalid configuration: {e}") 2952 + raise typer.Exit(1) from e 2953 + except Exception as e: 2954 + print_error(f"Failed to create configuration: {e}") 2955 + raise typer.Exit(1) from e 2956 + 2957 + print_success("Thicket initialized successfully!") 2958 + 
print_success(f"Git store: {git_store}") 2959 + print_success(f"Cache directory: {cache_dir}") 2960 + print_success(f"Configuration: {config_file}") 2961 + print_success("Run 'thicket add user' to add your first user and feed.") 2962 + </file> 2963 + 2964 + <file path="src/thicket/cli/__init__.py"> 2965 + """CLI interface for thicket.""" 2966 + 2967 + from .main import app 2968 + 2969 + __all__ = ["app"] 2970 + </file> 2971 + 2972 + <file path="src/thicket/core/__init__.py"> 2973 + """Core business logic for thicket.""" 2974 + 2975 + from .feed_parser import FeedParser 2976 + from .git_store import GitStore 2977 + 2978 + __all__ = ["FeedParser", "GitStore"] 2979 + </file> 2980 + 2981 + <file path="src/thicket/core/feed_parser.py"> 2982 + """Feed parsing and normalization with auto-discovery.""" 2983 + 2984 + from datetime import datetime 2985 + from typing import Optional 2986 + from urllib.parse import urlparse 2987 + 2988 + import bleach 2989 + import feedparser 2990 + import httpx 2991 + from pydantic import HttpUrl, ValidationError 2992 + 2993 + from ..models import AtomEntry, FeedMetadata 2994 + 2995 + 2996 + class FeedParser: 2997 + """Parser for RSS/Atom feeds with normalization and auto-discovery.""" 2998 + 2999 + def __init__(self, user_agent: str = "thicket/0.1.0"): 3000 + """Initialize the feed parser.""" 3001 + self.user_agent = user_agent 3002 + self.allowed_tags = [ 3003 + "a", "abbr", "acronym", "b", "blockquote", "br", "code", "em", 3004 + "i", "li", "ol", "p", "pre", "strong", "ul", "h1", "h2", "h3", 3005 + "h4", "h5", "h6", "img", "div", "span", 3006 + ] 3007 + self.allowed_attributes = { 3008 + "a": ["href", "title"], 3009 + "abbr": ["title"], 3010 + "acronym": ["title"], 3011 + "img": ["src", "alt", "title", "width", "height"], 3012 + "blockquote": ["cite"], 3013 + } 3014 + 3015 + async def fetch_feed(self, url: HttpUrl) -> str: 3016 + """Fetch feed content from URL.""" 3017 + async with httpx.AsyncClient() as client: 3018 + response = await 
client.get( 3019 + str(url), 3020 + headers={"User-Agent": self.user_agent}, 3021 + timeout=30.0, 3022 + follow_redirects=True, 3023 + ) 3024 + response.raise_for_status() 3025 + return response.text 3026 + 3027 + def parse_feed(self, content: str, source_url: Optional[HttpUrl] = None) -> tuple[FeedMetadata, list[AtomEntry]]: 3028 + """Parse feed content and return metadata and entries.""" 3029 + parsed = feedparser.parse(content) 3030 + 3031 + if parsed.bozo and parsed.bozo_exception: 3032 + # Try to continue with potentially malformed feed 3033 + pass 3034 + 3035 + # Extract feed metadata 3036 + feed_meta = self._extract_feed_metadata(parsed.feed) 3037 + 3038 + # Extract and normalize entries 3039 + entries = [] 3040 + for entry in parsed.entries: 3041 + try: 3042 + atom_entry = self._normalize_entry(entry, source_url) 3043 + entries.append(atom_entry) 3044 + except Exception as e: 3045 + # Log error but continue processing other entries 3046 + print(f"Error processing entry {getattr(entry, 'id', 'unknown')}: {e}") 3047 + continue 3048 + 3049 + return feed_meta, entries 3050 + 3051 + def _extract_feed_metadata(self, feed: feedparser.FeedParserDict) -> FeedMetadata: 3052 + """Extract metadata from feed for auto-discovery.""" 3053 + # Parse author information 3054 + author_name = None 3055 + author_email = None 3056 + author_uri = None 3057 + 3058 + if hasattr(feed, 'author_detail'): 3059 + author_name = feed.author_detail.get('name') 3060 + author_email = feed.author_detail.get('email') 3061 + author_uri = feed.author_detail.get('href') 3062 + elif hasattr(feed, 'author'): 3063 + author_name = feed.author 3064 + 3065 + # Parse managing editor for RSS feeds 3066 + if not author_email and hasattr(feed, 'managingEditor'): 3067 + author_email = feed.managingEditor 3068 + 3069 + # Parse feed link 3070 + feed_link = None 3071 + if hasattr(feed, 'link'): 3072 + try: 3073 + feed_link = HttpUrl(feed.link) 3074 + except ValidationError: 3075 + pass 3076 + 3077 + # Parse 
image/icon/logo 3078 + logo = None 3079 + icon = None 3080 + image_url = None 3081 + 3082 + if hasattr(feed, 'image'): 3083 + try: 3084 + image_url = HttpUrl(feed.image.get('href', feed.image.get('url', ''))) 3085 + except (ValidationError, AttributeError): 3086 + pass 3087 + 3088 + if hasattr(feed, 'icon'): 3089 + try: 3090 + icon = HttpUrl(feed.icon) 3091 + except ValidationError: 3092 + pass 3093 + 3094 + if hasattr(feed, 'logo'): 3095 + try: 3096 + logo = HttpUrl(feed.logo) 3097 + except ValidationError: 3098 + pass 3099 + 3100 + return FeedMetadata( 3101 + title=getattr(feed, 'title', None), 3102 + author_name=author_name, 3103 + author_email=author_email, 3104 + author_uri=HttpUrl(author_uri) if author_uri else None, 3105 + link=feed_link, 3106 + logo=logo, 3107 + icon=icon, 3108 + image_url=image_url, 3109 + description=getattr(feed, 'description', None), 3110 + ) 3111 + 3112 + def _normalize_entry(self, entry: feedparser.FeedParserDict, source_url: Optional[HttpUrl] = None) -> AtomEntry: 3113 + """Normalize an entry to Atom format.""" 3114 + # Parse timestamps 3115 + updated = self._parse_timestamp(entry.get('updated_parsed') or entry.get('published_parsed')) 3116 + published = self._parse_timestamp(entry.get('published_parsed')) 3117 + 3118 + # Parse content 3119 + content = self._extract_content(entry) 3120 + content_type = self._extract_content_type(entry) 3121 + 3122 + # Parse author 3123 + author = self._extract_author(entry) 3124 + 3125 + # Parse categories/tags 3126 + categories = [] 3127 + if hasattr(entry, 'tags'): 3128 + categories = [tag.get('term', '') for tag in entry.tags if tag.get('term')] 3129 + 3130 + # Sanitize HTML content 3131 + if content: 3132 + content = self._sanitize_html(content) 3133 + 3134 + summary = entry.get('summary', '') 3135 + if summary: 3136 + summary = self._sanitize_html(summary) 3137 + 3138 + return AtomEntry( 3139 + id=entry.get('id', entry.get('link', '')), 3140 + title=entry.get('title', ''), 3141 + 
link=HttpUrl(entry.get('link', '')), 3142 + updated=updated, 3143 + published=published, 3144 + summary=summary or None, 3145 + content=content or None, 3146 + content_type=content_type, 3147 + author=author, 3148 + categories=categories, 3149 + rights=entry.get('rights', None), 3150 + source=str(source_url) if source_url else None, 3151 + ) 3152 + 3153 + def _parse_timestamp(self, time_struct) -> datetime: 3154 + """Parse feedparser time struct to datetime.""" 3155 + if time_struct: 3156 + return datetime(*time_struct[:6]) 3157 + return datetime.now() 3158 + 3159 + def _extract_content(self, entry: feedparser.FeedParserDict) -> Optional[str]: 3160 + """Extract the best content from an entry.""" 3161 + # Prefer content over summary 3162 + if hasattr(entry, 'content') and entry.content: 3163 + # Return the first HTML or plain-text item, in feed order 3164 + for content_item in entry.content: 3165 + if content_item.get('type') in ['text/html', 'html']: 3166 + return content_item.get('value', '') 3167 + elif content_item.get('type') in ['text/plain', 'text']: 3168 + return content_item.get('value', '') 3169 + # Fallback to first content item 3170 + return entry.content[0].get('value', '') 3171 + 3172 + # Fallback to summary 3173 + return entry.get('summary', '') 3174 + 3175 + def _extract_content_type(self, entry: feedparser.FeedParserDict) -> str: 3176 + """Extract content type from entry.""" 3177 + if hasattr(entry, 'content') and entry.content: 3178 + content_type = entry.content[0].get('type', 'html') 3179 + # Normalize content type 3180 + if content_type in ['text/html', 'html']: 3181 + return 'html' 3182 + elif content_type in ['text/plain', 'text']: 3183 + return 'text' 3184 + elif content_type == 'xhtml': 3185 + return 'xhtml' 3186 + return 'html' 3187 + 3188 + def _extract_author(self, entry: feedparser.FeedParserDict) -> Optional[dict]: 3189 + """Extract author information from entry.""" 3190 + author = {} 3191 + 3192 + if hasattr(entry, 'author_detail'):
3193 + author.update({ 3194 + 'name': entry.author_detail.get('name'), 3195 + 'email': entry.author_detail.get('email'), 3196 + 'uri': entry.author_detail.get('href'), 3197 + }) 3198 + elif hasattr(entry, 'author'): 3199 + author['name'] = entry.author 3200 + 3201 + return author if author else None 3202 + 3203 + def _sanitize_html(self, html: str) -> str: 3204 + """Sanitize HTML content to prevent XSS.""" 3205 + return bleach.clean( 3206 + html, 3207 + tags=self.allowed_tags, 3208 + attributes=self.allowed_attributes, 3209 + strip=True, 3210 + ) 3211 + 3212 + def sanitize_entry_id(self, entry_id: str) -> str: 3213 + """Sanitize entry ID to be a safe filename.""" 3214 + # Parse URL to get meaningful parts 3215 + parsed = urlparse(entry_id) 3216 + 3217 + # Start with the path component 3218 + if parsed.path: 3219 + # Remove leading slash and replace problematic characters 3220 + safe_id = parsed.path.lstrip('/').replace('/', '_').replace('\\', '_') 3221 + else: 3222 + # Use the entire ID as fallback 3223 + safe_id = entry_id 3224 + 3225 + # Replace problematic characters 3226 + safe_chars = [] 3227 + for char in safe_id: 3228 + if char.isalnum() or char in '-_.': 3229 + safe_chars.append(char) 3230 + else: 3231 + safe_chars.append('_') 3232 + 3233 + safe_id = ''.join(safe_chars) 3234 + 3235 + # Ensure it's not too long (max 200 chars) 3236 + if len(safe_id) > 200: 3237 + safe_id = safe_id[:200] 3238 + 3239 + # Ensure it's not empty 3240 + if not safe_id: 3241 + safe_id = "entry" 3242 + 3243 + return safe_id 3244 + </file> 3245 + 3246 + <file path="src/thicket/core/reference_parser.py"> 3247 + """Reference detection and parsing for blog entries.""" 3248 + 3249 + import re 3250 + from typing import Optional 3251 + from urllib.parse import urlparse 3252 + 3253 + from ..models import AtomEntry 3254 + 3255 + 3256 + class BlogReference: 3257 + """Represents a reference from one blog entry to another.""" 3258 + 3259 + def __init__( 3260 + self, 3261 + source_entry_id: str, 
3262 + source_username: str, 3263 + target_url: str, 3264 + target_username: Optional[str] = None, 3265 + target_entry_id: Optional[str] = None, 3266 + ): 3267 + self.source_entry_id = source_entry_id 3268 + self.source_username = source_username 3269 + self.target_url = target_url 3270 + self.target_username = target_username 3271 + self.target_entry_id = target_entry_id 3272 + 3273 + def to_dict(self) -> dict: 3274 + """Convert to dictionary for JSON serialization.""" 3275 + result = { 3276 + "source_entry_id": self.source_entry_id, 3277 + "source_username": self.source_username, 3278 + "target_url": self.target_url, 3279 + } 3280 + 3281 + # Only include optional fields if they are not None 3282 + if self.target_username is not None: 3283 + result["target_username"] = self.target_username 3284 + if self.target_entry_id is not None: 3285 + result["target_entry_id"] = self.target_entry_id 3286 + 3287 + return result 3288 + 3289 + @classmethod 3290 + def from_dict(cls, data: dict) -> "BlogReference": 3291 + """Create from dictionary.""" 3292 + return cls( 3293 + source_entry_id=data["source_entry_id"], 3294 + source_username=data["source_username"], 3295 + target_url=data["target_url"], 3296 + target_username=data.get("target_username"), 3297 + target_entry_id=data.get("target_entry_id"), 3298 + ) 3299 + 3300 + 3301 + class ReferenceIndex: 3302 + """Index of blog-to-blog references for creating threaded views.""" 3303 + 3304 + def __init__(self): 3305 + self.references: list[BlogReference] = [] 3306 + self.outbound_refs: dict[ 3307 + str, list[BlogReference] 3308 + ] = {} # entry_id -> outbound refs 3309 + self.inbound_refs: dict[ 3310 + str, list[BlogReference] 3311 + ] = {} # entry_id -> inbound refs 3312 + self.user_domains: dict[str, set[str]] = {} # username -> set of domains 3313 + 3314 + def add_reference(self, ref: BlogReference) -> None: 3315 + """Add a reference to the index.""" 3316 + self.references.append(ref) 3317 + 3318 + # Update outbound references 
3319 + source_key = f"{ref.source_username}:{ref.source_entry_id}" 3320 + if source_key not in self.outbound_refs: 3321 + self.outbound_refs[source_key] = [] 3322 + self.outbound_refs[source_key].append(ref) 3323 + 3324 + # Update inbound references if we can identify the target 3325 + if ref.target_username and ref.target_entry_id: 3326 + target_key = f"{ref.target_username}:{ref.target_entry_id}" 3327 + if target_key not in self.inbound_refs: 3328 + self.inbound_refs[target_key] = [] 3329 + self.inbound_refs[target_key].append(ref) 3330 + 3331 + def get_outbound_refs(self, username: str, entry_id: str) -> list[BlogReference]: 3332 + """Get all outbound references from an entry.""" 3333 + key = f"{username}:{entry_id}" 3334 + return self.outbound_refs.get(key, []) 3335 + 3336 + def get_inbound_refs(self, username: str, entry_id: str) -> list[BlogReference]: 3337 + """Get all inbound references to an entry.""" 3338 + key = f"{username}:{entry_id}" 3339 + return self.inbound_refs.get(key, []) 3340 + 3341 + def get_thread_members(self, username: str, entry_id: str) -> set[tuple[str, str]]: 3342 + """Get all entries that are part of the same thread.""" 3343 + visited = set() 3344 + to_visit = [(username, entry_id)] 3345 + thread_members = set() 3346 + 3347 + while to_visit: 3348 + current_user, current_entry = to_visit.pop() 3349 + if (current_user, current_entry) in visited: 3350 + continue 3351 + 3352 + visited.add((current_user, current_entry)) 3353 + thread_members.add((current_user, current_entry)) 3354 + 3355 + # Add outbound references 3356 + for ref in self.get_outbound_refs(current_user, current_entry): 3357 + if ref.target_username and ref.target_entry_id: 3358 + to_visit.append((ref.target_username, ref.target_entry_id)) 3359 + 3360 + # Add inbound references 3361 + for ref in self.get_inbound_refs(current_user, current_entry): 3362 + to_visit.append((ref.source_username, ref.source_entry_id)) 3363 + 3364 + return thread_members 3365 + 3366 + def 
to_dict(self) -> dict: 3367 + """Convert to dictionary for JSON serialization.""" 3368 + return { 3369 + "references": [ref.to_dict() for ref in self.references], 3370 + "user_domains": {k: list(v) for k, v in self.user_domains.items()}, 3371 + } 3372 + 3373 + @classmethod 3374 + def from_dict(cls, data: dict) -> "ReferenceIndex": 3375 + """Create from dictionary.""" 3376 + index = cls() 3377 + for ref_data in data.get("references", []): 3378 + ref = BlogReference.from_dict(ref_data) 3379 + index.add_reference(ref) 3380 + 3381 + for username, domains in data.get("user_domains", {}).items(): 3382 + index.user_domains[username] = set(domains) 3383 + 3384 + return index 3385 + 3386 + 3387 + class ReferenceParser: 3388 + """Parses blog entries to detect references to other blogs.""" 3389 + 3390 + def __init__(self): 3391 + # Common blog platforms and patterns 3392 + self.blog_patterns = [ 3393 + r"https?://[^/]+\.(?:org|com|net|io|dev|me|co\.uk)/.*", # Common blog domains 3394 + r"https?://[^/]+\.github\.io/.*", # GitHub Pages 3395 + r"https?://[^/]+\.substack\.com/.*", # Substack 3396 + r"https?://medium\.com/.*", # Medium 3397 + r"https?://[^/]+\.wordpress\.com/.*", # WordPress.com 3398 + r"https?://[^/]+\.blogspot\.com/.*", # Blogger 3399 + ] 3400 + 3401 + # Compile regex patterns 3402 + self.link_pattern = re.compile( 3403 + r'<a[^>]+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL 3404 + ) 3405 + self.url_pattern = re.compile(r'https?://[^\s<>"]+') 3406 + 3407 + def extract_links_from_html(self, html_content: str) -> list[tuple[str, str]]: 3408 + """Extract all links from HTML content.""" 3409 + links = [] 3410 + 3411 + # Extract links from <a> tags 3412 + for match in self.link_pattern.finditer(html_content): 3413 + url = match.group(1) 3414 + text = re.sub( 3415 + r"<[^>]+>", "", match.group(2) 3416 + ).strip() # Remove HTML tags from link text 3417 + links.append((url, text)) 3418 + 3419 + return links 3420 + 3421 + def is_blog_url(self, url: str) -> 
bool: 3422 + """Check if a URL likely points to a blog post.""" 3423 + for pattern in self.blog_patterns: 3424 + if re.match(pattern, url): 3425 + return True 3426 + return False 3427 + 3428 + def _is_likely_blog_post_url(self, url: str) -> bool: 3429 + """Check if a same-domain URL likely points to a blog post (not CSS, images, etc.).""" 3430 + parsed_url = urlparse(url) 3431 + path = parsed_url.path.lower() 3432 + 3433 + # Skip obvious non-blog content 3434 + if any(path.endswith(ext) for ext in ['.css', '.js', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.ico', '.pdf', '.xml', '.json']): 3435 + return False 3436 + 3437 + # Skip common non-blog paths 3438 + if any(segment in path for segment in ['/static/', '/assets/', '/css/', '/js/', '/images/', '/img/', '/media/', '/uploads/']): 3439 + return False 3440 + 3441 + # Skip links with an empty or root path (e.g. same-page anchors or the site root) 3442 + if not path or path == '/': 3443 + return False 3444 + 3445 + # Look for positive indicators of blog posts 3446 + # Common blog post patterns: dates, slugs, post indicators 3447 + blog_indicators = [ 3448 + r'/\d{4}/', # Year in path 3449 + r'/\d{4}/\d{2}/', # Year/month in path 3450 + r'/blog/', 3451 + r'/post/', 3452 + r'/posts/', 3453 + r'/articles?/', 3454 + r'/notes?/', 3455 + r'/entries/', 3456 + r'/writing/', 3457 + ] 3458 + 3459 + for pattern in blog_indicators: 3460 + if re.search(pattern, path): 3461 + return True 3462 + 3463 + # If it has a reasonable path depth and doesn't match exclusions, likely a blog post 3464 + path_segments = [seg for seg in path.split('/') if seg] 3465 + return len(path_segments) >= 1 # At least one meaningful path segment 3466 + 3467 + def resolve_target_user( 3468 + self, url: str, user_domains: dict[str, set[str]] 3469 + ) -> Optional[str]: 3470 + """Try to resolve a URL to a known user based on domain mapping.""" 3471 + parsed_url = urlparse(url) 3472 + domain = parsed_url.netloc.lower() 3473 + 3474 + for username, domains in user_domains.items(): 3475 + if
domain in domains: 3476 + return username 3477 + 3478 + return None 3479 + 3480 + def extract_references( 3481 + self, entry: AtomEntry, username: str, user_domains: dict[str, set[str]] 3482 + ) -> list[BlogReference]: 3483 + """Extract all blog references from an entry.""" 3484 + references = [] 3485 + 3486 + # Combine all text content for analysis 3487 + content_to_search = [] 3488 + if entry.content: 3489 + content_to_search.append(entry.content) 3490 + if entry.summary: 3491 + content_to_search.append(entry.summary) 3492 + 3493 + for content in content_to_search: 3494 + links = self.extract_links_from_html(content) 3495 + 3496 + for url, _link_text in links: 3497 + entry_domain = ( 3498 + urlparse(str(entry.link)).netloc.lower() if entry.link else "" 3499 + ) 3500 + link_domain = urlparse(url).netloc.lower() 3501 + 3502 + # Check if this looks like a blog URL 3503 + if not self.is_blog_url(url): 3504 + continue 3505 + 3506 + # For same-domain links, apply additional filtering to avoid non-blog content 3507 + if link_domain == entry_domain: 3508 + # Only include same-domain links that look like blog posts 3509 + if not self._is_likely_blog_post_url(url): 3510 + continue 3511 + 3512 + # Try to resolve to a known user 3513 + if link_domain == entry_domain: 3514 + # Same domain - target user is the same as source user 3515 + target_username: Optional[str] = username 3516 + else: 3517 + # Different domain - try to resolve 3518 + target_username = self.resolve_target_user(url, user_domains) 3519 + 3520 + ref = BlogReference( 3521 + source_entry_id=entry.id, 3522 + source_username=username, 3523 + target_url=url, 3524 + target_username=target_username, 3525 + target_entry_id=None, # Will be resolved later if possible 3526 + ) 3527 + 3528 + references.append(ref) 3529 + 3530 + return references 3531 + 3532 + def build_user_domain_mapping(self, git_store: "GitStore") -> dict[str, set[str]]: 3533 + """Build mapping of usernames to their known domains.""" 3534 + 
user_domains = {} 3535 + index = git_store._load_index() 3536 + 3537 + for username, user_metadata in index.users.items(): 3538 + domains = set() 3539 + 3540 + # Add domains from feeds 3541 + for feed_url in user_metadata.feeds: 3542 + domain = urlparse(feed_url).netloc.lower() 3543 + if domain: 3544 + domains.add(domain) 3545 + 3546 + # Add domain from homepage 3547 + if user_metadata.homepage: 3548 + domain = urlparse(str(user_metadata.homepage)).netloc.lower() 3549 + if domain: 3550 + domains.add(domain) 3551 + 3552 + user_domains[username] = domains 3553 + 3554 + return user_domains 3555 + 3556 + def _build_url_to_entry_mapping(self, git_store: "GitStore") -> dict[str, str]: 3557 + """Build a comprehensive mapping from URLs to entry IDs using git store data. 3558 + 3559 + This creates a one-way URL -> entry ID mapping that handles: 3560 + - Entry link URLs -> Entry IDs 3561 + - URL variations (with/without www, http/https) 3562 + - Multiple URLs pointing to the same entry 3563 + """ 3564 + url_to_entry: dict[str, str] = {} 3565 + 3566 + # Load index to get all users 3567 + index = git_store._load_index() 3568 + 3569 + for username in index.users.keys(): 3570 + entries = git_store.list_entries(username) 3571 + 3572 + for entry in entries: 3573 + if entry.link: 3574 + link_url = str(entry.link) 3575 + entry_id = entry.id 3576 + 3577 + # Map the canonical link URL 3578 + url_to_entry[link_url] = entry_id 3579 + 3580 + # Handle common URL variations 3581 + parsed = urlparse(link_url) 3582 + if parsed.netloc and parsed.path: 3583 + # Add version without www 3584 + if parsed.netloc.startswith('www.'): 3585 + no_www_url = f"{parsed.scheme}://{parsed.netloc[4:]}{parsed.path}" 3586 + if parsed.query: 3587 + no_www_url += f"?{parsed.query}" 3588 + if parsed.fragment: 3589 + no_www_url += f"#{parsed.fragment}" 3590 + url_to_entry[no_www_url] = entry_id 3591 + 3592 + # Add version with www if not present 3593 + elif not parsed.netloc.startswith('www.'): 3594 + www_url =
f"{parsed.scheme}://www.{parsed.netloc}{parsed.path}" 3595 + if parsed.query: 3596 + www_url += f"?{parsed.query}" 3597 + if parsed.fragment: 3598 + www_url += f"#{parsed.fragment}" 3599 + url_to_entry[www_url] = entry_id 3600 + 3601 + # Add http/https variations 3602 + if parsed.scheme == 'https': 3603 + http_url = link_url.replace('https://', 'http://', 1) 3604 + url_to_entry[http_url] = entry_id 3605 + elif parsed.scheme == 'http': 3606 + https_url = link_url.replace('http://', 'https://', 1) 3607 + url_to_entry[https_url] = entry_id 3608 + 3609 + return url_to_entry 3610 + 3611 + def _normalize_url(self, url: str) -> str: 3612 + """Normalize URL for consistent matching. 3613 + 3614 + Handles common variations like trailing slashes, fragments, etc. 3615 + """ 3616 + parsed = urlparse(url) 3617 + 3618 + # Remove trailing slash from path 3619 + path = parsed.path.rstrip('/') if parsed.path != '/' else parsed.path 3620 + 3621 + # Reconstruct without fragment for consistent matching 3622 + normalized = f"{parsed.scheme}://{parsed.netloc}{path}" 3623 + if parsed.query: 3624 + normalized += f"?{parsed.query}" 3625 + 3626 + return normalized 3627 + 3628 + def resolve_target_entry_ids( 3629 + self, references: list[BlogReference], git_store: "GitStore" 3630 + ) -> list[BlogReference]: 3631 + """Resolve target_entry_id for references using comprehensive URL mapping.""" 3632 + resolved_refs = [] 3633 + 3634 + # Build comprehensive URL to entry ID mapping 3635 + url_to_entry = self._build_url_to_entry_mapping(git_store) 3636 + 3637 + for ref in references: 3638 + # If we already have a target_entry_id, keep the reference as-is 3639 + if ref.target_entry_id is not None: 3640 + resolved_refs.append(ref) 3641 + continue 3642 + 3643 + # If we don't have a target_username, we can't resolve it 3644 + if ref.target_username is None: 3645 + resolved_refs.append(ref) 3646 + continue 3647 + 3648 + # Try to resolve using URL mapping 3649 + resolved_entry_id = None 3650 + 3651 + # 
First, try exact match 3652 + if ref.target_url in url_to_entry: 3653 + resolved_entry_id = url_to_entry[ref.target_url] 3654 + else: 3655 + # Try normalized URL matching 3656 + normalized_target = self._normalize_url(ref.target_url) 3657 + if normalized_target in url_to_entry: 3658 + resolved_entry_id = url_to_entry[normalized_target] 3659 + else: 3660 + # Try URL variations 3661 + for mapped_url, entry_id in url_to_entry.items(): 3662 + if self._normalize_url(mapped_url) == normalized_target: 3663 + resolved_entry_id = entry_id 3664 + break 3665 + 3666 + # Verify the resolved entry belongs to the target username 3667 + if resolved_entry_id: 3668 + # Double-check by loading the actual entry 3669 + entries = git_store.list_entries(ref.target_username) 3670 + entry_found = any(entry.id == resolved_entry_id for entry in entries) 3671 + if not entry_found: 3672 + resolved_entry_id = None 3673 + 3674 + # Create a new reference with the resolved target_entry_id 3675 + resolved_ref = BlogReference( 3676 + source_entry_id=ref.source_entry_id, 3677 + source_username=ref.source_username, 3678 + target_url=ref.target_url, 3679 + target_username=ref.target_username, 3680 + target_entry_id=resolved_entry_id, 3681 + ) 3682 + resolved_refs.append(resolved_ref) 3683 + 3684 + return resolved_refs 3685 + </file> 3686 + 3687 + <file path="src/thicket/models/__init__.py"> 3688 + """Data models for thicket.""" 3689 + 3690 + from .config import ThicketConfig, UserConfig 3691 + from .feed import AtomEntry, DuplicateMap, FeedMetadata 3692 + from .user import GitStoreIndex, UserMetadata 3693 + 3694 + __all__ = [ 3695 + "ThicketConfig", 3696 + "UserConfig", 3697 + "AtomEntry", 3698 + "DuplicateMap", 3699 + "FeedMetadata", 3700 + "GitStoreIndex", 3701 + "UserMetadata", 3702 + ] 3703 + </file> 3704 + 3705 + <file path="src/thicket/models/feed.py"> 3706 + """Feed and entry models for thicket.""" 3707 + 3708 + from datetime import datetime 3709 + from typing import TYPE_CHECKING, Optional 3710 
+ 3711 + from pydantic import BaseModel, ConfigDict, EmailStr, HttpUrl 3712 + 3713 + if TYPE_CHECKING: 3714 + from .config import UserConfig 3715 + 3716 + 3717 + class AtomEntry(BaseModel): 3718 + """Represents an Atom feed entry stored in the Git repository.""" 3719 + 3720 + model_config = ConfigDict( 3721 + json_encoders={datetime: lambda v: v.isoformat()}, 3722 + str_strip_whitespace=True, 3723 + ) 3724 + 3725 + id: str # Original Atom ID 3726 + title: str 3727 + link: HttpUrl 3728 + updated: datetime 3729 + published: Optional[datetime] = None 3730 + summary: Optional[str] = None 3731 + content: Optional[str] = None # Full body content from Atom entry 3732 + content_type: Optional[str] = "html" # text, html, xhtml 3733 + author: Optional[dict] = None 3734 + categories: list[str] = [] 3735 + rights: Optional[str] = None # Copyright info 3736 + source: Optional[str] = None # Source feed URL 3737 + 3738 + 3739 + class FeedMetadata(BaseModel): 3740 + """Metadata extracted from a feed for auto-discovery.""" 3741 + 3742 + title: Optional[str] = None 3743 + author_name: Optional[str] = None 3744 + author_email: Optional[EmailStr] = None 3745 + author_uri: Optional[HttpUrl] = None 3746 + link: Optional[HttpUrl] = None 3747 + logo: Optional[HttpUrl] = None 3748 + icon: Optional[HttpUrl] = None 3749 + image_url: Optional[HttpUrl] = None 3750 + description: Optional[str] = None 3751 + 3752 + def to_user_config(self, username: str, feed_url: HttpUrl) -> "UserConfig": 3753 + """Convert discovered metadata to UserConfig with fallbacks.""" 3754 + from .config import UserConfig 3755 + 3756 + return UserConfig( 3757 + username=username, 3758 + feeds=[feed_url], 3759 + display_name=self.author_name or self.title, 3760 + email=self.author_email, 3761 + homepage=self.author_uri or self.link, 3762 + icon=self.logo or self.icon or self.image_url, 3763 + ) 3764 + 3765 + 3766 + class DuplicateMap(BaseModel): 3767 + """Maps duplicate entry IDs to canonical entry IDs.""" 3768 + 3769 + 
duplicates: dict[str, str] = {} # duplicate_id -> canonical_id 3770 + comment: str = "Entry IDs that map to the same canonical content" 3771 + 3772 + def add_duplicate(self, duplicate_id: str, canonical_id: str) -> None: 3773 + """Add a duplicate mapping.""" 3774 + self.duplicates[duplicate_id] = canonical_id 3775 + 3776 + def remove_duplicate(self, duplicate_id: str) -> bool: 3777 + """Remove a duplicate mapping. Returns True if existed.""" 3778 + return self.duplicates.pop(duplicate_id, None) is not None 3779 + 3780 + def get_canonical(self, entry_id: str) -> str: 3781 + """Get canonical ID for an entry (returns original if not duplicate).""" 3782 + return self.duplicates.get(entry_id, entry_id) 3783 + 3784 + def is_duplicate(self, entry_id: str) -> bool: 3785 + """Check if entry ID is marked as duplicate.""" 3786 + return entry_id in self.duplicates 3787 + 3788 + def get_duplicates_for_canonical(self, canonical_id: str) -> list[str]: 3789 + """Get all duplicate IDs that map to a canonical ID.""" 3790 + return [ 3791 + duplicate_id 3792 + for duplicate_id, canonical in self.duplicates.items() 3793 + if canonical == canonical_id 3794 + ] 3795 + </file> 3796 + 3797 + <file path="src/thicket/models/user.py"> 3798 + """User metadata models for thicket.""" 3799 + 3800 + from datetime import datetime 3801 + from typing import Optional 3802 + 3803 + from pydantic import BaseModel, ConfigDict 3804 + 3805 + 3806 + class UserMetadata(BaseModel): 3807 + """Metadata about a user stored in the Git repository.""" 3808 + 3809 + model_config = ConfigDict( 3810 + json_encoders={datetime: lambda v: v.isoformat()}, 3811 + str_strip_whitespace=True, 3812 + ) 3813 + 3814 + username: str 3815 + display_name: Optional[str] = None 3816 + email: Optional[str] = None 3817 + homepage: Optional[str] = None 3818 + icon: Optional[str] = None 3819 + feeds: list[str] = [] 3820 + directory: str # Directory name in Git store 3821 + created: datetime 3822 + last_updated: datetime 3823 + 
entry_count: int = 0 3824 + 3825 + def update_timestamp(self) -> None: 3826 + """Update the last_updated timestamp to now.""" 3827 + self.last_updated = datetime.now() 3828 + 3829 + def increment_entry_count(self, count: int = 1) -> None: 3830 + """Increment the entry count by the given amount.""" 3831 + self.entry_count += count 3832 + self.update_timestamp() 3833 + 3834 + 3835 + class GitStoreIndex(BaseModel): 3836 + """Index of all users and their directories in the Git store.""" 3837 + 3838 + model_config = ConfigDict( 3839 + json_encoders={datetime: lambda v: v.isoformat()} 3840 + ) 3841 + 3842 + users: dict[str, UserMetadata] = {} # username -> UserMetadata 3843 + created: datetime 3844 + last_updated: datetime 3845 + total_entries: int = 0 3846 + 3847 + def add_user(self, user_metadata: UserMetadata) -> None: 3848 + """Add or update a user in the index.""" 3849 + self.users[user_metadata.username] = user_metadata 3850 + self.last_updated = datetime.now() 3851 + 3852 + def remove_user(self, username: str) -> bool: 3853 + """Remove a user from the index. 
Returns True if user existed.""" 3854 + if username in self.users: 3855 + del self.users[username] 3856 + self.last_updated = datetime.now() 3857 + return True 3858 + return False 3859 + 3860 + def get_user(self, username: str) -> Optional[UserMetadata]: 3861 + """Get user metadata by username.""" 3862 + return self.users.get(username) 3863 + 3864 + def update_entry_count(self, username: str, count: int) -> None: 3865 + """Update entry count for a user and total.""" 3866 + user = self.get_user(username) 3867 + if user: 3868 + user.increment_entry_count(count) 3869 + self.total_entries += count 3870 + self.last_updated = datetime.now() 3871 + 3872 + def recalculate_totals(self) -> None: 3873 + """Recalculate total entries from all users.""" 3874 + self.total_entries = sum(user.entry_count for user in self.users.values()) 3875 + self.last_updated = datetime.now() 3876 + </file> 3877 + 3878 + <file path="src/thicket/utils/__init__.py"> 3879 + """Utility modules for thicket.""" 3880 + 3881 + # This module will contain shared utilities 3882 + # For now, it's empty but can be expanded with common functions 3883 + </file> 3884 + 3885 + <file path="src/thicket/__init__.py"> 3886 + """Thicket: A CLI tool for persisting Atom/RSS feeds in Git repositories.""" 3887 + 3888 + __version__ = "0.1.0" 3889 + __author__ = "thicket" 3890 + __email__ = "thicket@example.com" 3891 + </file> 3892 + 3893 + <file path="src/thicket/__main__.py"> 3894 + """Entry point for running thicket as a module.""" 3895 + 3896 + from .cli.main import app 3897 + 3898 + if __name__ == "__main__": 3899 + app() 3900 + </file> 3901 + 3902 + <file path=".gitignore"> 3903 + # Byte-compiled / optimized / DLL files 3904 + __pycache__/ 3905 + *.py[codz] 3906 + *$py.class 3907 + 3908 + # C extensions 3909 + *.so 3910 + 3911 + # Distribution / packaging 3912 + .Python 3913 + build/ 3914 + develop-eggs/ 3915 + dist/ 3916 + downloads/ 3917 + eggs/ 3918 + .eggs/ 3919 + lib/ 3920 + lib64/ 3921 + parts/ 3922 + sdist/ 
3923 + var/ 3924 + wheels/ 3925 + share/python-wheels/ 3926 + *.egg-info/ 3927 + .installed.cfg 3928 + *.egg 3929 + MANIFEST 3930 + 3931 + # PyInstaller 3932 + # Usually these files are written by a python script from a template 3933 + # before PyInstaller builds the exe, so as to inject date/other infos into it. 3934 + *.manifest 3935 + *.spec 3936 + 3937 + # Installer logs 3938 + pip-log.txt 3939 + pip-delete-this-directory.txt 3940 + 3941 + # Unit test / coverage reports 3942 + htmlcov/ 3943 + .tox/ 3944 + .nox/ 3945 + .coverage 3946 + .coverage.* 3947 + .cache 3948 + nosetests.xml 3949 + coverage.xml 3950 + *.cover 3951 + *.py.cover 3952 + .hypothesis/ 3953 + .pytest_cache/ 3954 + cover/ 3955 + 3956 + # Translations 3957 + *.mo 3958 + *.pot 3959 + 3960 + # Django stuff: 3961 + *.log 3962 + local_settings.py 3963 + db.sqlite3 3964 + db.sqlite3-journal 3965 + 3966 + # Flask stuff: 3967 + instance/ 3968 + .webassets-cache 3969 + 3970 + # Scrapy stuff: 3971 + .scrapy 3972 + 3973 + # Sphinx documentation 3974 + docs/_build/ 3975 + 3976 + # PyBuilder 3977 + .pybuilder/ 3978 + target/ 3979 + 3980 + # Jupyter Notebook 3981 + .ipynb_checkpoints 3982 + 3983 + # IPython 3984 + profile_default/ 3985 + ipython_config.py 3986 + 3987 + # pyenv 3988 + # For a library or package, you might want to ignore these files since the code is 3989 + # intended to run in multiple environments; otherwise, check them in: 3990 + # .python-version 3991 + 3992 + # pipenv 3993 + # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 3994 + # However, in case of collaboration, if having platform-specific dependencies or dependencies 3995 + # having no cross-platform support, pipenv may install dependencies that don't work, or not 3996 + # install all needed dependencies. 3997 + #Pipfile.lock 3998 + 3999 + # UV 4000 + # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control. 
4001 + # This is especially recommended for binary packages to ensure reproducibility, and is more 4002 + # commonly ignored for libraries. 4003 + #uv.lock 4004 + 4005 + # poetry 4006 + # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 4007 + # This is especially recommended for binary packages to ensure reproducibility, and is more 4008 + # commonly ignored for libraries. 4009 + # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 4010 + #poetry.lock 4011 + #poetry.toml 4012 + 4013 + # pdm 4014 + # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 4015 + # pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python. 4016 + # https://pdm-project.org/en/latest/usage/project/#working-with-version-control 4017 + #pdm.lock 4018 + #pdm.toml 4019 + .pdm-python 4020 + .pdm-build/ 4021 + 4022 + # pixi 4023 + # Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control. 4024 + #pixi.lock 4025 + # Pixi creates a virtual environment in the .pixi directory, just like venv module creates one 4026 + # in the .venv directory. It is recommended not to include this directory in version control. 4027 + .pixi 4028 + 4029 + # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 4030 + __pypackages__/ 4031 + 4032 + # Celery stuff 4033 + celerybeat-schedule 4034 + celerybeat.pid 4035 + 4036 + # SageMath parsed files 4037 + *.sage.py 4038 + 4039 + # Environments 4040 + .env 4041 + .envrc 4042 + .venv 4043 + env/ 4044 + venv/ 4045 + ENV/ 4046 + env.bak/ 4047 + venv.bak/ 4048 + 4049 + # Spyder project settings 4050 + .spyderproject 4051 + .spyproject 4052 + 4053 + # Rope project settings 4054 + .ropeproject 4055 + 4056 + # mkdocs documentation 4057 + /site 4058 + 4059 + # mypy 4060 + .mypy_cache/ 4061 + .dmypy.json 4062 + dmypy.json 4063 + 4064 + # Pyre type checker 4065 + .pyre/ 4066 + 4067 + # pytype static type analyzer 4068 + .pytype/ 4069 + 4070 + # Cython debug symbols 4071 + cython_debug/ 4072 + 4073 + # PyCharm 4074 + # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 4075 + # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 4076 + # and can be added to the global gitignore or merged into this file. For a more nuclear 4077 + # option (not recommended) you can uncomment the following to ignore the entire idea folder. 4078 + #.idea/ 4079 + 4080 + # Abstra 4081 + # Abstra is an AI-powered process automation framework. 4082 + # Ignore directories containing user credentials, local state, and settings. 4083 + # Learn more at https://abstra.io/docs 4084 + .abstra/ 4085 + 4086 + # Visual Studio Code 4087 + # Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore 4088 + # that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore 4089 + # and can be added to the global gitignore or merged into this file. 
However, if you prefer, 4090 + # you could uncomment the following to ignore the entire vscode folder 4091 + # .vscode/ 4092 + 4093 + # Ruff stuff: 4094 + .ruff_cache/ 4095 + 4096 + # PyPI configuration file 4097 + .pypirc 4098 + 4099 + # Marimo 4100 + marimo/_static/ 4101 + marimo/_lsp/ 4102 + __marimo__/ 4103 + 4104 + # Streamlit 4105 + .streamlit/secrets.toml 4106 + 4107 + thicket.yaml 4108 + </file> 4109 + 4110 + <file path="CLAUDE.md"> 4111 + My goal is to build a CLI tool called thicket in Python that maintains a Git repository within which Atom feeds can be persisted, including their contents. 4112 + 4113 + # Python Environment and Package Management 4114 + 4115 + This project uses `uv` for Python package management and virtual environment handling. 4116 + 4117 + ## Running Commands 4118 + 4119 + ALWAYS use `uv run` to execute Python commands: 4120 + 4121 + - Run the CLI: `uv run -m thicket` 4122 + - Run tests: `uv run pytest` 4123 + - Type checking: `uv run mypy src/` 4124 + - Linting: `uv run ruff check src/` 4125 + - Format code: `uv run ruff format src/` 4126 + - Compile check: `uv run python -m py_compile <file>` 4127 + 4128 + ## Package Management 4129 + 4130 + - Add dependencies: `uv add <package>` 4131 + - Add dev dependencies: `uv add --dev <package>` 4132 + - Install dependencies: `uv sync` 4133 + - Update dependencies: `uv lock --upgrade` 4134 + 4135 + # Project Structure 4136 + 4137 + The configuration file specifies: 4138 + - the location of a git store 4139 + - a list of usernames and target Atom/RSS feed(s) and optional metadata about the username such as their email, homepage, icon and display name 4140 + - a cache directory to store temporary results such as feed downloads and their last modification date that speed up operations across runs of the tool 4141 + 4142 + The Git data store should: 4143 + - have a subdirectory per user 4144 + - within that directory, an entry per Atom entry indexed by the Atom id for that entry. 
The id should be sanitised consistently to be a safe filename. RSS feed should be normalized to Atom before storing it. 4145 + - within each entry file, the metadata of the Atom feed converted into a JSON format that preserves as much metadata as possible. 4146 + - have a JSON file in the Git repository that indexes the users, their associated directories within the Git repository, and any other metadata about that user from the config file 4147 + The CLI should be modern and use cool progress bars and any other ecosystem libraries. 4148 + 4149 + The intention behind the Git repository is that it can be queried by other websites in order to build a webblog structure of comments that link to other blogs. 4150 + </file> 4151 + 4152 + <file path="pyproject.toml"> 4153 + [build-system] 4154 + requires = ["hatchling"] 4155 + build-backend = "hatchling.build" 4156 + 4157 + [project] 4158 + name = "thicket" 4159 + dynamic = ["version"] 4160 + description = "A CLI tool for persisting Atom/RSS feeds in Git repositories" 4161 + readme = "README.md" 4162 + license = "MIT" 4163 + requires-python = ">=3.9" 4164 + authors = [ 4165 + {name = "thicket", email = "thicket@example.com"}, 4166 + ] 4167 + classifiers = [ 4168 + "Development Status :: 3 - Alpha", 4169 + "Intended Audience :: Developers", 4170 + "License :: OSI Approved :: MIT License", 4171 + "Operating System :: OS Independent", 4172 + "Programming Language :: Python :: 3", 4173 + "Programming Language :: Python :: 3.9", 4174 + "Programming Language :: Python :: 3.10", 4175 + "Programming Language :: Python :: 3.11", 4176 + "Programming Language :: Python :: 3.12", 4177 + "Programming Language :: Python :: 3.13", 4178 + "Topic :: Internet :: WWW/HTTP :: Dynamic Content :: News/Diary", 4179 + "Topic :: Software Development :: Version Control :: Git", 4180 + "Topic :: Text Processing :: Markup :: XML", 4181 + ] 4182 + dependencies = [ 4183 + "typer>=0.15.0", 4184 + "rich>=13.0.0", 4185 + "GitPython>=3.1.40", 4186 + 
"feedparser>=6.0.11", 4187 + "pydantic>=2.11.0", 4188 + "pydantic-settings>=2.10.0", 4189 + "httpx>=0.28.0", 4190 + "pendulum>=3.0.0", 4191 + "bleach>=6.0.0", 4192 + "platformdirs>=4.0.0", 4193 + "pyyaml>=6.0.0", 4194 + "email_validator", 4195 + "jinja2>=3.1.6", 4196 + ] 4197 + 4198 + [project.optional-dependencies] 4199 + dev = [ 4200 + "pytest>=8.0.0", 4201 + "pytest-asyncio>=0.24.0", 4202 + "pytest-cov>=6.0.0", 4203 + "black>=24.0.0", 4204 + "ruff>=0.8.0", 4205 + "mypy>=1.13.0", 4206 + "types-PyYAML>=6.0.0", 4207 + ] 4208 + 4209 + [project.urls] 4210 + Homepage = "https://github.com/example/thicket" 4211 + Documentation = "https://github.com/example/thicket" 4212 + Repository = "https://github.com/example/thicket" 4213 + "Bug Tracker" = "https://github.com/example/thicket/issues" 4214 + 4215 + [project.scripts] 4216 + thicket = "thicket.cli.main:app" 4217 + 4218 + [tool.hatch.version] 4219 + path = "src/thicket/__init__.py" 4220 + 4221 + [tool.hatch.build.targets.wheel] 4222 + packages = ["src/thicket"] 4223 + 4224 + [tool.black] 4225 + line-length = 88 4226 + target-version = ['py39'] 4227 + include = '\.pyi?$' 4228 + extend-exclude = ''' 4229 + /( 4230 + # directories 4231 + \.eggs 4232 + | \.git 4233 + | \.hg 4234 + | \.mypy_cache 4235 + | \.tox 4236 + | \.venv 4237 + | build 4238 + | dist 4239 + )/ 4240 + ''' 4241 + 4242 + [tool.ruff] 4243 + target-version = "py39" 4244 + line-length = 88 4245 + 4246 + [tool.ruff.lint] 4247 + select = [ 4248 + "E", # pycodestyle errors 4249 + "W", # pycodestyle warnings 4250 + "F", # pyflakes 4251 + "I", # isort 4252 + "B", # flake8-bugbear 4253 + "C4", # flake8-comprehensions 4254 + "UP", # pyupgrade 4255 + ] 4256 + ignore = [ 4257 + "E501", # line too long, handled by black 4258 + "B008", # do not perform function calls in argument defaults 4259 + "C901", # too complex 4260 + ] 4261 + 4262 + [tool.ruff.lint.per-file-ignores] 4263 + "__init__.py" = ["F401"] 4264 + 4265 + [tool.mypy] 4266 + python_version = "3.9" 4267 + 
check_untyped_defs = true 4268 + disallow_any_generics = true 4269 + disallow_incomplete_defs = true 4270 + disallow_untyped_defs = true 4271 + no_implicit_optional = true 4272 + warn_redundant_casts = true 4273 + warn_unused_ignores = true 4274 + warn_return_any = true 4275 + strict_optional = true 4276 + 4277 + [[tool.mypy.overrides]] 4278 + module = [ 4279 + "feedparser", 4280 + "git", 4281 + "bleach", 4282 + ] 4283 + ignore_missing_imports = true 4284 + 4285 + [tool.pytest.ini_options] 4286 + testpaths = ["tests"] 4287 + python_files = ["test_*.py"] 4288 + python_classes = ["Test*"] 4289 + python_functions = ["test_*"] 4290 + addopts = [ 4291 + "-ra", 4292 + "--strict-markers", 4293 + "--strict-config", 4294 + "--cov=src/thicket", 4295 + "--cov-report=term-missing", 4296 + "--cov-report=html", 4297 + "--cov-report=xml", 4298 + ] 4299 + filterwarnings = [ 4300 + "error", 4301 + "ignore::UserWarning", 4302 + "ignore::DeprecationWarning", 4303 + ] 4304 + markers = [ 4305 + "slow: marks tests as slow (deselect with '-m \"not slow\"')", 4306 + "integration: marks tests as integration tests", 4307 + ] 4308 + 4309 + [tool.coverage.run] 4310 + source = ["src"] 4311 + branch = true 4312 + 4313 + [tool.coverage.report] 4314 + exclude_lines = [ 4315 + "pragma: no cover", 4316 + "def __repr__", 4317 + "if self.debug:", 4318 + "if settings.DEBUG", 4319 + "raise AssertionError", 4320 + "raise NotImplementedError", 4321 + "if 0:", 4322 + "if __name__ == .__main__.:", 4323 + "class .*\\bProtocol\\):", 4324 + "@(abc\\.)?abstractmethod", 4325 + ] 4326 + </file> 4327 + 4328 + <file path="src/thicket/cli/commands/__init__.py"> 4329 + """CLI commands for thicket.""" 4330 + 4331 + # Import all commands to register them with the main app 4332 + from . 
import add, duplicates, generate, index_cmd, info_cmd, init, links_cmd, list_cmd, sync 4333 + 4334 + __all__ = ["add", "duplicates", "generate", "index_cmd", "info_cmd", "init", "links_cmd", "list_cmd", "sync"] 4335 + </file> 4336 + 4337 + <file path="src/thicket/cli/commands/add.py"> 4338 + """Add command for thicket.""" 4339 + 4340 + import asyncio 4341 + from pathlib import Path 4342 + from typing import Optional 4343 + 4344 + import typer 4345 + from pydantic import HttpUrl, ValidationError 4346 + 4347 + from ...core.feed_parser import FeedParser 4348 + from ...core.git_store import GitStore 4349 + from ..main import app 4350 + from ..utils import ( 4351 + create_progress, 4352 + load_config, 4353 + print_error, 4354 + print_info, 4355 + print_success, 4356 + ) 4357 + 4358 + 4359 + @app.command("add") 4360 + def add_command( 4361 + subcommand: str = typer.Argument(..., help="Subcommand: 'user' or 'feed'"), 4362 + username: str = typer.Argument(..., help="Username"), 4363 + feed_url: Optional[str] = typer.Argument(None, help="Feed URL (required for 'user' command)"), 4364 + email: Optional[str] = typer.Option(None, "--email", "-e", help="User email"), 4365 + homepage: Optional[str] = typer.Option(None, "--homepage", "-h", help="User homepage"), 4366 + icon: Optional[str] = typer.Option(None, "--icon", "-i", help="User icon URL"), 4367 + display_name: Optional[str] = typer.Option(None, "--display-name", "-d", help="User display name"), 4368 + config_file: Optional[Path] = typer.Option( 4369 + Path("thicket.yaml"), "--config", help="Configuration file path" 4370 + ), 4371 + auto_discover: bool = typer.Option( 4372 + True, "--auto-discover/--no-auto-discover", help="Auto-discover user metadata from feed" 4373 + ), 4374 + ) -> None: 4375 + """Add a user or feed to thicket.""" 4376 + 4377 + if subcommand == "user": 4378 + add_user(username, feed_url, email, homepage, icon, display_name, config_file, auto_discover) 4379 + elif subcommand == "feed": 4380 + 
add_feed(username, feed_url, config_file) 4381 + else: 4382 + print_error(f"Unknown subcommand: {subcommand}") 4383 + print_error("Use 'user' or 'feed'") 4384 + raise typer.Exit(1) 4385 + 4386 + 4387 + def add_user( 4388 + username: str, 4389 + feed_url: Optional[str], 4390 + email: Optional[str], 4391 + homepage: Optional[str], 4392 + icon: Optional[str], 4393 + display_name: Optional[str], 4394 + config_file: Path, 4395 + auto_discover: bool, 4396 + ) -> None: 4397 + """Add a new user with feed.""" 4398 + 4399 + if not feed_url: 4400 + print_error("Feed URL is required when adding a user") 4401 + raise typer.Exit(1) 4402 + 4403 + # Validate feed URL 4404 + try: 4405 + validated_feed_url = HttpUrl(feed_url) 4406 + except ValidationError: 4407 + print_error(f"Invalid feed URL: {feed_url}") 4408 + raise typer.Exit(1) from None 4409 + 4410 + # Load configuration 4411 + config = load_config(config_file) 4412 + 4413 + # Initialize Git store 4414 + git_store = GitStore(config.git_store) 4415 + 4416 + # Check if user already exists 4417 + existing_user = git_store.get_user(username) 4418 + if existing_user: 4419 + print_error(f"User '{username}' already exists") 4420 + print_error("Use 'thicket add feed' to add additional feeds") 4421 + raise typer.Exit(1) 4422 + 4423 + # Auto-discover metadata if enabled 4424 + discovered_metadata = None 4425 + if auto_discover: 4426 + discovered_metadata = asyncio.run(discover_feed_metadata(validated_feed_url)) 4427 + 4428 + # Prepare user data with manual overrides taking precedence 4429 + user_display_name = display_name or (discovered_metadata.author_name or discovered_metadata.title if discovered_metadata else None) 4430 + user_email = email or (discovered_metadata.author_email if discovered_metadata else None) 4431 + user_homepage = homepage or (str(discovered_metadata.author_uri or discovered_metadata.link) if discovered_metadata else None) 4432 + user_icon = icon or (str(discovered_metadata.logo or discovered_metadata.icon or 
discovered_metadata.image_url) if discovered_metadata else None) 4433 + 4434 + # Add user to Git store 4435 + git_store.add_user( 4436 + username=username, 4437 + display_name=user_display_name, 4438 + email=user_email, 4439 + homepage=user_homepage, 4440 + icon=user_icon, 4441 + feeds=[str(validated_feed_url)], 4442 + ) 4443 + 4444 + # Commit changes 4445 + git_store.commit_changes(f"Add user: {username}") 4446 + 4447 + print_success(f"Added user '{username}' with feed: {feed_url}") 4448 + 4449 + if discovered_metadata and auto_discover: 4450 + print_info("Auto-discovered metadata:") 4451 + if user_display_name: 4452 + print_info(f" Display name: {user_display_name}") 4453 + if user_email: 4454 + print_info(f" Email: {user_email}") 4455 + if user_homepage: 4456 + print_info(f" Homepage: {user_homepage}") 4457 + if user_icon: 4458 + print_info(f" Icon: {user_icon}") 4459 + 4460 + 4461 + def add_feed(username: str, feed_url: Optional[str], config_file: Path) -> None: 4462 + """Add a feed to an existing user.""" 4463 + 4464 + if not feed_url: 4465 + print_error("Feed URL is required") 4466 + raise typer.Exit(1) 4467 + 4468 + # Validate feed URL 4469 + try: 4470 + validated_feed_url = HttpUrl(feed_url) 4471 + except ValidationError: 4472 + print_error(f"Invalid feed URL: {feed_url}") 4473 + raise typer.Exit(1) from None 4474 + 4475 + # Load configuration 4476 + config = load_config(config_file) 4477 + 4478 + # Initialize Git store 4479 + git_store = GitStore(config.git_store) 4480 + 4481 + # Check if user exists 4482 + user = git_store.get_user(username) 4483 + if not user: 4484 + print_error(f"User '{username}' not found") 4485 + print_error("Use 'thicket add user' to add a new user") 4486 + raise typer.Exit(1) 4487 + 4488 + # Check if feed already exists 4489 + if str(validated_feed_url) in user.feeds: 4490 + print_error(f"Feed already exists for user '{username}': {feed_url}") 4491 + raise typer.Exit(1) 4492 + 4493 + # Add feed to user 4494 + updated_feeds = 
user.feeds + [str(validated_feed_url)] 4495 + if git_store.update_user(username, feeds=updated_feeds): 4496 + git_store.commit_changes(f"Add feed to user {username}: {feed_url}") 4497 + print_success(f"Added feed to user '{username}': {feed_url}") 4498 + else: 4499 + print_error(f"Failed to add feed to user '{username}'") 4500 + raise typer.Exit(1) 4501 + 4502 + 4503 + async def discover_feed_metadata(feed_url: HttpUrl): 4504 + """Discover metadata from a feed URL.""" 4505 + try: 4506 + with create_progress() as progress: 4507 + task = progress.add_task("Discovering feed metadata...", total=None) 4508 + 4509 + parser = FeedParser() 4510 + content = await parser.fetch_feed(feed_url) 4511 + metadata, _ = parser.parse_feed(content, feed_url) 4512 + 4513 + progress.update(task, completed=True) 4514 + return metadata 4515 + 4516 + except Exception as e: 4517 + print_error(f"Failed to discover feed metadata: {e}") 4518 + return None 4519 + </file> 4520 + 4521 + <file path="src/thicket/cli/commands/duplicates.py"> 4522 + """Duplicates command for thicket.""" 4523 + 4524 + from pathlib import Path 4525 + from typing import Optional 4526 + 4527 + import typer 4528 + from rich.table import Table 4529 + 4530 + from ...core.git_store import GitStore 4531 + from ..main import app 4532 + from ..utils import ( 4533 + console, 4534 + load_config, 4535 + print_error, 4536 + print_info, 4537 + print_success, 4538 + get_tsv_mode, 4539 + ) 4540 + 4541 + 4542 + @app.command("duplicates") 4543 + def duplicates_command( 4544 + action: str = typer.Argument(..., help="Action: 'list', 'add', 'remove'"), 4545 + duplicate_id: Optional[str] = typer.Argument(None, help="Duplicate entry ID"), 4546 + canonical_id: Optional[str] = typer.Argument(None, help="Canonical entry ID"), 4547 + config_file: Optional[Path] = typer.Option( 4548 + Path("thicket.yaml"), "--config", help="Configuration file path" 4549 + ), 4550 + ) -> None: 4551 + """Manage duplicate entry mappings.""" 4552 + 4553 + # Load 
configuration 4554 + config = load_config(config_file) 4555 + 4556 + # Initialize Git store 4557 + git_store = GitStore(config.git_store) 4558 + 4559 + if action == "list": 4560 + list_duplicates(git_store) 4561 + elif action == "add": 4562 + add_duplicate(git_store, duplicate_id, canonical_id) 4563 + elif action == "remove": 4564 + remove_duplicate(git_store, duplicate_id) 4565 + else: 4566 + print_error(f"Unknown action: {action}") 4567 + print_error("Use 'list', 'add', or 'remove'") 4568 + raise typer.Exit(1) 4569 + 4570 + 4571 + def list_duplicates(git_store: GitStore) -> None: 4572 + """List all duplicate mappings.""" 4573 + duplicates = git_store.get_duplicates() 4574 + 4575 + if not duplicates.duplicates: 4576 + if get_tsv_mode(): 4577 + print("No duplicate mappings found") 4578 + else: 4579 + print_info("No duplicate mappings found") 4580 + return 4581 + 4582 + if get_tsv_mode(): 4583 + print("Duplicate ID\tCanonical ID") 4584 + for duplicate_id, canonical_id in duplicates.duplicates.items(): 4585 + print(f"{duplicate_id}\t{canonical_id}") 4586 + print(f"Total duplicates: {len(duplicates.duplicates)}") 4587 + else: 4588 + table = Table(title="Duplicate Entry Mappings") 4589 + table.add_column("Duplicate ID", style="red") 4590 + table.add_column("Canonical ID", style="green") 4591 + 4592 + for duplicate_id, canonical_id in duplicates.duplicates.items(): 4593 + table.add_row(duplicate_id, canonical_id) 4594 + 4595 + console.print(table) 4596 + print_info(f"Total duplicates: {len(duplicates.duplicates)}") 4597 + 4598 + 4599 + def add_duplicate(git_store: GitStore, duplicate_id: Optional[str], canonical_id: Optional[str]) -> None: 4600 + """Add a duplicate mapping.""" 4601 + if not duplicate_id: 4602 + print_error("Duplicate ID is required") 4603 + raise typer.Exit(1) 4604 + 4605 + if not canonical_id: 4606 + print_error("Canonical ID is required") 4607 + raise typer.Exit(1) 4608 + 4609 + # Check if duplicate_id already exists 4610 + duplicates = 
git_store.get_duplicates() 4611 + if duplicates.is_duplicate(duplicate_id): 4612 + existing_canonical = duplicates.get_canonical(duplicate_id) 4613 + print_error(f"Duplicate ID already mapped to: {existing_canonical}") 4614 + print_error("Use 'remove' first to change the mapping") 4615 + raise typer.Exit(1) 4616 + 4617 + # Check if we're trying to make a canonical ID point to itself 4618 + if duplicate_id == canonical_id: 4619 + print_error("Duplicate ID cannot be the same as canonical ID") 4620 + raise typer.Exit(1) 4621 + 4622 + # Add the mapping 4623 + git_store.add_duplicate(duplicate_id, canonical_id) 4624 + 4625 + # Commit changes 4626 + git_store.commit_changes(f"Add duplicate mapping: {duplicate_id} -> {canonical_id}") 4627 + 4628 + print_success(f"Added duplicate mapping: {duplicate_id} -> {canonical_id}") 4629 + 4630 + 4631 + def remove_duplicate(git_store: GitStore, duplicate_id: Optional[str]) -> None: 4632 + """Remove a duplicate mapping.""" 4633 + if not duplicate_id: 4634 + print_error("Duplicate ID is required") 4635 + raise typer.Exit(1) 4636 + 4637 + # Check if mapping exists 4638 + duplicates = git_store.get_duplicates() 4639 + if not duplicates.is_duplicate(duplicate_id): 4640 + print_error(f"No duplicate mapping found for: {duplicate_id}") 4641 + raise typer.Exit(1) 4642 + 4643 + canonical_id = duplicates.get_canonical(duplicate_id) 4644 + 4645 + # Remove the mapping 4646 + if git_store.remove_duplicate(duplicate_id): 4647 + # Commit changes 4648 + git_store.commit_changes(f"Remove duplicate mapping: {duplicate_id} -> {canonical_id}") 4649 + print_success(f"Removed duplicate mapping: {duplicate_id} -> {canonical_id}") 4650 + else: 4651 + print_error(f"Failed to remove duplicate mapping: {duplicate_id}") 4652 + raise typer.Exit(1) 4653 + </file> 4654 + 4655 + <file path="src/thicket/cli/commands/sync.py"> 4656 + """Sync command for thicket.""" 4657 + 4658 + import asyncio 4659 + from pathlib import Path 4660 + from typing import Optional 4661 + 
4662 + import typer 4663 + from rich.progress import track 4664 + 4665 + from ...core.feed_parser import FeedParser 4666 + from ...core.git_store import GitStore 4667 + from ..main import app 4668 + from ..utils import ( 4669 + load_config, 4670 + print_error, 4671 + print_info, 4672 + print_success, 4673 + ) 4674 + 4675 + 4676 + @app.command() 4677 + def sync( 4678 + all_users: bool = typer.Option( 4679 + False, "--all", "-a", help="Sync all users and feeds" 4680 + ), 4681 + user: Optional[str] = typer.Option( 4682 + None, "--user", "-u", help="Sync specific user only" 4683 + ), 4684 + config_file: Optional[Path] = typer.Option( 4685 + Path("thicket.yaml"), "--config", help="Configuration file path" 4686 + ), 4687 + dry_run: bool = typer.Option( 4688 + False, "--dry-run", help="Show what would be synced without making changes" 4689 + ), 4690 + ) -> None: 4691 + """Sync feeds and store entries in Git repository.""" 4692 + 4693 + # Load configuration 4694 + config = load_config(config_file) 4695 + 4696 + # Initialize Git store 4697 + git_store = GitStore(config.git_store) 4698 + 4699 + # Determine which users to sync from git repository 4700 + users_to_sync = [] 4701 + if all_users: 4702 + index = git_store._load_index() 4703 + users_to_sync = list(index.users.values()) 4704 + elif user: 4705 + user_metadata = git_store.get_user(user) 4706 + if not user_metadata: 4707 + print_error(f"User '{user}' not found in git repository") 4708 + raise typer.Exit(1) 4709 + users_to_sync = [user_metadata] 4710 + else: 4711 + print_error("Specify --all to sync all users or --user to sync a specific user") 4712 + raise typer.Exit(1) 4713 + 4714 + if not users_to_sync: 4715 + print_info("No users configured to sync") 4716 + return 4717 + 4718 + # Sync each user 4719 + total_new_entries = 0 4720 + total_updated_entries = 0 4721 + 4722 + for user_metadata in users_to_sync: 4723 + print_info(f"Syncing user: {user_metadata.username}") 4724 + 4725 + user_new_entries = 0 4726 + 
user_updated_entries = 0 4727 + 4728 + # Sync each feed for the user 4729 + for feed_url in track(user_metadata.feeds, description=f"Syncing {user_metadata.username}'s feeds"): 4730 + try: 4731 + new_entries, updated_entries = asyncio.run( 4732 + sync_feed(git_store, user_metadata.username, feed_url, dry_run) 4733 + ) 4734 + user_new_entries += new_entries 4735 + user_updated_entries += updated_entries 4736 + 4737 + except Exception as e: 4738 + print_error(f"Failed to sync feed {feed_url}: {e}") 4739 + continue 4740 + 4741 + print_info(f"User {user_metadata.username}: {user_new_entries} new, {user_updated_entries} updated") 4742 + total_new_entries += user_new_entries 4743 + total_updated_entries += user_updated_entries 4744 + 4745 + # Commit changes if not dry run 4746 + if not dry_run and (total_new_entries > 0 or total_updated_entries > 0): 4747 + commit_message = f"Sync feeds: {total_new_entries} new entries, {total_updated_entries} updated" 4748 + git_store.commit_changes(commit_message) 4749 + print_success(f"Committed changes: {commit_message}") 4750 + 4751 + # Summary 4752 + if dry_run: 4753 + print_info(f"Dry run complete: would sync {total_new_entries} new entries, {total_updated_entries} updated") 4754 + else: 4755 + print_success(f"Sync complete: {total_new_entries} new entries, {total_updated_entries} updated") 4756 + 4757 + 4758 + async def sync_feed(git_store: GitStore, username: str, feed_url, dry_run: bool) -> tuple[int, int]: 4759 + """Sync a single feed for a user.""" 4760 + 4761 + parser = FeedParser() 4762 + 4763 + try: 4764 + # Fetch and parse feed 4765 + content = await parser.fetch_feed(feed_url) 4766 + metadata, entries = parser.parse_feed(content, feed_url) 4767 + 4768 + new_entries = 0 4769 + updated_entries = 0 4770 + 4771 + # Process each entry 4772 + for entry in entries: 4773 + try: 4774 + # Check if entry already exists 4775 + existing_entry = git_store.get_entry(username, entry.id) 4776 + 4777 + if existing_entry: 4778 + # Check if 
entry has been updated 4779 + if existing_entry.updated != entry.updated: 4780 + if not dry_run: 4781 + git_store.store_entry(username, entry) 4782 + updated_entries += 1 4783 + else: 4784 + # New entry 4785 + if not dry_run: 4786 + git_store.store_entry(username, entry) 4787 + new_entries += 1 4788 + 4789 + except Exception as e: 4790 + print_error(f"Failed to process entry {entry.id}: {e}") 4791 + continue 4792 + 4793 + return new_entries, updated_entries 4794 + 4795 + except Exception as e: 4796 + print_error(f"Failed to sync feed {feed_url}: {e}") 4797 + return 0, 0 4798 + </file> 4799 + 4800 + <file path="src/thicket/models/config.py"> 4801 + """Configuration models for thicket.""" 4802 + 4803 + from pathlib import Path 4804 + from typing import Optional 4805 + 4806 + from pydantic import BaseModel, EmailStr, HttpUrl 4807 + from pydantic_settings import BaseSettings, SettingsConfigDict 4808 + 4809 + 4810 + class UserConfig(BaseModel): 4811 + """Configuration for a single user and their feeds.""" 4812 + 4813 + username: str 4814 + feeds: list[HttpUrl] 4815 + email: Optional[EmailStr] = None 4816 + homepage: Optional[HttpUrl] = None 4817 + icon: Optional[HttpUrl] = None 4818 + display_name: Optional[str] = None 4819 + 4820 + 4821 + class ThicketConfig(BaseSettings): 4822 + """Main configuration for thicket.""" 4823 + 4824 + model_config = SettingsConfigDict( 4825 + env_prefix="THICKET_", 4826 + env_file=".env", 4827 + yaml_file="thicket.yaml", 4828 + case_sensitive=False, 4829 + ) 4830 + 4831 + git_store: Path 4832 + cache_dir: Path 4833 + users: list[UserConfig] = [] 4834 + </file> 4835 + 4836 + <file path="src/thicket/cli/commands/links_cmd.py"> 4837 + """CLI command for extracting and categorizing all outbound links from blog entries.""" 4838 + 4839 + import json 4840 + import re 4841 + from pathlib import Path 4842 + from typing import Dict, List, Optional, Set 4843 + from urllib.parse import urljoin, urlparse 4844 + 4845 + import typer 4846 + from 
rich.console import Console 4847 + from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskProgressColumn 4848 + from rich.table import Table 4849 + 4850 + from ...core.git_store import GitStore 4851 + from ..main import app 4852 + from ..utils import load_config, get_tsv_mode 4853 + 4854 + console = Console() 4855 + 4856 + 4857 + class LinkData: 4858 + """Represents a link found in a blog entry.""" 4859 + 4860 + def __init__(self, url: str, entry_id: str, username: str): 4861 + self.url = url 4862 + self.entry_id = entry_id 4863 + self.username = username 4864 + 4865 + def to_dict(self) -> dict: 4866 + """Convert to dictionary for JSON serialization.""" 4867 + return { 4868 + "url": self.url, 4869 + "entry_id": self.entry_id, 4870 + "username": self.username 4871 + } 4872 + 4873 + @classmethod 4874 + def from_dict(cls, data: dict) -> "LinkData": 4875 + """Create from dictionary.""" 4876 + return cls( 4877 + url=data["url"], 4878 + entry_id=data["entry_id"], 4879 + username=data["username"] 4880 + ) 4881 + 4882 + 4883 + class LinkCategorizer: 4884 + """Categorizes links as internal, user, or unknown.""" 4885 + 4886 + def __init__(self, user_domains: Dict[str, Set[str]]): 4887 + self.user_domains = user_domains 4888 + # Create reverse mapping of domain -> username 4889 + self.domain_to_user = {} 4890 + for username, domains in user_domains.items(): 4891 + for domain in domains: 4892 + self.domain_to_user[domain] = username 4893 + 4894 + def categorize_url(self, url: str, source_username: str) -> tuple[str, Optional[str]]: 4895 + """ 4896 + Categorize a URL as 'internal', 'user', or 'unknown'. 4897 + Returns (category, target_username). 
4898 + """ 4899 + try: 4900 + parsed = urlparse(url) 4901 + domain = parsed.netloc.lower() 4902 + 4903 + # Check if it's a link to the same user's domain (internal) 4904 + if domain in self.user_domains.get(source_username, set()): 4905 + return "internal", source_username 4906 + 4907 + # Check if it's a link to another user's domain 4908 + if domain in self.domain_to_user: 4909 + return "user", self.domain_to_user[domain] 4910 + 4911 + # Everything else is unknown 4912 + return "unknown", None 4913 + 4914 + except Exception: 4915 + return "unknown", None 4916 + 4917 + 4918 + class LinkExtractor: 4919 + """Extracts and resolves links from blog entries.""" 4920 + 4921 + def __init__(self): 4922 + # Pattern for extracting links from HTML 4923 + self.link_pattern = re.compile(r'<a[^>]+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL) 4924 + self.url_pattern = re.compile(r'https?://[^\s<>"]+') 4925 + 4926 + def extract_links_from_html(self, html_content: str, base_url: str) -> List[tuple[str, str]]: 4927 + """Extract all links from HTML content and resolve them against base URL.""" 4928 + links = [] 4929 + 4930 + # Extract links from <a> tags 4931 + for match in self.link_pattern.finditer(html_content): 4932 + url = match.group(1) 4933 + text = re.sub(r'<[^>]+>', '', match.group(2)).strip() # Remove HTML tags from link text 4934 + 4935 + # Resolve relative URLs against base URL 4936 + resolved_url = urljoin(base_url, url) 4937 + links.append((resolved_url, text)) 4938 + 4939 + return links 4940 + 4941 + 4942 + def extract_links_from_entry(self, entry, username: str, base_url: str) -> List[LinkData]: 4943 + """Extract all links from a blog entry.""" 4944 + links = [] 4945 + 4946 + # Combine all text content for analysis 4947 + content_to_search = [] 4948 + if entry.content: 4949 + content_to_search.append(entry.content) 4950 + if entry.summary: 4951 + content_to_search.append(entry.summary) 4952 + 4953 + for content in content_to_search: 4954 + extracted_links 
= self.extract_links_from_html(content, base_url) 4955 + 4956 + for url, link_text in extracted_links: 4957 + # Skip empty URLs 4958 + if not url or url.startswith('#'): 4959 + continue 4960 + 4961 + link_data = LinkData( 4962 + url=url, 4963 + entry_id=entry.id, 4964 + username=username 4965 + ) 4966 + 4967 + links.append(link_data) 4968 + 4969 + return links 4970 + 4971 + 4972 + @app.command() 4973 + def links( 4974 + config_file: Optional[Path] = typer.Option( 4975 + Path("thicket.yaml"), 4976 + "--config", 4977 + "-c", 4978 + help="Path to configuration file", 4979 + ), 4980 + output_file: Optional[Path] = typer.Option( 4981 + None, 4982 + "--output", 4983 + "-o", 4984 + help="Path to output unified links file (default: links.json in git store)", 4985 + ), 4986 + verbose: bool = typer.Option( 4987 + False, 4988 + "--verbose", 4989 + "-v", 4990 + help="Show detailed progress information", 4991 + ), 4992 + ) -> None: 4993 + """Extract and categorize all outbound links from blog entries. 4994 + 4995 + This command analyzes all blog entries to extract outbound links, 4996 + resolve them properly with respect to the feed's base URL, and 4997 + categorize them as internal, user, or unknown links. 4998 + 4999 + Creates a unified links.json file containing all link data. 
5000 + """ 5001 + try: 5002 + # Load configuration 5003 + config = load_config(config_file) 5004 + 5005 + # Initialize Git store 5006 + git_store = GitStore(config.git_store) 5007 + 5008 + # Build user domain mapping 5009 + if verbose: 5010 + console.print("Building user domain mapping...") 5011 + 5012 + index = git_store._load_index() 5013 + user_domains = {} 5014 + 5015 + for username, user_metadata in index.users.items(): 5016 + domains = set() 5017 + 5018 + # Add domains from feeds 5019 + for feed_url in user_metadata.feeds: 5020 + domain = urlparse(feed_url).netloc.lower() 5021 + if domain: 5022 + domains.add(domain) 5023 + 5024 + # Add domain from homepage 5025 + if user_metadata.homepage: 5026 + domain = urlparse(str(user_metadata.homepage)).netloc.lower() 5027 + if domain: 5028 + domains.add(domain) 5029 + 5030 + user_domains[username] = domains 5031 + 5032 + if verbose: 5033 + console.print(f"Found {len(user_domains)} users with {sum(len(d) for d in user_domains.values())} total domains") 5034 + 5035 + # Initialize components 5036 + link_extractor = LinkExtractor() 5037 + categorizer = LinkCategorizer(user_domains) 5038 + 5039 + # Get all users 5040 + users = list(index.users.keys()) 5041 + 5042 + if not users: 5043 + console.print("[yellow]No users found in Git store[/yellow]") 5044 + raise typer.Exit(0) 5045 + 5046 + # Process all entries 5047 + all_links = [] 5048 + link_categories = {"internal": [], "user": [], "unknown": []} 5049 + link_dict = {} # Dictionary with link URL as key, maps to list of atom IDs 5050 + reverse_dict = {} # Dictionary with atom ID as key, maps to list of URLs 5051 + 5052 + with Progress( 5053 + SpinnerColumn(), 5054 + TextColumn("[progress.description]{task.description}"), 5055 + BarColumn(), 5056 + TaskProgressColumn(), 5057 + console=console, 5058 + ) as progress: 5059 + 5060 + # Count total entries first 5061 + counting_task = progress.add_task("Counting entries...", total=len(users)) 5062 + total_entries = 0 5063 + 5064 + 
for username in users: 5065 + entries = git_store.list_entries(username) 5066 + total_entries += len(entries) 5067 + progress.advance(counting_task) 5068 + 5069 + progress.remove_task(counting_task) 5070 + 5071 + # Process entries 5072 + processing_task = progress.add_task( 5073 + f"Processing {total_entries} entries...", 5074 + total=total_entries 5075 + ) 5076 + 5077 + for username in users: 5078 + entries = git_store.list_entries(username) 5079 + user_metadata = index.users[username] 5080 + 5081 + # Get base URL for this user (use first feed URL) 5082 + base_url = str(user_metadata.feeds[0]) if user_metadata.feeds else "https://example.com" 5083 + 5084 + for entry in entries: 5085 + # Extract links from this entry 5086 + entry_links = link_extractor.extract_links_from_entry(entry, username, base_url) 5087 + 5088 + # Track unique links per entry 5089 + entry_urls_seen = set() 5090 + 5091 + # Categorize each link 5092 + for link_data in entry_links: 5093 + # Skip if we've already seen this URL in this entry 5094 + if link_data.url in entry_urls_seen: 5095 + continue 5096 + entry_urls_seen.add(link_data.url) 5097 + 5098 + category, target_username = categorizer.categorize_url(link_data.url, username) 5099 + 5100 + # Add to link dictionary (URL as key, maps to list of atom IDs) 5101 + if link_data.url not in link_dict: 5102 + link_dict[link_data.url] = [] 5103 + if link_data.entry_id not in link_dict[link_data.url]: 5104 + link_dict[link_data.url].append(link_data.entry_id) 5105 + 5106 + # Also add to reverse mapping (atom ID -> list of URLs) 5107 + if link_data.entry_id not in reverse_dict: 5108 + reverse_dict[link_data.entry_id] = [] 5109 + if link_data.url not in reverse_dict[link_data.entry_id]: 5110 + reverse_dict[link_data.entry_id].append(link_data.url) 5111 + 5112 + # Add category info to link data for categories tracking 5113 + link_info = link_data.to_dict() 5114 + link_info["category"] = category 5115 + link_info["target_username"] = target_username 5116 
+ 5117 + all_links.append(link_info) 5118 + link_categories[category].append(link_info) 5119 + 5120 + progress.advance(processing_task) 5121 + 5122 + if verbose and entry_links: 5123 + console.print(f" Found {len(entry_links)} links in {username}:{entry.title[:50]}...") 5124 + 5125 + # Determine output path 5126 + if output_file: 5127 + output_path = output_file 5128 + else: 5129 + output_path = config.git_store / "links.json" 5130 + 5131 + # Save all extracted links (not just filtered ones) 5132 + if verbose: 5133 + console.print("Preparing output data...") 5134 + 5135 + # Build a set of all URLs that correspond to posts in the git database 5136 + registered_urls = set() 5137 + 5138 + # Get all entries from all users and build URL mappings 5139 + for username in users: 5140 + entries = git_store.list_entries(username) 5141 + user_metadata = index.users[username] 5142 + 5143 + for entry in entries: 5144 + # Try to match entry URLs with extracted links 5145 + if hasattr(entry, 'link') and entry.link: 5146 + registered_urls.add(str(entry.link)) 5147 + 5148 + # Also check entry alternate links if they exist 5149 + if hasattr(entry, 'links') and entry.links: 5150 + for link in entry.links: 5151 + if hasattr(link, 'href') and link.href: 5152 + registered_urls.add(str(link.href)) 5153 + 5154 + # Build unified structure with metadata 5155 + unified_links = {} 5156 + reverse_mapping = {} 5157 + 5158 + for url, entry_ids in link_dict.items(): 5159 + unified_links[url] = { 5160 + "referencing_entries": entry_ids 5161 + } 5162 + 5163 + # Find target username if this is a tracked post 5164 + if url in registered_urls: 5165 + for username in users: 5166 + user_domains_set = {domain for domain in user_domains.get(username, [])} 5167 + if any(domain in url for domain in user_domains_set): 5168 + unified_links[url]["target_username"] = username 5169 + break 5170 + 5171 + # Build reverse mapping 5172 + for entry_id in entry_ids: 5173 + if entry_id not in reverse_mapping: 5174 + 
reverse_mapping[entry_id] = [] 5175 + if url not in reverse_mapping[entry_id]: 5176 + reverse_mapping[entry_id].append(url) 5177 + 5178 + # Create unified output data 5179 + output_data = { 5180 + "links": unified_links, 5181 + "reverse_mapping": reverse_mapping, 5182 + "user_domains": {k: list(v) for k, v in user_domains.items()} 5183 + } 5184 + 5185 + if verbose: 5186 + console.print(f"Found {len(registered_urls)} registered post URLs") 5187 + console.print(f"Found {len(link_dict)} total links, {sum(1 for link in unified_links.values() if 'target_username' in link)} tracked posts") 5188 + 5189 + # Save unified data 5190 + with open(output_path, "w") as f: 5191 + json.dump(output_data, f, indent=2, default=str) 5192 + 5193 + # Show summary 5194 + if not get_tsv_mode(): 5195 + console.print("\n[green]✓ Links extraction completed successfully[/green]") 5196 + 5197 + # Create summary table or TSV output 5198 + if get_tsv_mode(): 5199 + print("Category\tCount\tDescription") 5200 + print(f"Internal\t{len(link_categories['internal'])}\tLinks to same user's domain") 5201 + print(f"User\t{len(link_categories['user'])}\tLinks to other tracked users") 5202 + print(f"Unknown\t{len(link_categories['unknown'])}\tLinks to external sites") 5203 + print(f"Total Extracted\t{len(all_links)}\tAll extracted links") 5204 + print(f"Saved to Output\t{len(output_data['links'])}\tLinks saved to output file") 5205 + print(f"Cross-references\t{sum(1 for link in unified_links.values() if 'target_username' in link)}\tLinks to registered posts only") 5206 + else: 5207 + table = Table(title="Links Summary") 5208 + table.add_column("Category", style="cyan") 5209 + table.add_column("Count", style="green") 5210 + table.add_column("Description", style="white") 5211 + 5212 + table.add_row("Internal", str(len(link_categories["internal"])), "Links to same user's domain") 5213 + table.add_row("User", str(len(link_categories["user"])), "Links to other tracked users") 5214 + table.add_row("Unknown", 
str(len(link_categories["unknown"])), "Links to external sites") 5215 + table.add_row("Total Extracted", str(len(all_links)), "All extracted links") 5216 + table.add_row("Saved to Output", str(len(output_data['links'])), "Links saved to output file") 5217 + table.add_row("Cross-references", str(sum(1 for link in unified_links.values() if 'target_username' in link)), "Links to registered posts only") 5218 + 5219 + console.print(table) 5220 + 5221 + # Show user links if verbose 5222 + if verbose and link_categories["user"]: 5223 + if get_tsv_mode(): 5224 + print("User Link Source\tUser Link Target\tLink Count") 5225 + user_link_counts = {} 5226 + 5227 + for link in link_categories["user"]: 5228 + key = f"{link['username']} -> {link['target_username']}" 5229 + user_link_counts[key] = user_link_counts.get(key, 0) + 1 5230 + 5231 + for link_pair, count in sorted(user_link_counts.items(), key=lambda x: x[1], reverse=True)[:10]: 5232 + source, target = link_pair.split(" -> ") 5233 + print(f"{source}\t{target}\t{count}") 5234 + else: 5235 + console.print("\n[bold]User-to-user links:[/bold]") 5236 + user_link_counts = {} 5237 + 5238 + for link in link_categories["user"]: 5239 + key = f"{link['username']} -> {link['target_username']}" 5240 + user_link_counts[key] = user_link_counts.get(key, 0) + 1 5241 + 5242 + for link_pair, count in sorted(user_link_counts.items(), key=lambda x: x[1], reverse=True)[:10]: 5243 + console.print(f" {link_pair}: {count} links") 5244 + 5245 + if not get_tsv_mode(): 5246 + console.print(f"\nUnified links data saved to: {output_path}") 5247 + 5248 + except Exception as e: 5249 + console.print(f"[red]Error extracting links: {e}[/red]") 5250 + if verbose: 5251 + console.print_exception() 5252 + raise typer.Exit(1) 5253 + </file> 5254 + 5255 + <file path="src/thicket/cli/commands/list_cmd.py"> 5256 + """List command for thicket.""" 5257 + 5258 + import re 5259 + from pathlib import Path 5260 + from typing import Optional 5261 + 5262 + import typer 
5263 + from rich.table import Table 5264 + 5265 + from ...core.git_store import GitStore 5266 + from ..main import app 5267 + from ..utils import ( 5268 + console, 5269 + load_config, 5270 + print_error, 5271 + print_feeds_table, 5272 + print_feeds_table_from_git, 5273 + print_info, 5274 + print_users_table, 5275 + print_users_table_from_git, 5276 + print_entries_tsv, 5277 + get_tsv_mode, 5278 + ) 5279 + 5280 + 5281 + @app.command("list") 5282 + def list_command( 5283 + what: str = typer.Argument(..., help="What to list: 'users', 'feeds', 'entries'"), 5284 + user: Optional[str] = typer.Option( 5285 + None, "--user", "-u", help="Filter by specific user" 5286 + ), 5287 + limit: Optional[int] = typer.Option( 5288 + None, "--limit", "-l", help="Limit number of results" 5289 + ), 5290 + config_file: Optional[Path] = typer.Option( 5291 + Path("thicket.yaml"), "--config", help="Configuration file path" 5292 + ), 5293 + ) -> None: 5294 + """List users, feeds, or entries.""" 5295 + 5296 + # Load configuration 5297 + config = load_config(config_file) 5298 + 5299 + # Initialize Git store 5300 + git_store = GitStore(config.git_store) 5301 + 5302 + if what == "users": 5303 + list_users(git_store) 5304 + elif what == "feeds": 5305 + list_feeds(git_store, user) 5306 + elif what == "entries": 5307 + list_entries(git_store, user, limit) 5308 + else: 5309 + print_error(f"Unknown list type: {what}") 5310 + print_error("Use 'users', 'feeds', or 'entries'") 5311 + raise typer.Exit(1) 5312 + 5313 + 5314 + def list_users(git_store: GitStore) -> None: 5315 + """List all users.""" 5316 + index = git_store._load_index() 5317 + users = list(index.users.values()) 5318 + 5319 + if not users: 5320 + print_info("No users configured") 5321 + return 5322 + 5323 + print_users_table_from_git(users) 5324 + 5325 + 5326 + def list_feeds(git_store: GitStore, username: Optional[str] = None) -> None: 5327 + """List feeds, optionally filtered by user.""" 5328 + if username: 5329 + user = 
git_store.get_user(username) 5330 + if not user: 5331 + print_error(f"User '{username}' not found") 5332 + raise typer.Exit(1) 5333 + 5334 + if not user.feeds: 5335 + print_info(f"No feeds configured for user '{username}'") 5336 + return 5337 + 5338 + print_feeds_table_from_git(git_store, username) 5339 + 5340 + 5341 + def list_entries(git_store: GitStore, username: Optional[str] = None, limit: Optional[int] = None) -> None: 5342 + """List entries, optionally filtered by user.""" 5343 + 5344 + if username: 5345 + # List entries for specific user 5346 + user = git_store.get_user(username) 5347 + if not user: 5348 + print_error(f"User '{username}' not found") 5349 + raise typer.Exit(1) 5350 + 5351 + entries = git_store.list_entries(username, limit) 5352 + if not entries: 5353 + print_info(f"No entries found for user '{username}'") 5354 + return 5355 + 5356 + print_entries_table([entries], [username]) 5357 + 5358 + else: 5359 + # List entries for all users 5360 + all_entries = [] 5361 + all_usernames = [] 5362 + 5363 + index = git_store._load_index() 5364 + for user in index.users.values(): 5365 + entries = git_store.list_entries(user.username, limit) 5366 + if entries: 5367 + all_entries.append(entries) 5368 + all_usernames.append(user.username) 5369 + 5370 + if not all_entries: 5371 + print_info("No entries found") 5372 + return 5373 + 5374 + print_entries_table(all_entries, all_usernames) 5375 + 5376 + 5377 + def _clean_html_content(content: Optional[str]) -> str: 5378 + """Clean HTML content for display in table.""" 5379 + if not content: 5380 + return "" 5381 + 5382 + # Remove HTML tags 5383 + clean_text = re.sub(r'<[^>]+>', ' ', content) 5384 + # Replace multiple whitespace with single space 5385 + clean_text = re.sub(r'\s+', ' ', clean_text) 5386 + # Strip and limit length 5387 + clean_text = clean_text.strip() 5388 + if len(clean_text) > 100: 5389 + clean_text = clean_text[:97] + "..." 
5390 + 5391 + return clean_text 5392 + 5393 + 5394 + def print_entries_table(entries_by_user: list[list], usernames: list[str]) -> None: 5395 + """Print a table of entries.""" 5396 + if get_tsv_mode(): 5397 + print_entries_tsv(entries_by_user, usernames) 5398 + return 5399 + 5400 + table = Table(title="Feed Entries") 5401 + table.add_column("User", style="cyan", no_wrap=True) 5402 + table.add_column("Title", style="bold") 5403 + table.add_column("Updated", style="blue") 5404 + table.add_column("URL", style="green") 5405 + 5406 + # Combine all entries with usernames 5407 + all_entries = [] 5408 + for entries, username in zip(entries_by_user, usernames): 5409 + for entry in entries: 5410 + all_entries.append((username, entry)) 5411 + 5412 + # Sort by updated time (newest first) 5413 + all_entries.sort(key=lambda x: x[1].updated, reverse=True) 5414 + 5415 + for username, entry in all_entries: 5416 + # Format updated time 5417 + updated_str = entry.updated.strftime("%Y-%m-%d %H:%M") 5418 + 5419 + # Truncate title if too long 5420 + title = entry.title 5421 + if len(title) > 50: 5422 + title = title[:47] + "..." 5423 + 5424 + table.add_row( 5425 + username, 5426 + title, 5427 + updated_str, 5428 + str(entry.link), 5429 + ) 5430 + 5431 + console.print(table) 5432 + </file> 5433 + 5434 + <file path="src/thicket/cli/main.py"> 5435 + """Main CLI application using Typer.""" 5436 + 5437 + import typer 5438 + from rich.console import Console 5439 + 5440 + from .. 
import __version__ 5441 + 5442 + app = typer.Typer( 5443 + name="thicket", 5444 + help="A CLI tool for persisting Atom/RSS feeds in Git repositories", 5445 + no_args_is_help=True, 5446 + rich_markup_mode="rich", 5447 + ) 5448 + 5449 + console = Console() 5450 + 5451 + # Global state for TSV output mode 5452 + tsv_mode = False 5453 + 5454 + 5455 + def version_callback(value: bool) -> None: 5456 + """Show version and exit.""" 5457 + if value: 5458 + console.print(f"thicket version {__version__}") 5459 + raise typer.Exit() 5460 + 5461 + 5462 + @app.callback() 5463 + def main( 5464 + version: bool = typer.Option( 5465 + None, 5466 + "--version", 5467 + "-v", 5468 + help="Show the version and exit", 5469 + callback=version_callback, 5470 + is_eager=True, 5471 + ), 5472 + tsv: bool = typer.Option( 5473 + False, 5474 + "--tsv", 5475 + help="Output in tab-separated values format without truncation", 5476 + ), 5477 + ) -> None: 5478 + """Thicket: A CLI tool for persisting Atom/RSS feeds in Git repositories.""" 5479 + global tsv_mode 5480 + tsv_mode = tsv 5481 + 5482 + 5483 + # Import commands to register them 5484 + from .commands import add, duplicates, generate, index_cmd, info_cmd, init, links_cmd, list_cmd, sync 5485 + 5486 + if __name__ == "__main__": 5487 + app() 5488 + </file> 5489 + 5490 + <file path="src/thicket/core/git_store.py"> 5491 + """Git repository operations for thicket.""" 5492 + 5493 + import json 5494 + from datetime import datetime 5495 + from pathlib import Path 5496 + from typing import Optional 5497 + 5498 + import git 5499 + from git import Repo 5500 + 5501 + from ..models import AtomEntry, DuplicateMap, GitStoreIndex, UserMetadata 5502 + 5503 + 5504 + class GitStore: 5505 + """Manages the Git repository for storing feed entries.""" 5506 + 5507 + def __init__(self, repo_path: Path): 5508 + """Initialize the Git store.""" 5509 + self.repo_path = repo_path 5510 + self.repo: Optional[Repo] = None 5511 + self._ensure_repo() 5512 + 5513 + def 
_ensure_repo(self) -> None: 5514 + """Ensure the Git repository exists and is initialized.""" 5515 + if not self.repo_path.exists(): 5516 + self.repo_path.mkdir(parents=True, exist_ok=True) 5517 + 5518 + try: 5519 + self.repo = Repo(self.repo_path) 5520 + except git.InvalidGitRepositoryError: 5521 + # Initialize new repository 5522 + self.repo = Repo.init(self.repo_path) 5523 + self._create_initial_structure() 5524 + 5525 + def _create_initial_structure(self) -> None: 5526 + """Create initial Git store structure.""" 5527 + # Create index.json 5528 + index = GitStoreIndex( 5529 + created=datetime.now(), 5530 + last_updated=datetime.now(), 5531 + ) 5532 + self._save_index(index) 5533 + 5534 + # Create duplicates.json 5535 + duplicates = DuplicateMap() 5536 + self._save_duplicates(duplicates) 5537 + 5538 + # Create initial commit 5539 + self.repo.index.add(["index.json", "duplicates.json"]) 5540 + self.repo.index.commit("Initial thicket repository structure") 5541 + 5542 + def _save_index(self, index: GitStoreIndex) -> None: 5543 + """Save the index to index.json.""" 5544 + index_path = self.repo_path / "index.json" 5545 + with open(index_path, "w") as f: 5546 + json.dump(index.model_dump(mode="json", exclude_none=True), f, indent=2, default=str) 5547 + 5548 + def _load_index(self) -> GitStoreIndex: 5549 + """Load the index from index.json.""" 5550 + index_path = self.repo_path / "index.json" 5551 + if not index_path.exists(): 5552 + return GitStoreIndex( 5553 + created=datetime.now(), 5554 + last_updated=datetime.now(), 5555 + ) 5556 + 5557 + with open(index_path) as f: 5558 + data = json.load(f) 5559 + 5560 + return GitStoreIndex(**data) 5561 + 5562 + def _save_duplicates(self, duplicates: DuplicateMap) -> None: 5563 + """Save duplicates map to duplicates.json.""" 5564 + duplicates_path = self.repo_path / "duplicates.json" 5565 + with open(duplicates_path, "w") as f: 5566 + json.dump(duplicates.model_dump(exclude_none=True), f, indent=2) 5567 + 5568 + def 
_load_duplicates(self) -> DuplicateMap: 5569 + """Load duplicates map from duplicates.json.""" 5570 + duplicates_path = self.repo_path / "duplicates.json" 5571 + if not duplicates_path.exists(): 5572 + return DuplicateMap() 5573 + 5574 + with open(duplicates_path) as f: 5575 + data = json.load(f) 5576 + 5577 + return DuplicateMap(**data) 5578 + 5579 + def add_user(self, username: str, display_name: Optional[str] = None, 5580 + email: Optional[str] = None, homepage: Optional[str] = None, 5581 + icon: Optional[str] = None, feeds: Optional[list[str]] = None) -> UserMetadata: 5582 + """Add a new user to the Git store.""" 5583 + index = self._load_index() 5584 + 5585 + # Create user directory 5586 + user_dir = self.repo_path / username 5587 + user_dir.mkdir(exist_ok=True) 5588 + 5589 + # Create user metadata 5590 + user_metadata = UserMetadata( 5591 + username=username, 5592 + display_name=display_name, 5593 + email=email, 5594 + homepage=homepage, 5595 + icon=icon, 5596 + feeds=feeds or [], 5597 + directory=username, 5598 + created=datetime.now(), 5599 + last_updated=datetime.now(), 5600 + ) 5601 + 5602 + 5603 + # Update index 5604 + index.add_user(user_metadata) 5605 + self._save_index(index) 5606 + 5607 + return user_metadata 5608 + 5609 + def get_user(self, username: str) -> Optional[UserMetadata]: 5610 + """Get user metadata by username.""" 5611 + index = self._load_index() 5612 + return index.get_user(username) 5613 + 5614 + def update_user(self, username: str, **kwargs) -> bool: 5615 + """Update user metadata.""" 5616 + index = self._load_index() 5617 + user = index.get_user(username) 5618 + 5619 + if not user: 5620 + return False 5621 + 5622 + # Update user metadata 5623 + for key, value in kwargs.items(): 5624 + if hasattr(user, key) and value is not None: 5625 + setattr(user, key, value) 5626 + 5627 + user.update_timestamp() 5628 + 5629 + 5630 + # Update index 5631 + index.add_user(user) 5632 + self._save_index(index) 5633 + 5634 + return True 5635 + 5636 + 
def store_entry(self, username: str, entry: AtomEntry) -> bool: 5637 + """Store an entry in the user's directory.""" 5638 + user = self.get_user(username) 5639 + if not user: 5640 + return False 5641 + 5642 + # Sanitize entry ID for filename 5643 + from .feed_parser import FeedParser 5644 + parser = FeedParser() 5645 + safe_id = parser.sanitize_entry_id(entry.id) 5646 + 5647 + # Create entry file 5648 + user_dir = self.repo_path / user.directory 5649 + entry_path = user_dir / f"{safe_id}.json" 5650 + 5651 + # Check if entry already exists 5652 + entry_exists = entry_path.exists() 5653 + 5654 + # Save entry 5655 + with open(entry_path, "w") as f: 5656 + json.dump(entry.model_dump(mode="json", exclude_none=True), f, indent=2, default=str) 5657 + 5658 + # Update user metadata if new entry 5659 + if not entry_exists: 5660 + index = self._load_index() 5661 + index.update_entry_count(username, 1) 5662 + self._save_index(index) 5663 + 5664 + return True 5665 + 5666 + def get_entry(self, username: str, entry_id: str) -> Optional[AtomEntry]: 5667 + """Get an entry by username and entry ID.""" 5668 + user = self.get_user(username) 5669 + if not user: 5670 + return None 5671 + 5672 + # Sanitize entry ID 5673 + from .feed_parser import FeedParser 5674 + parser = FeedParser() 5675 + safe_id = parser.sanitize_entry_id(entry_id) 5676 + 5677 + entry_path = self.repo_path / user.directory / f"{safe_id}.json" 5678 + if not entry_path.exists(): 5679 + return None 5680 + 5681 + with open(entry_path) as f: 5682 + data = json.load(f) 5683 + 5684 + return AtomEntry(**data) 5685 + 5686 + def list_entries(self, username: str, limit: Optional[int] = None) -> list[AtomEntry]: 5687 + """List entries for a user.""" 5688 + user = self.get_user(username) 5689 + if not user: 5690 + return [] 5691 + 5692 + user_dir = self.repo_path / user.directory 5693 + if not user_dir.exists(): 5694 + return [] 5695 + 5696 + entries = [] 5697 + entry_files = sorted(user_dir.glob("*.json"), key=lambda p: 
p.stat().st_mtime, reverse=True) 5698 + 5699 + 5700 + if limit: 5701 + entry_files = entry_files[:limit] 5702 + 5703 + for entry_file in entry_files: 5704 + try: 5705 + with open(entry_file) as f: 5706 + data = json.load(f) 5707 + entries.append(AtomEntry(**data)) 5708 + except Exception: 5709 + # Skip invalid entries 5710 + continue 5711 + 5712 + return entries 5713 + 5714 + def get_duplicates(self) -> DuplicateMap: 5715 + """Get the duplicates map.""" 5716 + return self._load_duplicates() 5717 + 5718 + def add_duplicate(self, duplicate_id: str, canonical_id: str) -> None: 5719 + """Add a duplicate mapping.""" 5720 + duplicates = self._load_duplicates() 5721 + duplicates.add_duplicate(duplicate_id, canonical_id) 5722 + self._save_duplicates(duplicates) 5723 + 5724 + def remove_duplicate(self, duplicate_id: str) -> bool: 5725 + """Remove a duplicate mapping.""" 5726 + duplicates = self._load_duplicates() 5727 + result = duplicates.remove_duplicate(duplicate_id) 5728 + self._save_duplicates(duplicates) 5729 + return result 5730 + 5731 + def commit_changes(self, message: str) -> None: 5732 + """Commit all changes to the Git repository.""" 5733 + if not self.repo: 5734 + return 5735 + 5736 + # Add all changes 5737 + self.repo.git.add(A=True) 5738 + 5739 + # Check if there are changes to commit 5740 + if self.repo.index.diff("HEAD"): 5741 + self.repo.index.commit(message) 5742 + 5743 + def get_stats(self) -> dict: 5744 + """Get statistics about the Git store.""" 5745 + index = self._load_index() 5746 + duplicates = self._load_duplicates() 5747 + 5748 + return { 5749 + "total_users": len(index.users), 5750 + "total_entries": index.total_entries, 5751 + "total_duplicates": len(duplicates.duplicates), 5752 + "last_updated": index.last_updated, 5753 + "repository_size": sum(f.stat().st_size for f in self.repo_path.rglob("*") if f.is_file()), 5754 + } 5755 + 5756 + def search_entries(self, query: str, username: Optional[str] = None, 5757 + limit: Optional[int] = None) -> 
list[tuple[str, AtomEntry]]: 5758 + """Search entries by content.""" 5759 + results = [] 5760 + 5761 + # Get users to search 5762 + index = self._load_index() 5763 + users = [index.get_user(username)] if username else list(index.users.values()) 5764 + users = [u for u in users if u is not None] 5765 + 5766 + for user in users: 5767 + user_dir = self.repo_path / user.directory 5768 + if not user_dir.exists(): 5769 + continue 5770 + 5771 + entry_files = user_dir.glob("*.json") 5772 + 5773 + for entry_file in entry_files: 5774 + try: 5775 + with open(entry_file) as f: 5776 + data = json.load(f) 5777 + 5778 + entry = AtomEntry(**data) 5779 + 5780 + # Simple text search in title, summary, and content 5781 + searchable_text = " ".join(filter(None, [ 5782 + entry.title, 5783 + entry.summary or "", 5784 + entry.content or "", 5785 + ])).lower() 5786 + 5787 + if query.lower() in searchable_text: 5788 + results.append((user.username, entry)) 5789 + 5790 + if limit and len(results) >= limit: 5791 + return results 5792 + 5793 + except Exception: 5794 + # Skip invalid entries 5795 + continue 5796 + 5797 + # Sort by updated time (newest first) 5798 + results.sort(key=lambda x: x[1].updated, reverse=True) 5799 + 5800 + return results[:limit] if limit else results 5801 + </file> 5802 + 5803 + <file path="ARCH.md"> 5804 + # Thicket Architecture Design 5805 + 5806 + ## Overview 5807 + Thicket is a modern CLI tool for persisting Atom/RSS feeds in a Git repository, designed to enable distributed weblog comment structures.
5808 + 5809 + ## Technology Stack 5810 + 5811 + ### Core Libraries 5812 + 5813 + #### CLI Framework 5814 + - **Typer** (0.15.x) - Modern CLI framework with type hints 5815 + - **Rich** (13.x) - Beautiful terminal output, progress bars, and tables 5816 + - **prompt-toolkit** - Interactive prompts when needed 5817 + 5818 + #### Feed Processing 5819 + - **feedparser** (6.0.11) - Universal feed parser supporting RSS 0.9x, RSS 1.0, RSS 2.0, CDF, Atom 0.3, and Atom 1.0 5820 + - Alternative: **atoma** for stricter Atom/RSS parsing with JSON feed support 5821 + - Alternative: **fastfeedparser** for high-performance parsing (10x faster) 5822 + 5823 + #### Git Integration 5824 + - **GitPython** (3.1.44) - High-level git operations, requires git CLI 5825 + - Alternative: **pygit2** (1.18.0) - Direct libgit2 bindings, better for authentication 5826 + 5827 + #### HTTP Client 5828 + - **httpx** (0.28.x) - Modern async/sync HTTP client with connection pooling 5829 + - **aiohttp** (3.11.x) - For async-only operations if needed 5830 + 5831 + #### Configuration & Data Models 5832 + - **pydantic** (2.11.x) - Data validation and settings management 5833 + - **pydantic-settings** (2.10.x) - Configuration file handling with env var support 5834 + 5835 + #### Utilities 5836 + - **pendulum** (3.x) - Better datetime handling 5837 + - **bleach** (6.x) - HTML sanitization for feed content 5838 + - **platformdirs** (4.x) - Cross-platform directory paths 5839 + 5840 + ## Project Structure 5841 + 5842 + ``` 5843 + thicket/ 5844 + ├── pyproject.toml # Modern Python packaging 5845 + ├── README.md # Project documentation 5846 + ├── ARCH.md # This file 5847 + ├── CLAUDE.md # Project instructions 5848 + ├── .gitignore 5849 + ├── src/ 5850 + │ └── thicket/ 5851 + │ ├── __init__.py 5852 + │ ├── __main__.py # Entry point for `python -m thicket` 5853 + │ ├── cli/ # CLI commands and interface 5854 + │ │ ├── __init__.py 5855 + │ │ ├── main.py # Main CLI app with Typer 5856 + │ │ ├── commands/ # 
Subcommands 5857 + │ │ │ ├── __init__.py 5858 + │ │ │ ├── init.py # Initialize git store 5859 + │ │ │ ├── add.py # Add users and feeds 5860 + │ │ │ ├── sync.py # Sync feeds 5861 + │ │ │ ├── list_cmd.py # List users/feeds 5862 + │ │ │ ├── duplicates.py # Manage duplicate entries 5863 + │ │ │ ├── links_cmd.py # Extract and categorize links 5864 + │ │ │ └── index_cmd.py # Build reference index and show threads 5865 + │ │ └── utils.py # CLI utilities (progress, formatting) 5866 + │ ├── core/ # Core business logic 5867 + │ │ ├── __init__.py 5868 + │ │ ├── feed_parser.py # Feed parsing and normalization 5869 + │ │ ├── git_store.py # Git repository operations 5870 + │ │ └── reference_parser.py # Link extraction and threading 5871 + │ ├── models/ # Pydantic data models 5872 + │ │ ├── __init__.py 5873 + │ │ ├── config.py # Configuration models 5874 + │ │ ├── feed.py # Feed/Entry models 5875 + │ │ └── user.py # User metadata models 5876 + │ └── utils/ # Shared utilities 5877 + │ └── __init__.py 5878 + ├── tests/ 5879 + │ ├── __init__.py 5880 + │ ├── conftest.py # pytest configuration 5881 + │ ├── test_feed_parser.py 5882 + │ ├── test_git_store.py 5883 + │ └── fixtures/ # Test data 5884 + │ └── feeds/ 5885 + └── docs/ 5886 + └── examples/ # Example configurations 5887 + ``` 5888 + 5889 + ## Data Models 5890 + 5891 + ### Configuration File (YAML/TOML) 5892 + ```python 5893 + class ThicketConfig(BaseSettings): 5894 + git_store: Path # Git repository location 5895 + cache_dir: Path # Cache directory 5896 + users: list[UserConfig] 5897 + 5898 + model_config = SettingsConfigDict( 5899 + env_prefix="THICKET_", 5900 + env_file=".env", 5901 + yaml_file="thicket.yaml" 5902 + ) 5903 + 5904 + class UserConfig(BaseModel): 5905 + username: str 5906 + feeds: list[HttpUrl] 5907 + email: Optional[EmailStr] = None 5908 + homepage: Optional[HttpUrl] = None 5909 + icon: Optional[HttpUrl] = None 5910 + display_name: Optional[str] = None 5911 + ``` 5912 + 5913 + ### Feed Storage Format 5914 + 
```python 5915 + class AtomEntry(BaseModel): 5916 + id: str # Original Atom ID 5917 + title: str 5918 + link: HttpUrl 5919 + updated: datetime 5920 + published: Optional[datetime] 5921 + summary: Optional[str] 5922 + content: Optional[str] # Full body content from Atom entry 5923 + content_type: Optional[str] = "html" # text, html, xhtml 5924 + author: Optional[dict] 5925 + categories: list[str] = [] 5926 + rights: Optional[str] = None # Copyright info 5927 + source: Optional[str] = None # Source feed URL 5928 + # Additional Atom fields preserved during RSS->Atom conversion 5929 + 5930 + model_config = ConfigDict( 5931 + json_encoders={ 5932 + datetime: lambda v: v.isoformat() 5933 + } 5934 + ) 5935 + 5936 + class DuplicateMap(BaseModel): 5937 + """Maps duplicate entry IDs to canonical entry IDs""" 5938 + duplicates: dict[str, str] = {} # duplicate_id -> canonical_id 5939 + comment: str = "Entry IDs that map to the same canonical content" 5940 + 5941 + def add_duplicate(self, duplicate_id: str, canonical_id: str) -> None: 5942 + """Add a duplicate mapping""" 5943 + self.duplicates[duplicate_id] = canonical_id 5944 + 5945 + def remove_duplicate(self, duplicate_id: str) -> bool: 5946 + """Remove a duplicate mapping. 
Returns True if existed.""" 5947 + return self.duplicates.pop(duplicate_id, None) is not None 5948 + 5949 + def get_canonical(self, entry_id: str) -> str: 5950 + """Get canonical ID for an entry (returns original if not duplicate)""" 5951 + return self.duplicates.get(entry_id, entry_id) 5952 + 5953 + def is_duplicate(self, entry_id: str) -> bool: 5954 + """Check if entry ID is marked as duplicate""" 5955 + return entry_id in self.duplicates 5956 + ``` 5957 + 5958 + ## Git Repository Structure 5959 + ``` 5960 + git-store/ 5961 + ├── index.json # User directory index 5962 + ├── duplicates.json # Manual curation of duplicate entries 5963 + ├── links.json # Unified links, references, and mapping data 5964 + ├── user1/ 5965 + │ ├── entry_id_1.json # Sanitized entry files 5966 + │ ├── entry_id_2.json 5967 + │ └── ... 5968 + └── user2/ 5969 + └── ... 5970 + ``` 5971 + 5972 + ## Key Design Decisions 5973 + 5974 + ### 1. Feed Normalization & Auto-Discovery 5975 + - All RSS feeds converted to Atom format before storage 5976 + - Preserves maximum metadata during conversion 5977 + - Sanitizes HTML content to prevent XSS 5978 + - **Auto-discovery**: Extracts user metadata from feed during `add user` command 5979 + 5980 + ### 2. ID Sanitization 5981 + - Consistent algorithm to convert Atom IDs to safe filenames 5982 + - Handles edge cases (very long IDs, special characters) 5983 + - Maintains reversibility where possible 5984 + 5985 + ### 3. Git Operations 5986 + - Uses GitPython for simplicity (no authentication required) 5987 + - Single main branch for all users and entries 5988 + - Atomic commits per sync operation 5989 + - Meaningful commit messages with feed update summaries 5990 + - Preserves complete history - never delete entries even if they disappear from feeds 5991 + 5992 + ### 4. 
Caching Strategy 5993 + - HTTP caching with Last-Modified/ETag support 5994 + - Local cache of parsed feeds with TTL 5995 + - Cache invalidation on configuration changes 5996 + - Git store serves as permanent historical archive beyond feed depth limits 5997 + 5998 + ### 5. Error Handling 5999 + - Graceful handling of feed parsing errors 6000 + - Retry logic for network failures 6001 + - Clear error messages with recovery suggestions 6002 + 6003 + ## CLI Command Structure 6004 + 6005 + ```bash 6006 + # Initialize a new git store 6007 + thicket init /path/to/store 6008 + 6009 + # Add a user with feeds (auto-discovers metadata from feed) 6010 + thicket add user "alyssa" \ 6011 + --feed "https://example.com/feed.atom" 6012 + # Auto-populates: email, homepage, icon, display_name from feed metadata 6013 + 6014 + # Add a user with manual overrides 6015 + thicket add user "alyssa" \ 6016 + --feed "https://example.com/feed.atom" \ 6017 + --email "alyssa@example.com" \ 6018 + --homepage "https://alyssa.example.com" \ 6019 + --icon "https://example.com/avatar.png" \ 6020 + --display-name "Alyssa P. 
Hacker" 6021 + 6022 + # Add additional feed to existing user 6023 + thicket add feed "alyssa" "https://example.com/other-feed.rss" 6024 + 6025 + # Sync all feeds (designed for cron usage) 6026 + thicket sync --all 6027 + 6028 + # Sync specific user 6029 + thicket sync --user alyssa 6030 + 6031 + # List users and their feeds 6032 + thicket list users 6033 + thicket list feeds --user alyssa 6034 + 6035 + # Manage duplicate entries 6036 + thicket duplicates list 6037 + thicket duplicates add <entry_id_1> <entry_id_2> # Mark as duplicates 6038 + thicket duplicates remove <entry_id_1> <entry_id_2> # Unmark duplicates 6039 + 6040 + # Link processing and threading 6041 + thicket links --verbose # Extract and categorize all links 6042 + thicket index --verbose # Build reference index for threading 6043 + thicket threads # Show conversation threads 6044 + thicket threads --username user1 # Show threads for specific user 6045 + thicket threads --min-size 3 # Show threads with minimum size 6046 + ``` 6047 + 6048 + ## Performance Considerations 6049 + 6050 + 1. **Concurrent Feed Fetching**: Use httpx with asyncio for parallel downloads 6051 + 2. **Incremental Updates**: Only fetch/parse feeds that have changed 6052 + 3. **Efficient Git Operations**: Batch commits, use shallow clones where appropriate 6053 + 4. **Progress Feedback**: Rich progress bars for long operations 6054 + 6055 + ## Security Considerations 6056 + 6057 + 1. **HTML Sanitization**: Use bleach to clean feed content 6058 + 2. **URL Validation**: Strict validation of feed URLs 6059 + 3. **Git Security**: No credentials stored in repository 6060 + 4. **Path Traversal**: Careful sanitization of filenames 6061 + 6062 + ## Future Enhancements 6063 + 6064 + 1. **Web Interface**: Optional web UI for browsing the git store 6065 + 2. **Webhooks**: Notify external services on feed updates 6066 + 3. **Feed Discovery**: Auto-discover feeds from HTML pages 6067 + 4. 
**Export Formats**: Generate static sites, OPML exports 6068 + 5. **Federation**: P2P sync between thicket instances 6069 + 6070 + ## Requirements Clarification 6071 + 6072 + **✓ Resolved Requirements:** 6073 + 1. **Feed Update Frequency**: Designed for cron usage - no built-in scheduling needed 6074 + 2. **Duplicate Handling**: Manual curation via `duplicates.json` file with CLI commands 6075 + 3. **Git Branching**: Single main branch for all users and entries 6076 + 4. **Authentication**: No feeds require authentication currently 6077 + 5. **Content Storage**: Store complete Atom entry body content as provided 6078 + 6. **Deleted Entries**: Preserve all entries in Git store permanently (historical archive) 6079 + 7. **History Depth**: Git store maintains full history beyond feed depth limits 6080 + 8. **Feed Auto-Discovery**: Extract user metadata from feed during `add user` command 6081 + 6082 + ## Duplicate Entry Management 6083 + 6084 + ### Duplicate Detection Strategy 6085 + - **Manual Curation**: Duplicates identified and managed manually via CLI 6086 + - **Storage**: `duplicates.json` file in Git root maps entry IDs to canonical entries 6087 + - **Structure**: `{"duplicate_id": "canonical_id", ...}` 6088 + - **CLI Commands**: Add/remove duplicate mappings with validation 6089 + - **Query Resolution**: Search/list commands resolve duplicates to canonical entries 6090 + 6091 + ### Duplicate File Format 6092 + ```json 6093 + { 6094 + "https://example.com/feed/entry/123": "https://canonical.com/posts/same-post", 6095 + "https://mirror.com/articles/456": "https://canonical.com/posts/same-post", 6096 + "comment": "Entry IDs that map to the same canonical content" 6097 + } 6098 + ``` 6099 + 6100 + ## Feed Metadata Auto-Discovery 6101 + 6102 + ### Extraction Strategy 6103 + When adding a new user with `thicket add user`, the system fetches and parses the feed to extract: 6104 + 6105 + - **Display Name**: From `feed.title` or `feed.author.name` 6106 + - **Email**: 
From `feed.author.email` or `feed.managingEditor` 6107 + - **Homepage**: From `feed.link` or `feed.author.uri` 6108 + - **Icon**: From `feed.logo`, `feed.icon`, or `feed.image.url` 6109 + 6110 + ### Discovery Priority Order 6111 + 1. **Author Information**: Prefer `feed.author.*` fields (more specific to person) 6112 + 2. **Feed-Level**: Fall back to feed-level metadata 6113 + 3. **Manual Override**: CLI flags always take precedence over discovered values 6114 + 4. **Update Behavior**: Auto-discovery only runs during initial `add user`, not on sync 6115 + 6116 + ### Extracted Metadata Format 6117 + ```python 6118 + class FeedMetadata(BaseModel): 6119 + title: Optional[str] = None 6120 + author_name: Optional[str] = None 6121 + author_email: Optional[EmailStr] = None 6122 + author_uri: Optional[HttpUrl] = None 6123 + link: Optional[HttpUrl] = None 6124 + logo: Optional[HttpUrl] = None 6125 + icon: Optional[HttpUrl] = None 6126 + image_url: Optional[HttpUrl] = None 6127 + 6128 + def to_user_config(self, username: str, feed_url: HttpUrl) -> UserConfig: 6129 + """Convert discovered metadata to UserConfig with fallbacks""" 6130 + return UserConfig( 6131 + username=username, 6132 + feeds=[feed_url], 6133 + display_name=self.author_name or self.title, 6134 + email=self.author_email, 6135 + homepage=self.author_uri or self.link, 6136 + icon=self.logo or self.icon or self.image_url 6137 + ) 6138 + ``` 6139 + 6140 + ## Link Processing and Threading Architecture 6141 + 6142 + ### Overview 6143 + The thicket system implements a sophisticated link processing and threading system to create email-style threaded views of blog entries by tracking cross-references between different blogs. 6144 + 6145 + ### Link Processing Pipeline 6146 + 6147 + #### 1. 
Link Extraction (`thicket links`) 6148 + The `links` command systematically extracts all outbound links from blog entries and categorizes them: 6149 + 6150 + ```python 6151 + class LinkData(BaseModel): 6152 + url: str # Fully resolved URL 6153 + entry_id: str # Source entry ID 6154 + username: str # Source username 6155 + context: str # Surrounding text context 6156 + category: str # "internal", "user", or "unknown" 6157 + target_username: Optional[str] # Target user if applicable 6158 + ``` 6159 + 6160 + **Link Categories:** 6161 + - **Internal**: Links to the same user's domain (self-references) 6162 + - **User**: Links to other tracked users' domains 6163 + - **Unknown**: Links to external sites not tracked by thicket 6164 + 6165 + #### 2. URL Resolution 6166 + All links are properly resolved using the Atom feed's base URL to handle: 6167 + - Relative URLs (converted to absolute) 6168 + - Protocol-relative URLs 6169 + - Fragment identifiers 6170 + - Redirects and canonical URLs 6171 + 6172 + #### 3. Domain Mapping 6173 + The system builds a comprehensive domain mapping from user configuration: 6174 + - Feed URLs → domain extraction 6175 + - Homepage URLs → domain extraction 6176 + - Reverse mapping: domain → username 6177 + 6178 + ### Threading System 6179 + 6180 + #### 1. Reference Index Generation (`thicket index`) 6181 + Creates a bidirectional reference index from the categorized links: 6182 + 6183 + ```python 6184 + class BlogReference(BaseModel): 6185 + source_entry_id: str 6186 + source_username: str 6187 + target_url: str 6188 + target_username: Optional[str] 6189 + target_entry_id: Optional[str] 6190 + context: str 6191 + ``` 6192 + 6193 + #### 2. 
Thread Detection Algorithm 6194 + Uses graph traversal to find connected blog entries: 6195 + - **Outbound references**: Links from an entry to other entries 6196 + - **Inbound references**: Links to an entry from other entries 6197 + - **Thread members**: All entries connected through references 6198 + 6199 + #### 3. Threading Display (`thicket threads`) 6200 + Creates email-style threaded views: 6201 + - Chronological ordering within threads 6202 + - Reference counts (outbound/inbound) 6203 + - Context preservation 6204 + - Filtering options (user, entry, minimum size) 6205 + 6206 + ### Data Structures 6207 + 6208 + #### links.json Format (Unified Structure) 6209 + ```json 6210 + { 6211 + "links": { 6212 + "https://example.com/post/123": { 6213 + "referencing_entries": ["https://blog.user.com/entry/456"], 6214 + "target_username": "user2" 6215 + }, 6216 + "https://external-site.com/article": { 6217 + "referencing_entries": ["https://blog.user.com/entry/789"] 6218 + } 6219 + }, 6220 + "reverse_mapping": { 6221 + "https://blog.user.com/entry/456": ["https://example.com/post/123"], 6222 + "https://blog.user.com/entry/789": ["https://external-site.com/article"] 6223 + }, 6224 + "references": [ 6225 + { 6226 + "source_entry_id": "https://blog.user.com/entry/456", 6227 + "source_username": "user1", 6228 + "target_url": "https://example.com/post/123", 6229 + "target_username": "user2", 6230 + "target_entry_id": "https://example.com/post/123", 6231 + "context": "As mentioned in this post..." 
6232 + } 6233 + ], 6234 + "user_domains": { 6235 + "user1": ["blog.user.com"], 6236 + "user2": ["example.com"] 6237 + } 6238 + } 6239 + ``` 6240 + 6241 + This unified structure eliminates duplication by: 6242 + - Storing each URL only once with minimal metadata 6243 + - Including all link data, reference data, and mappings in one file 6244 + - Using presence of `target_username` to identify tracked vs external links 6245 + - Providing bidirectional mappings for efficient queries 6246 + 6247 + ### Unified Structure Benefits 6248 + 6249 + - **Eliminates Duplication**: Each URL appears only once with metadata 6250 + - **Single Source of Truth**: All link-related data in one file 6251 + - **Efficient Queries**: Fast lookups for both directions (URL→entries, entry→URLs) 6252 + - **Atomic Updates**: All link data changes together 6253 + - **Reduced I/O**: Fewer file operations 6254 + 6255 + ### Implementation Benefits 6256 + 6257 + 1. **Systematic Link Processing**: All links are extracted and categorized consistently 6258 + 2. **Proper URL Resolution**: Handles relative URLs and base URL resolution correctly 6259 + 3. **Domain-based Categorization**: Automatically identifies user-to-user references 6260 + 4. **Bidirectional Indexing**: Supports both "who links to whom" and "who is linked by whom" 6261 + 5. **Thread Discovery**: Finds conversation threads automatically 6262 + 6. **Rich Context**: Preserves surrounding text for each link 6263 + 7. 
**Performance**: Pre-computed indexes for fast threading queries 6264 + 6265 + ### CLI Commands 6266 + 6267 + ```bash 6268 + # Extract and categorize all links 6269 + thicket links --verbose 6270 + 6271 + # Build reference index for threading 6272 + thicket index --verbose 6273 + 6274 + # Show all conversation threads 6275 + thicket threads 6276 + 6277 + # Show threads for specific user 6278 + thicket threads --username user1 6279 + 6280 + # Show threads with minimum size 6281 + thicket threads --min-size 3 6282 + ``` 6283 + 6284 + ### Integration with Existing Commands 6285 + 6286 + The link processing system integrates seamlessly with existing thicket commands: 6287 + - `thicket sync` updates entries, requiring `thicket links` to be run afterward 6288 + - `thicket index` uses the output from `thicket links` for improved accuracy 6289 + - `thicket threads` provides the user-facing threading interface 6290 + 6291 + ## Current Implementation Status 6292 + 6293 + ### ✅ Completed Features 6294 + 1. **Core Infrastructure** 6295 + - Modern CLI with Typer and Rich 6296 + - Pydantic data models for type safety 6297 + - Git repository operations with GitPython 6298 + - Feed parsing and normalization with feedparser 6299 + 6300 + 2. **User and Feed Management** 6301 + - `thicket init` - Initialize git store 6302 + - `thicket add` - Add users and feeds with auto-discovery 6303 + - `thicket sync` - Sync feeds with progress tracking 6304 + - `thicket list` - List users, feeds, and entries 6305 + - `thicket duplicates` - Manage duplicate entries 6306 + 6307 + 3. 
**Link Processing and Threading** 6308 + - `thicket links` - Extract and categorize all outbound links 6309 + - `thicket index` - Build reference index from links 6310 + - `thicket threads` - Display threaded conversation views 6311 + - Proper URL resolution with base URL handling 6312 + - Domain-based link categorization 6313 + - Context preservation for links 6314 + 6315 + ### 📊 System Performance 6316 + - **Link Extraction**: Successfully processes thousands of blog entries 6317 + - **Categorization**: Identifies internal, user, and unknown links 6318 + - **Threading**: Creates email-style threaded views of conversations 6319 + - **Storage**: Efficient JSON-based data structures for links and references 6320 + 6321 + ### 🔧 Current Architecture Highlights 6322 + - **Modular Design**: Clear separation between CLI, core logic, and models 6323 + - **Type Safety**: Comprehensive Pydantic models for data validation 6324 + - **Rich CLI**: Beautiful progress bars, tables, and error handling 6325 + - **Extensible**: Easy to add new commands and features 6326 + - **Git Integration**: All data stored in version-controlled JSON files 6327 + 6328 + ### 🎯 Proven Functionality 6329 + The system has been tested with real blog data and successfully: 6330 + - Extracted 14,396 total links from blog entries 6331 + - Categorized 3,994 internal links, 363 user-to-user links, and 10,039 unknown links 6332 + - Built comprehensive domain mappings for 16 users across 20 domains 6333 + - Generated threaded views showing blog conversation patterns 6334 + 6335 + ### 🚀 Ready for Use 6336 + The thicket system is now fully functional for: 6337 + - Maintaining Git repositories of blog feeds 6338 + - Tracking cross-references between blogs 6339 + - Creating threaded views of blog conversations 6340 + - Discovering blog interaction patterns 6341 + - Building distributed comment systems 6342 + </file> 6343 + 6344 + <file path="src/thicket/cli/utils.py"> 6345 + """CLI utilities and helpers.""" 6346 
+ 6347 + from pathlib import Path 6348 + from typing import Optional 6349 + 6350 + import typer 6351 + from rich.console import Console 6352 + from rich.progress import Progress, SpinnerColumn, TextColumn 6353 + from rich.table import Table 6354 + 6355 + from ..models import ThicketConfig, UserMetadata 6356 + from ..core.git_store import GitStore 6357 + 6358 + console = Console() 6359 + 6360 + 6361 + def get_tsv_mode() -> bool: 6362 + """Get the global TSV mode setting.""" 6363 + from .main import tsv_mode 6364 + return tsv_mode 6365 + 6366 + 6367 + def load_config(config_path: Optional[Path] = None) -> ThicketConfig: 6368 + """Load thicket configuration from file or environment.""" 6369 + if config_path and config_path.exists(): 6370 + import yaml 6371 + 6372 + with open(config_path) as f: 6373 + config_data = yaml.safe_load(f) 6374 + 6375 + # Convert to ThicketConfig 6376 + return ThicketConfig(**config_data) 6377 + 6378 + # Try to load from default locations or environment 6379 + try: 6380 + # First try to find thicket.yaml in current directory 6381 + default_config = Path("thicket.yaml") 6382 + if default_config.exists(): 6383 + import yaml 6384 + with open(default_config) as f: 6385 + config_data = yaml.safe_load(f) 6386 + return ThicketConfig(**config_data) 6387 + 6388 + # Fall back to environment variables 6389 + return ThicketConfig() 6390 + except Exception as e: 6391 + console.print(f"[red]Error loading configuration: {e}[/red]") 6392 + console.print("[yellow]Run 'thicket init' to create a new configuration.[/yellow]") 6393 + raise typer.Exit(1) from e 6394 + 6395 + 6396 + def save_config(config: ThicketConfig, config_path: Path) -> None: 6397 + """Save thicket configuration to file.""" 6398 + import yaml 6399 + 6400 + config_data = config.model_dump(mode="json", exclude_none=True) 6401 + 6402 + # Convert Path objects to strings for YAML serialization 6403 + config_data["git_store"] = str(config_data["git_store"]) 6404 + config_data["cache_dir"] = 
str(config_data["cache_dir"]) 6405 + 6406 + with open(config_path, "w") as f: 6407 + yaml.dump(config_data, f, default_flow_style=False, sort_keys=False) 6408 + 6409 + 6410 + def create_progress() -> Progress: 6411 + """Create a Rich progress display.""" 6412 + return Progress( 6413 + SpinnerColumn(), 6414 + TextColumn("[progress.description]{task.description}"), 6415 + console=console, 6416 + transient=True, 6417 + ) 6418 + 6419 + 6420 + def print_users_table(config: ThicketConfig) -> None: 6421 + """Print a table of users and their feeds.""" 6422 + if get_tsv_mode(): 6423 + print_users_tsv(config) 6424 + return 6425 + 6426 + table = Table(title="Users and Feeds") 6427 + table.add_column("Username", style="cyan", no_wrap=True) 6428 + table.add_column("Display Name", style="magenta") 6429 + table.add_column("Email", style="blue") 6430 + table.add_column("Homepage", style="green") 6431 + table.add_column("Feeds", style="yellow") 6432 + 6433 + for user in config.users: 6434 + feeds_str = "\n".join(str(feed) for feed in user.feeds) 6435 + table.add_row( 6436 + user.username, 6437 + user.display_name or "", 6438 + user.email or "", 6439 + str(user.homepage) if user.homepage else "", 6440 + feeds_str, 6441 + ) 6442 + 6443 + console.print(table) 6444 + 6445 + 6446 + def print_feeds_table(config: ThicketConfig, username: Optional[str] = None) -> None: 6447 + """Print a table of feeds, optionally filtered by username.""" 6448 + if get_tsv_mode(): 6449 + print_feeds_tsv(config, username) 6450 + return 6451 + 6452 + table = Table(title=f"Feeds{f' for {username}' if username else ''}") 6453 + table.add_column("Username", style="cyan", no_wrap=True) 6454 + table.add_column("Feed URL", style="blue") 6455 + table.add_column("Status", style="green") 6456 + 6457 + users = [config.find_user(username)] if username else config.users 6458 + users = [u for u in users if u is not None] 6459 + 6460 + for user in users: 6461 + for feed in user.feeds: 6462 + table.add_row( 6463 + 
user.username, 6464 + str(feed), 6465 + "Active", # TODO: Add actual status checking 6466 + ) 6467 + 6468 + console.print(table) 6469 + 6470 + 6471 + def confirm_action(message: str, default: bool = False) -> bool: 6472 + """Prompt for confirmation.""" 6473 + return typer.confirm(message, default=default) 6474 + 6475 + 6476 + def print_success(message: str) -> None: 6477 + """Print a success message.""" 6478 + console.print(f"[green]✓[/green] {message}") 6479 + 6480 + 6481 + def print_error(message: str) -> None: 6482 + """Print an error message.""" 6483 + console.print(f"[red]✗[/red] {message}") 6484 + 6485 + 6486 + def print_warning(message: str) -> None: 6487 + """Print a warning message.""" 6488 + console.print(f"[yellow]⚠[/yellow] {message}") 6489 + 6490 + 6491 + def print_info(message: str) -> None: 6492 + """Print an info message.""" 6493 + console.print(f"[blue]ℹ[/blue] {message}") 6494 + 6495 + 6496 + def print_users_table_from_git(users: list[UserMetadata]) -> None: 6497 + """Print a table of users from git repository.""" 6498 + if get_tsv_mode(): 6499 + print_users_tsv_from_git(users) 6500 + return 6501 + 6502 + table = Table(title="Users and Feeds") 6503 + table.add_column("Username", style="cyan", no_wrap=True) 6504 + table.add_column("Display Name", style="magenta") 6505 + table.add_column("Email", style="blue") 6506 + table.add_column("Homepage", style="green") 6507 + table.add_column("Feeds", style="yellow") 6508 + 6509 + for user in users: 6510 + feeds_str = "\n".join(user.feeds) 6511 + table.add_row( 6512 + user.username, 6513 + user.display_name or "", 6514 + user.email or "", 6515 + user.homepage or "", 6516 + feeds_str, 6517 + ) 6518 + 6519 + console.print(table) 6520 + 6521 + 6522 + def print_feeds_table_from_git(git_store: GitStore, username: Optional[str] = None) -> None: 6523 + """Print a table of feeds from git repository.""" 6524 + if get_tsv_mode(): 6525 + print_feeds_tsv_from_git(git_store, username) 6526 + return 6527 + 6528 + table = 
Table(title=f"Feeds{f' for {username}' if username else ''}") 6529 + table.add_column("Username", style="cyan", no_wrap=True) 6530 + table.add_column("Feed URL", style="blue") 6531 + table.add_column("Status", style="green") 6532 + 6533 + if username: 6534 + user = git_store.get_user(username) 6535 + users = [user] if user else [] 6536 + else: 6537 + index = git_store._load_index() 6538 + users = list(index.users.values()) 6539 + 6540 + for user in users: 6541 + for feed in user.feeds: 6542 + table.add_row( 6543 + user.username, 6544 + feed, 6545 + "Active", # TODO: Add actual status checking 6546 + ) 6547 + 6548 + console.print(table) 6549 + 6550 + 6551 + def print_users_tsv(config: ThicketConfig) -> None: 6552 + """Print users in TSV format.""" 6553 + print("Username\tDisplay Name\tEmail\tHomepage\tFeeds") 6554 + for user in config.users: 6555 + feeds_str = ",".join(str(feed) for feed in user.feeds) 6556 + print(f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}") 6557 + 6558 + 6559 + def print_users_tsv_from_git(users: list[UserMetadata]) -> None: 6560 + """Print users from git repository in TSV format.""" 6561 + print("Username\tDisplay Name\tEmail\tHomepage\tFeeds") 6562 + for user in users: 6563 + feeds_str = ",".join(user.feeds) 6564 + print(f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}") 6565 + 6566 + 6567 + def print_feeds_tsv(config: ThicketConfig, username: Optional[str] = None) -> None: 6568 + """Print feeds in TSV format.""" 6569 + print("Username\tFeed URL\tStatus") 6570 + users = [config.find_user(username)] if username else config.users 6571 + users = [u for u in users if u is not None] 6572 + 6573 + for user in users: 6574 + for feed in user.feeds: 6575 + print(f"{user.username}\t{feed}\tActive") 6576 + 6577 + 6578 + def print_feeds_tsv_from_git(git_store: GitStore, username: Optional[str] = None) -> None: 6579 + """Print feeds from git 
repository in TSV format.""" 6580 + print("Username\tFeed URL\tStatus") 6581 + 6582 + if username: 6583 + user = git_store.get_user(username) 6584 + users = [user] if user else [] 6585 + else: 6586 + index = git_store._load_index() 6587 + users = list(index.users.values()) 6588 + 6589 + for user in users: 6590 + for feed in user.feeds: 6591 + print(f"{user.username}\t{feed}\tActive") 6592 + 6593 + 6594 + def print_entries_tsv(entries_by_user: list[list], usernames: list[str]) -> None: 6595 + """Print entries in TSV format.""" 6596 + print("User\tAtom ID\tTitle\tUpdated\tURL") 6597 + 6598 + # Combine all entries with usernames 6599 + all_entries = [] 6600 + for entries, username in zip(entries_by_user, usernames): 6601 + for entry in entries: 6602 + all_entries.append((username, entry)) 6603 + 6604 + # Sort by updated time (newest first) 6605 + all_entries.sort(key=lambda x: x[1].updated, reverse=True) 6606 + 6607 + for username, entry in all_entries: 6608 + # Format updated time 6609 + updated_str = entry.updated.strftime("%Y-%m-%d %H:%M") 6610 + 6611 + # Escape tabs and newlines in title to preserve TSV format 6612 + title = entry.title.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ') 6613 + 6614 + print(f"{username}\t{entry.id}\t{title}\t{updated_str}\t{entry.link}") 6615 + </file> 6616 + 6617 + </files>
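Review note: the architecture notes added in this diff describe thread detection as graph traversal over inbound and outbound references. A minimal sketch of that idea follows, assuming each reference has already been reduced to a `(source_entry_id, target_entry_id)` pair. The function name `find_threads` and its `min_size` parameter are hypothetical (the parameter only mirrors the `--min-size` flag) and are not taken from thicket's code.

```python
from collections import defaultdict, deque


def find_threads(references, min_size=2):
    """Group entry IDs into threads (connected components of the reference graph)."""
    # Treat every reference as an undirected edge, so inbound and outbound
    # links both pull an entry into the same thread.
    neighbours = defaultdict(set)
    for source_id, target_id in references:
        neighbours[source_id].add(target_id)
        neighbours[target_id].add(source_id)

    seen = set()
    threads = []
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first traversal collects everything reachable from `start`.
        queue = deque([start])
        seen.add(start)
        members = []
        while queue:
            entry_id = queue.popleft()
            members.append(entry_id)
            for nxt in neighbours[entry_id]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if len(members) >= min_size:
            # Sorted by ID only to keep the output stable; a real view
            # would order members chronologically.
            threads.append(sorted(members))
    return threads


# Illustrative data only: one chain of three entries and one isolated pair.
references = [
    ("https://blog.user.com/entry/456", "https://example.com/post/123"),
    ("https://example.com/post/123", "https://other.example/post/9"),
    ("https://solo.example/a", "https://solo.example/b"),
]
threads = find_threads(references, min_size=3)
```

With `min_size=3` the isolated pair is filtered out and only the three-entry chain remains. Sorting by entry ID is a simplification: the notes call for chronological ordering within a thread, which would need each entry's `updated` timestamp.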

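Review note: the domain mapping and link categories described in the architecture notes (internal, user, unknown) can be illustrated with the `user_domains` shape shown in the `links.json` example. This is a sketch under stated assumptions: `build_domain_map` and `categorize_link` are hypothetical helper names, and matching is done on the exact hostname, which may differ from how thicket actually compares domains.

```python
from urllib.parse import urlparse


def build_domain_map(user_domains):
    """Invert {username: [domains]} into the reverse mapping {domain: username}."""
    return {
        domain.lower(): username
        for username, domains in user_domains.items()
        for domain in domains
    }


def categorize_link(url, source_username, domain_to_user):
    """Return (category, target_username) for one outbound link."""
    host = (urlparse(url).hostname or "").lower()
    target = domain_to_user.get(host)
    if target is None:
        # Host is not owned by any tracked user.
        return "unknown", None
    if target == source_username:
        # The author is linking to their own domain.
        return "internal", target
    return "user", target


# Same shape as the "user_domains" block in the links.json example.
user_domains = {"user1": ["blog.user.com"], "user2": ["example.com"]}
domain_to_user = build_domain_map(user_domains)
result = categorize_link("https://example.com/post/123", "user1", domain_to_user)
```

Here `result` is `("user", "user2")`, matching the `target_username` recorded for that URL in the example file. Exact-host matching treats `www.example.com` and `example.com` as different domains; whether that is desirable is a design choice the notes do not settle.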
+5 -1

src/thicket/__init__.py

···
- """Thicket: A CLI tool for persisting Atom/RSS feeds in Git repositories."""
+ """Thicket - A library for managing feed repositories and static site generation."""

+ from .thicket import Thicket
+ from .models import AtomEntry, UserConfig, ThicketConfig
+
+ __all__ = ["Thicket", "AtomEntry", "UserConfig", "ThicketConfig"]
  __version__ = "0.1.0"
  __author__ = "thicket"
  __email__ = "thicket@example.com"

-5

src/thicket/bots/__init__.py

···
- """Zulip bot integration for thicket."""
-
- from .thicket_bot import ThicketBotHandler
-
- __all__ = ["ThicketBotHandler"]

-7

src/thicket/bots/requirements.txt

···
- # Requirements for Thicket Zulip bot
- # These are already included in the main thicket package
- pydantic>=2.11.0
- GitPython>=3.1.40
- feedparser>=6.0.11
- httpx>=0.28.0
- pyyaml>=6.0.0

-201

src/thicket/bots/test_bot.py

··· 1 - """Test utilities for the Thicket Zulip bot.""" 2 - 3 - import json 4 - from pathlib import Path 5 - from typing import Any, Optional 6 - 7 - from ..models import AtomEntry 8 - from .thicket_bot import ThicketBotHandler 9 - 10 - 11 - class MockBotHandler: 12 - """Mock BotHandler for testing the Thicket bot.""" 13 - 14 - def __init__(self) -> None: 15 - """Initialize mock bot handler.""" 16 - self.storage_data: dict[str, str] = {} 17 - self.sent_messages: list[dict[str, Any]] = [] 18 - self.config_info = { 19 - "full_name": "Thicket Bot", 20 - "email": "thicket-bot@example.com", 21 - } 22 - 23 - def get_config_info(self) -> dict[str, str]: 24 - """Return bot configuration info.""" 25 - return self.config_info 26 - 27 - def send_reply(self, message: dict[str, Any], content: str) -> None: 28 - """Mock sending a reply.""" 29 - reply = { 30 - "type": "reply", 31 - "to": message.get("sender_id"), 32 - "content": content, 33 - "original_message": message, 34 - } 35 - self.sent_messages.append(reply) 36 - 37 - def send_message(self, message: dict[str, Any]) -> None: 38 - """Mock sending a message.""" 39 - self.sent_messages.append(message) 40 - 41 - @property 42 - def storage(self) -> "MockStorage": 43 - """Return mock storage.""" 44 - return MockStorage(self.storage_data) 45 - 46 - 47 - class MockStorage: 48 - """Mock storage for bot state.""" 49 - 50 - def __init__(self, storage_data: dict[str, str]) -> None: 51 - """Initialize with storage data.""" 52 - self.storage_data = storage_data 53 - 54 - def __enter__(self) -> "MockStorage": 55 - """Context manager entry.""" 56 - return self 57 - 58 - def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None: 59 - """Context manager exit.""" 60 - pass 61 - 62 - def get(self, key: str) -> Optional[str]: 63 - """Get value from storage.""" 64 - return self.storage_data.get(key) 65 - 66 - def put(self, key: str, value: str) -> None: 67 - """Put value in storage.""" 68 - self.storage_data[key] = value 69 - 70 - def 
contains(self, key: str) -> bool: 71 - """Check if key exists in storage.""" 72 - return key in self.storage_data 73 - 74 - 75 - def create_test_message( 76 - content: str, 77 - sender: str = "Test User", 78 - sender_id: int = 12345, 79 - message_type: str = "stream", 80 - ) -> dict[str, Any]: 81 - """Create a test message for bot testing.""" 82 - return { 83 - "content": content, 84 - "sender_full_name": sender, 85 - "sender_id": sender_id, 86 - "type": message_type, 87 - "timestamp": 1642694400, # 2022-01-20 12:00:00 UTC 88 - "stream_id": 1, 89 - "subject": "test topic", 90 - } 91 - 92 - 93 - def create_test_entry( 94 - entry_id: str = "test-entry-1", 95 - title: str = "Test Article", 96 - link: str = "https://example.com/test-article", 97 - ) -> AtomEntry: 98 - """Create a test AtomEntry for testing.""" 99 - from datetime import datetime 100 - 101 - from pydantic import HttpUrl 102 - 103 - return AtomEntry( 104 - id=entry_id, 105 - title=title, 106 - link=HttpUrl(link), 107 - updated=datetime(2024, 1, 20, 12, 0, 0), 108 - published=datetime(2024, 1, 20, 10, 0, 0), 109 - summary="This is a test article summary", 110 - content="<p>This is test article content</p>", 111 - author={"name": "Test Author", "email": "author@example.com"}, 112 - ) 113 - 114 - 115 - class BotTester: 116 - """Helper class for testing bot functionality.""" 117 - 118 - def __init__(self, config_path: Optional[Path] = None) -> None: 119 - """Initialize bot tester.""" 120 - self.bot = ThicketBotHandler() 121 - self.handler = MockBotHandler() 122 - 123 - if config_path: 124 - # Configure bot with test config 125 - self.configure_bot(config_path, "test-stream", "test-topic") 126 - 127 - def configure_bot( 128 - self, config_path: Path, stream: str = "test-stream", topic: str = "test-topic" 129 - ) -> None: 130 - """Configure the bot for testing.""" 131 - # Set bot configuration 132 - config_data = { 133 - "stream_name": stream, 134 - "topic_name": topic, 135 - "sync_interval": 300, 136 - 
"max_entries_per_sync": 10, 137 - "config_path": str(config_path), 138 - } 139 - 140 - self.handler.storage_data["bot_config"] = json.dumps(config_data) 141 - 142 - # Initialize bot 143 - self.bot._load_bot_config(self.handler) 144 - 145 - def send_command( 146 - self, command: str, sender: str = "Test User" 147 - ) -> list[dict[str, Any]]: 148 - """Send a command to the bot and return responses.""" 149 - message = create_test_message(f"@thicket {command}", sender) 150 - 151 - # Clear previous messages 152 - self.handler.sent_messages.clear() 153 - 154 - # Send command 155 - self.bot.handle_message(message, self.handler) 156 - 157 - return self.handler.sent_messages.copy() 158 - 159 - def get_last_response_content(self) -> Optional[str]: 160 - """Get the content of the last bot response.""" 161 - if self.handler.sent_messages: 162 - return self.handler.sent_messages[-1].get("content") 163 - return None 164 - 165 - def get_last_message(self) -> Optional[dict[str, Any]]: 166 - """Get the last sent message.""" 167 - if self.handler.sent_messages: 168 - return self.handler.sent_messages[-1] 169 - return None 170 - 171 - def assert_response_contains(self, text: str) -> None: 172 - """Assert that the last response contains specific text.""" 173 - content = self.get_last_response_content() 174 - assert content is not None, "No response received" 175 - assert text in content, f"Response does not contain '{text}': {content}" 176 - 177 - 178 - # Example usage for testing 179 - if __name__ == "__main__": 180 - # Create a test config file 181 - test_config = Path("/tmp/test_thicket.yaml") 182 - 183 - # Create bot tester 184 - tester = BotTester() 185 - 186 - # Test help command 187 - responses = tester.send_command("help") 188 - print(f"Help response: {tester.get_last_response_content()}") 189 - 190 - # Test status command 191 - responses = tester.send_command("status") 192 - print(f"Status response: {tester.get_last_response_content()}") 193 - 194 - # Test configuration 195 - 
-    responses = tester.send_command("config stream general")
-    tester.assert_response_contains("Stream set to")
-
-    responses = tester.send_command("config topic 'Feed Updates'")
-    tester.assert_response_contains("Topic set to")
-
-    print("All tests passed!")
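
Review note: the harness removed above works by swapping the real Zulip handler for one that records replies in `sent_messages`, then asserting on the last recorded reply. A minimal sketch of that pattern follows, for anyone who wants to keep it after the bot is gone. `FakeBotHandler`, `StreamConfigBot` and this standalone `send_command` are hypothetical names made up for illustration; they are not part of thicket or `zulip_bots`, and the toy bot understands only `config stream <name>`.

```python
from typing import Any, Optional


class FakeBotHandler:
    """Hypothetical stand-in for a bot handler: records replies instead of sending them."""

    def __init__(self) -> None:
        self.sent_messages: list[dict[str, Any]] = []

    def send_reply(self, message: dict[str, Any], response: str) -> None:
        # Capture the reply so a test can inspect it afterwards.
        self.sent_messages.append({"content": response, "in_reply_to": message})


class StreamConfigBot:
    """Toy bot (not the thicket bot) that handles only `config stream <name>`."""

    def __init__(self) -> None:
        self.stream_name: Optional[str] = None

    def handle_message(self, message: dict[str, Any], bot_handler: FakeBotHandler) -> None:
        # Strip the mention, then split the remainder into command words.
        parts = message["content"].replace("@thicket", "").split()
        if len(parts) >= 3 and parts[:2] == ["config", "stream"]:
            self.stream_name = " ".join(parts[2:])
            bot_handler.send_reply(message, f"Stream set to: {self.stream_name}")
        else:
            bot_handler.send_reply(message, "Unknown command")


def send_command(
    bot: StreamConfigBot,
    handler: FakeBotHandler,
    command: str,
    sender: str = "Test User",
) -> list[dict[str, Any]]:
    """Send one command and return only the replies it produced."""
    handler.sent_messages.clear()
    message = {"content": f"@thicket {command}", "sender_full_name": sender}
    bot.handle_message(message, handler)
    return handler.sent_messages.copy()
```

Clearing `sent_messages` before each command, as the removed `BotTester.send_command` did, keeps each assertion scoped to a single command rather than to the whole session.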

-1257

src/thicket/bots/thicket_bot.py

···
-"""Zulip bot for automatically posting thicket feed updates."""
-
-import asyncio
-import json
-import logging
-import os
-import time
-from pathlib import Path
-from typing import Any, Optional
-
-from zulip_bots.lib import BotHandler
-
-# Handle imports for both direct execution and package import
-try:
-    from ..cli.commands.sync import sync_feed
-    from ..core.git_store import GitStore
-    from ..models import AtomEntry, ThicketConfig
-except ImportError:
-    # When run directly by zulip-bots, add the package to path
-    import sys
-
-    src_dir = Path(__file__).parent.parent.parent
-    if str(src_dir) not in sys.path:
-        sys.path.insert(0, str(src_dir))
-
-    from thicket.cli.commands.sync import sync_feed
-    from thicket.core.git_store import GitStore
-    from thicket.models import AtomEntry, ThicketConfig
-
-
-class ThicketBotHandler:
-    """Zulip bot that monitors thicket feeds and posts new articles."""
-
-    def __init__(self) -> None:
-        """Initialize the thicket bot."""
-        self.logger = logging.getLogger(__name__)
-        self.git_store: Optional[GitStore] = None
-        self.config: Optional[ThicketConfig] = None
-        self.posted_entries: set[str] = set()
-
-        # Bot configuration from storage
-        self.stream_name: Optional[str] = None
-        self.topic_name: Optional[str] = None
-        self.sync_interval: int = 300  # 5 minutes default
-        self.max_entries_per_sync: int = 10
-        self.config_path: Optional[Path] = None
-
-        # Bot behavior settings (loaded from botrc)
-        self.rate_limit_delay: int = 5
-        self.posts_per_batch: int = 5
-        self.catchup_entries: int = 5
-        self.config_change_notifications: bool = True
-        self.username_claim_notifications: bool = True
-
-        # Track last sync time for schedule queries
-        self.last_sync_time: Optional[float] = None
-
-        # Debug mode configuration
-        self.debug_user: Optional[str] = None
-        self.debug_zulip_user_id: Optional[str] = None
-
-    def usage(self) -> str:
-        """Return bot usage instructions."""
-        return """
-**Thicket Feed Bot**
-
-This bot automatically monitors thicket feeds and posts new articles.
-
-Commands:
-- `@mention status` - Show current bot status and configuration
-- `@mention sync now` - Force an immediate sync
-- `@mention reset` - Clear posting history (will repost recent entries)
-- `@mention config stream <stream_name>` - Set target stream
-- `@mention config topic <topic_name>` - Set target topic
-- `@mention config interval <seconds>` - Set sync interval
-- `@mention schedule` - Show sync schedule and next run time
-- `@mention claim <username>` - Claim a thicket username for your Zulip account
-- `@mention help` - Show this help message
-"""
-
-    def initialize(self, bot_handler: BotHandler) -> None:
-        """Initialize the bot with persistent storage."""
-        self.logger.info("Initializing ThicketBot")
-
-        # Get configuration from environment (set by CLI)
-        self.debug_user = os.getenv("THICKET_DEBUG_USER")
-        config_path_env = os.getenv("THICKET_CONFIG_PATH")
-        if config_path_env:
-            self.config_path = Path(config_path_env)
-            self.logger.info(f"Using thicket config: {self.config_path}")
-
-        # Load default configuration from botrc file
-        self._load_botrc_defaults()
-
-        # Load bot configuration from persistent storage
-        self._load_bot_config(bot_handler)
-
-        # Initialize thicket components
-        if self.config_path:
-            try:
-                self._initialize_thicket()
-                self._load_posted_entries(bot_handler)
-
-                # Validate debug mode if enabled
-                if self.debug_user:
-                    self._validate_debug_mode(bot_handler)
-
-            except Exception as e:
-                self.logger.error(f"Failed to initialize thicket: {e}")
-
-        # Start background sync loop
-        self._schedule_sync(bot_handler)
-
-    def handle_message(self, message: dict[str, Any], bot_handler: BotHandler) -> None:
-        """Handle incoming Zulip messages."""
-        content = message["content"].strip()
-        sender = message["sender_full_name"]
-
-        # Only respond to mentions
-        if not self._is_mentioned(content, bot_handler):
-            return
-
-        # Parse command
-        cleaned_content = self._clean_mention(content, bot_handler)
-        command_parts = cleaned_content.split()
-
-        if not command_parts:
-            self._send_help(message, bot_handler)
-            return
-
-        command = command_parts[0].lower()
-
-        try:
-            if command == "help":
-                self._send_help(message, bot_handler)
-            elif command == "status":
-                self._send_status(message, bot_handler, sender)
-            elif (
-                command == "sync"
-                and len(command_parts) > 1
-                and command_parts[1] == "now"
-            ):
-                self._handle_force_sync(message, bot_handler, sender)
-            elif command == "reset":
-                self._handle_reset_command(message, bot_handler, sender)
-            elif command == "config":
-                self._handle_config_command(
-                    message, bot_handler, command_parts[1:], sender
-                )
-            elif command == "schedule":
-                self._handle_schedule_command(message, bot_handler, sender)
-            elif command == "claim":
-                self._handle_claim_command(
-                    message, bot_handler, command_parts[1:], sender
-                )
-            else:
-                bot_handler.send_reply(
-                    message,
-                    f"Unknown command: {command}. Type `@mention help` for usage.",
-                )
-        except Exception as e:
-            self.logger.error(f"Error handling command '{command}': {e}")
-            bot_handler.send_reply(message, f"Error processing command: {str(e)}")
-
-    def _is_mentioned(self, content: str, bot_handler: BotHandler) -> bool:
-        """Check if the bot is mentioned in the message."""
-        try:
-            # Get bot's actual name from Zulip
-            bot_info = bot_handler._client.get_profile()
-            if bot_info.get("result") == "success":
-                bot_name = bot_info.get("full_name", "").lower()
-                if bot_name:
-                    return (
-                        f"@{bot_name}" in content.lower()
-                        or f"@**{bot_name}**" in content.lower()
-                    )
-        except Exception as e:
-            self.logger.debug(f"Could not get bot profile: {e}")
-
-        # Fallback to generic check
-        return "@thicket" in content.lower()
-
-    def _clean_mention(self, content: str, bot_handler: BotHandler) -> str:
-        """Remove bot mention from message content."""
-        import re
-
-        try:
-            # Get bot's actual name from Zulip
-            bot_info = bot_handler._client.get_profile()
-            if bot_info.get("result") == "success":
-                bot_name = bot_info.get("full_name", "")
-                if bot_name:
-                    # Remove @bot_name or @**bot_name**
-                    escaped_name = re.escape(bot_name)
-                    content = re.sub(
-                        rf"@(?:\*\*)?{escaped_name}(?:\*\*)?",
-                        "",
-                        content,
-                        flags=re.IGNORECASE,
-                    ).strip()
-                    return content
-        except Exception as e:
-            self.logger.debug(f"Could not get bot profile for mention cleaning: {e}")
-
-        # Fallback to removing @thicket
-        content = re.sub(
-            r"@(?:\*\*)?thicket(?:\*\*)?", "", content, flags=re.IGNORECASE
-        ).strip()
-        return content
-
-    def _send_help(self, message: dict[str, Any], bot_handler: BotHandler) -> None:
-        """Send help message."""
-        bot_handler.send_reply(message, self.usage())
-
-    def _send_status(
-        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
-    ) -> None:
-        """Send bot status information."""
-        status_lines = [
-            f"**Thicket Bot Status** (requested by {sender})",
-            "",
-        ]
-
-        # Debug mode status
-        if self.debug_user:
-            status_lines.extend(
-                [
-                    "🐛 **Debug Mode:** ENABLED",
-                    f"🎯 **Debug User:** {self.debug_user}",
-                    "",
-                ]
-            )
-        else:
-            status_lines.extend(
-                [
-                    f"📍 **Stream:** {self.stream_name or 'Not configured'}",
-                    f"📝 **Topic:** {self.topic_name or 'Not configured'}",
-                    "",
-                ]
-            )
-
-        status_lines.extend(
-            [
-                f"⏱️ **Sync Interval:** {self.sync_interval}s ({self.sync_interval // 60}m {self.sync_interval % 60}s)",
-                f"📊 **Max Entries/Sync:** {self.max_entries_per_sync}",
-                f"📁 **Config Path:** {self.config_path or 'Not configured'}",
-                "",
-                f"📄 **Tracked Entries:** {len(self.posted_entries)}",
-                f"🔄 **Catchup Mode:** {'Active (first run)' if len(self.posted_entries) == 0 else 'Inactive'}",
-                f"✅ **Thicket Initialized:** {'Yes' if self.git_store else 'No'}",
-                "",
-                self._get_schedule_info(),
-            ]
-        )
-
-        bot_handler.send_reply(message, "\n".join(status_lines))
-
-    def _handle_force_sync(
-        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
-    ) -> None:
-        """Handle immediate sync request."""
-        if not self._check_initialization(message, bot_handler):
-            return
-
-        bot_handler.send_reply(
-            message, f"🔄 Starting immediate sync... (requested by {sender})"
-        )
-
-        try:
-            new_entries = self._perform_sync(bot_handler)
-            bot_handler.send_reply(
-                message, f"✅ Sync completed! Found {len(new_entries)} new entries."
-            )
-        except Exception as e:
-            self.logger.error(f"Force sync failed: {e}")
-            bot_handler.send_reply(message, f"❌ Sync failed: {str(e)}")
-
-    def _handle_reset_command(
-        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
-    ) -> None:
-        """Handle reset command to clear posted entries tracking."""
-        try:
-            self.posted_entries.clear()
-            self._save_posted_entries(bot_handler)
-            bot_handler.send_reply(
-                message,
-                f"✅ Posting history reset! Recent entries will be posted on next sync. (requested by {sender})",
-            )
-            self.logger.info(f"Posted entries tracking reset by {sender}")
-        except Exception as e:
-            self.logger.error(f"Reset failed: {e}")
-            bot_handler.send_reply(message, f"❌ Reset failed: {str(e)}")
-
-    def _handle_schedule_command(
-        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
-    ) -> None:
-        """Handle schedule query command."""
-        schedule_info = self._get_schedule_info()
-        bot_handler.send_reply(
-            message,
-            f"**Thicket Bot Schedule** (requested by {sender})\n\n{schedule_info}",
-        )
-
-    def _handle_claim_command(
-        self,
-        message: dict[str, Any],
-        bot_handler: BotHandler,
-        args: list[str],
-        sender: str,
-    ) -> None:
-        """Handle username claiming command."""
-        if not args:
-            bot_handler.send_reply(message, "Usage: `@mention claim <username>`")
-            return
-
-        if not self._check_initialization(message, bot_handler):
-            return
-
-        username = args[0].strip()
-
-        # Get sender's Zulip user info
-        sender_user_id = message.get("sender_id")
-        sender_email = message.get("sender_email")
-
-        if not sender_user_id or not sender_email:
-            bot_handler.send_reply(
-                message, "❌ Could not determine your Zulip user information."
-            )
-            return
-
-        try:
-            # Get current Zulip server from environment
-            zulip_site_url = os.getenv("THICKET_ZULIP_SITE_URL", "")
-            server_url = zulip_site_url.replace("https://", "").replace("http://", "")
-
-            if not server_url:
-                bot_handler.send_reply(
-                    message, "❌ Could not determine Zulip server URL."
-                )
-                return
-
-            # Check if username exists in thicket
-            user = self.git_store.get_user(username)
-            if not user:
-                bot_handler.send_reply(
-                    message,
-                    f"❌ Username `{username}` not found in thicket. Available users: {', '.join(self.git_store.list_users())}",
-                )
-                return
-
-            # Check if username is already claimed for this server
-            existing_zulip_id = user.get_zulip_mention(server_url)
-            if existing_zulip_id:
-                # Check if it's claimed by the same user
-                if existing_zulip_id == sender_email or str(existing_zulip_id) == str(
-                    sender_user_id
-                ):
-                    bot_handler.send_reply(
-                        message,
-                        f"✅ Username `{username}` is already claimed by you on {server_url}!",
-                    )
-                else:
-                    bot_handler.send_reply(
-                        message,
-                        f"❌ Username `{username}` is already claimed by another user on {server_url}.",
-                    )
-                return
-
-            # Claim the username - prefer email for consistency
-            success = self.git_store.add_zulip_association(
-                username, server_url, sender_email
-            )
-
-            if success:
-                reply_msg = (
-                    f"🎉 Successfully claimed username `{username}` for **{sender}** on {server_url}!\n"
-                    + "You will now be mentioned when new articles are posted from this user's feeds."
-                )
-                bot_handler.send_reply(message, reply_msg)
-
-                # Send notification to configured stream if enabled and not in debug mode
-                if (
-                    self.username_claim_notifications
-                    and not self.debug_user
-                    and self.stream_name
-                    and self.topic_name
-                ):
-                    try:
-                        notification_msg = f"👋 **{sender}** claimed thicket username `{username}` on {server_url}"
-                        bot_handler.send_message(
-                            {
-                                "type": "stream",
-                                "to": self.stream_name,
-                                "subject": self.topic_name,
-                                "content": notification_msg,
-                            }
-                        )
-                    except Exception as e:
-                        self.logger.error(
-                            f"Failed to send username claim notification: {e}"
-                        )
-
-                self.logger.info(
-                    f"User {sender} ({sender_email}) claimed username {username} on {server_url}"
-                )
-            else:
-                bot_handler.send_reply(
-                    message,
-                    f"❌ Failed to claim username `{username}`. This shouldn't happen - please contact an administrator.",
-                )
-
-        except Exception as e:
-            self.logger.error(f"Error processing claim for {username} by {sender}: {e}")
-            bot_handler.send_reply(message, f"❌ Error processing claim: {str(e)}")
-
-    def _handle_config_command(
-        self,
-        message: dict[str, Any],
-        bot_handler: BotHandler,
-        args: list[str],
-        sender: str,
-    ) -> None:
-        """Handle configuration commands."""
-        if len(args) < 2:
-            bot_handler.send_reply(
-                message, "Usage: `@mention config <setting> <value>`"
-            )
-            return
-
-        setting = args[0].lower()
-        value = " ".join(args[1:])
-
-        if setting == "stream":
-            old_value = self.stream_name
-            self.stream_name = value
-            self._save_bot_config(bot_handler)
-            bot_handler.send_reply(
-                message, f"✅ Stream set to: **{value}** (by {sender})"
-            )
-            self._send_config_change_notification(
-                bot_handler, sender, "stream", old_value, value
-            )
-
-        elif setting == "topic":
-            old_value = self.topic_name
-            self.topic_name = value
-            self._save_bot_config(bot_handler)
-            bot_handler.send_reply(
-                message, f"✅ Topic set to: **{value}** (by {sender})"
-            )
-            self._send_config_change_notification(
-                bot_handler, sender, "topic", old_value, value
-            )
-
-        elif setting == "interval":
-            try:
-                interval = int(value)
-                if interval < 60:
-                    bot_handler.send_reply(
-                        message, "❌ Interval must be at least 60 seconds"
-                    )
-                    return
-                old_value = self.sync_interval
-                self.sync_interval = interval
-                self._save_bot_config(bot_handler)
-                bot_handler.send_reply(
-                    message, f"✅ Sync interval set to: **{interval}s** (by {sender})"
-                )
-                self._send_config_change_notification(
-                    bot_handler,
-                    sender,
-                    "sync interval",
-                    f"{old_value}s",
-                    f"{interval}s",
-                )
-            except ValueError:
-                bot_handler.send_reply(
-                    message, "❌ Invalid interval value. Must be a number of seconds."
-                )
-
-        elif setting == "max_entries":
-            try:
-                max_entries = int(value)
-                if max_entries < 1 or max_entries > 50:
-                    bot_handler.send_reply(
-                        message, "❌ Max entries must be between 1 and 50"
-                    )
-                    return
-                old_value = self.max_entries_per_sync
-                self.max_entries_per_sync = max_entries
-                self._save_bot_config(bot_handler)
-                bot_handler.send_reply(
-                    message,
-                    f"✅ Max entries per sync set to: **{max_entries}** (by {sender})",
-                )
-                self._send_config_change_notification(
-                    bot_handler,
-                    sender,
-                    "max entries per sync",
-                    str(old_value),
-                    str(max_entries),
-                )
-            except ValueError:
-                bot_handler.send_reply(
-                    message, "❌ Invalid max entries value. Must be a number."
-                )
-
-        else:
-            bot_handler.send_reply(
-                message,
-                f"❌ Unknown setting: {setting}. Available: stream, topic, interval, max_entries",
-            )
-
-    def _load_bot_config(self, bot_handler: BotHandler) -> None:
-        """Load bot configuration from persistent storage."""
-        try:
-            config_data = bot_handler.storage.get("bot_config")
-            if config_data:
-                config = json.loads(config_data)
-                self.stream_name = config.get("stream_name")
-                self.topic_name = config.get("topic_name")
-                self.sync_interval = config.get("sync_interval", 300)
-                self.max_entries_per_sync = config.get("max_entries_per_sync", 10)
-                self.last_sync_time = config.get("last_sync_time")
-        except Exception:
-            # Bot config not found on first run is expected
-            pass
-
-    def _save_bot_config(self, bot_handler: BotHandler) -> None:
-        """Save bot configuration to persistent storage."""
-        try:
-            config_data = {
-                "stream_name": self.stream_name,
-                "topic_name": self.topic_name,
-                "sync_interval": self.sync_interval,
-                "max_entries_per_sync": self.max_entries_per_sync,
-                "last_sync_time": self.last_sync_time,
-            }
-            bot_handler.storage.put("bot_config", json.dumps(config_data))
-        except Exception as e:
-            self.logger.error(f"Error saving bot config: {e}")
-
-    def _load_botrc_defaults(self) -> None:
-        """Load default configuration from botrc file."""
-        try:
-            import configparser
-            from pathlib import Path
-
-            botrc_path = Path("bot-config/botrc")
-            if not botrc_path.exists():
-                self.logger.info("No botrc file found, using hardcoded defaults")
-                return
-
-            config = configparser.ConfigParser()
-            config.read(botrc_path)
-
-            if "bot" in config:
-                bot_section = config["bot"]
-                self.sync_interval = bot_section.getint("sync_interval", 300)
-                self.max_entries_per_sync = bot_section.getint(
-                    "max_entries_per_sync", 10
-                )
-                self.rate_limit_delay = bot_section.getint("rate_limit_delay", 5)
-                self.posts_per_batch = bot_section.getint("posts_per_batch", 5)
-
-                # Set defaults only if not already configured
-                default_stream = bot_section.get("default_stream", "").strip()
-                default_topic = bot_section.get("default_topic", "").strip()
-                if default_stream:
-                    self.stream_name = default_stream
-                if default_topic:
-                    self.topic_name = default_topic
-
-            if "catchup" in config:
-                catchup_section = config["catchup"]
-                self.catchup_entries = catchup_section.getint("catchup_entries", 5)
-
-            if "notifications" in config:
-                notifications_section = config["notifications"]
-                self.config_change_notifications = notifications_section.getboolean(
-                    "config_change_notifications", True
-                )
-                self.username_claim_notifications = notifications_section.getboolean(
-                    "username_claim_notifications", True
-                )
-
-            self.logger.info(f"Loaded configuration from {botrc_path}")
-
-        except Exception as e:
-            self.logger.error(f"Error loading botrc defaults: {e}")
-            self.logger.info("Using hardcoded defaults")
-
-    def _initialize_thicket(self) -> None:
-        """Initialize thicket components."""
-        if not self.config_path or not self.config_path.exists():
-            raise ValueError("Thicket config file not found")
-
-        # Load thicket configuration
-        import yaml
-
-        with open(self.config_path) as f:
-            config_data = yaml.safe_load(f)
-        self.config = ThicketConfig(**config_data)
-
-        # Initialize git store
-        self.git_store = GitStore(self.config.git_store)
-
-        self.logger.info("Thicket components initialized successfully")
-
-    def _validate_debug_mode(self, bot_handler: BotHandler) -> None:
-        """Validate debug mode configuration."""
-        if not self.debug_user or not self.git_store:
-            return
-
-        # Get current Zulip server from environment
-        zulip_site_url = os.getenv("THICKET_ZULIP_SITE_URL", "")
-        server_url = zulip_site_url.replace("https://", "").replace("http://", "")
-
-        # Check if debug user exists in thicket
-        user = self.git_store.get_user(self.debug_user)
-        if not user:
-            raise ValueError(f"Debug user '{self.debug_user}' not found in thicket")
-
-        # Check if user has Zulip association for this server
-        if not server_url:
-            raise ValueError("Could not determine Zulip server URL")
-
-        zulip_user_id = user.get_zulip_mention(server_url)
-        if not zulip_user_id:
-            raise ValueError(
-                f"User '{self.debug_user}' has no Zulip association for server '{server_url}'"
-            )
-
-        # Try to look up the actual Zulip user ID from the email address
-        # But don't fail if we can't - we'll try again when sending messages
-        actual_user_id = self._lookup_zulip_user_id(bot_handler, zulip_user_id)
-        if actual_user_id and actual_user_id != zulip_user_id:
-            # Successfully resolved to numeric ID
-            self.debug_zulip_user_id = actual_user_id
-            self.logger.info(
-                f"Debug mode enabled: Will send DMs to {self.debug_user} (email: {zulip_user_id}, user_id: {actual_user_id}) on {server_url}"
-            )
-        else:
-            # Keep the email address, will resolve later when sending
-            self.debug_zulip_user_id = zulip_user_id
-            self.logger.info(
-                f"Debug mode enabled: Will send DMs to {self.debug_user} ({zulip_user_id}) on {server_url} (will resolve user ID when sending)"
-            )
-
-    def _lookup_zulip_user_id(
-        self, bot_handler: BotHandler, email_or_id: str
-    ) -> Optional[str]:
-        """Look up Zulip user ID from email address or return the ID if it's already numeric."""
-        # If it's already a numeric user ID, return it
-        if email_or_id.isdigit():
-            return email_or_id
-
-        try:
-            client = bot_handler._client
-            if not client:
-                self.logger.error("No Zulip client available for user lookup")
-                return None
-
-            # First try the get_user_by_email API if available
-            try:
-                user_result = client.get_user_by_email(email_or_id)
-                if user_result.get("result") == "success":
-                    user_data = user_result.get("user", {})
-                    user_id = user_data.get("user_id")
-                    if user_id:
-                        self.logger.info(
-                            f"Found user ID {user_id} for '{email_or_id}' via get_user_by_email API"
-                        )
-                        return str(user_id)
-            except (AttributeError, Exception):
-                pass
-
-            # Fallback: Get all users and search through them
-            users_result = client.get_users()
-            if users_result.get("result") == "success":
-                for user in users_result["members"]:
-                    user_email = user.get("email", "")
-                    delivery_email = user.get("delivery_email", "")
-
-                    if (
-                        user_email == email_or_id
-                        or delivery_email == email_or_id
-                        or str(user.get("user_id")) == email_or_id
-                    ):
-                        user_id = user.get("user_id")
-                        return str(user_id)
-
-                self.logger.error(
-                    f"No user found with identifier '{email_or_id}'. Searched {len(users_result['members'])} users."
-                )
-                return None
-            else:
-                self.logger.error(
-                    f"Failed to get users: {users_result.get('msg', 'Unknown error')}"
-                )
-                return None
-
-        except Exception as e:
-            self.logger.error(f"Error looking up user ID for '{email_or_id}': {e}")
-            return None
-
-    def _lookup_zulip_user_info(
-        self, bot_handler: BotHandler, email_or_id: str
-    ) -> tuple[Optional[str], Optional[str]]:
-        """Look up both Zulip user ID and full name from email address."""
-        if email_or_id.isdigit():
-            return email_or_id, None
-
-        try:
-            client = bot_handler._client
-            if not client:
-                return None, None
-
-            # Try get_user_by_email API first
-            try:
-                user_result = client.get_user_by_email(email_or_id)
-                if user_result.get("result") == "success":
-                    user_data = user_result.get("user", {})
-                    user_id = user_data.get("user_id")
-                    full_name = user_data.get("full_name", "")
-                    if user_id:
-                        return str(user_id), full_name
-            except AttributeError:
-                pass
-
-            # Fallback: search all users
-            users_result = client.get_users()
-            if users_result.get("result") == "success":
-                for user in users_result["members"]:
-                    if (
-                        user.get("email") == email_or_id
-                        or user.get("delivery_email") == email_or_id
-                    ):
-                        return str(user.get("user_id")), user.get("full_name", "")
-
-            return None, None
-
-        except Exception as e:
-            self.logger.error(f"Error looking up user info for '{email_or_id}': {e}")
-            return None, None
-
-    def _load_posted_entries(self, bot_handler: BotHandler) -> None:
-        """Load the set of already posted entries."""
-        try:
-            posted_data = bot_handler.storage.get("posted_entries")
-            if posted_data:
-                self.posted_entries = set(json.loads(posted_data))
-        except Exception:
-            # Empty set on first run is expected
-            self.posted_entries = set()
-
-    def _save_posted_entries(self, bot_handler: BotHandler) -> None:
-        """Save the set of posted entries."""
-        try:
-            bot_handler.storage.put(
-                "posted_entries", json.dumps(list(self.posted_entries))
-            )
-        except Exception as e:
-            self.logger.error(f"Error saving posted entries: {e}")
-
-    def _check_initialization(
-        self, message: dict[str, Any], bot_handler: BotHandler
-    ) -> bool:
-        """Check if thicket is properly initialized."""
-        if not self.git_store or not self.config:
-            bot_handler.send_reply(
-                message, "❌ Thicket not initialized. Please check configuration."
-            )
-            return False
-
-        # In debug mode, we don't need stream/topic configuration
-        if self.debug_user:
-            return True
-
-        if not self.stream_name or not self.topic_name:
-            bot_handler.send_reply(
-                message,
-                "❌ Stream and topic must be configured first. Use `@mention config stream <name>` and `@mention config topic <name>`",
-            )
-            return False
-
-        return True
-
-    def _schedule_sync(self, bot_handler: BotHandler) -> None:
-        """Schedule periodic sync operations."""
-
-        def sync_loop():
-            while True:
-                try:
-                    # Check if we can sync
-                    can_sync = self.git_store and (
-                        (self.stream_name and self.topic_name) or self.debug_user
-                    )
-
-                    if can_sync:
-                        self._perform_sync(bot_handler)
-
-                    time.sleep(self.sync_interval)
-                except Exception as e:
-                    self.logger.error(f"Error in sync loop: {e}")
-                    time.sleep(60)  # Wait before retrying
-
-        # Start background thread
-        import threading
-
-        sync_thread = threading.Thread(target=sync_loop, daemon=True)
-        sync_thread.start()
-
-    def _perform_sync(self, bot_handler: BotHandler) -> list[AtomEntry]:
-        """Perform thicket sync and return new entries."""
-        if not self.config or not self.git_store:
-            return []
-
-        new_entries: list[tuple[AtomEntry, str]] = []  # (entry, username) pairs
-        is_first_run = len(self.posted_entries) == 0
-
-        # Get all users and their feeds from git store
-        users_with_feeds = self.git_store.list_all_users_with_feeds()
-
-        # Sync each user's feeds
-        for username, feed_urls in users_with_feeds:
-            for feed_url in feed_urls:
-                try:
-                    # Run async sync function
-                    loop = asyncio.new_event_loop()
-                    asyncio.set_event_loop(loop)
-                    try:
-                        new_count, _ = loop.run_until_complete(
-                            sync_feed(
-                                self.git_store, username, str(feed_url), dry_run=False
-                            )
-                        )
-
-                        entries_to_check = []
-
-                        if new_count > 0:
-                            # Get the newly added entries
-                            entries_to_check = self.git_store.list_entries(
-                                username, limit=new_count
-                            )
-
-                        # Always check for catchup mode on first run
-                        if is_first_run:
-                            # Catchup mode: get configured number of entries on first run
-                            catchup_entries = self.git_store.list_entries(
-                                username, limit=self.catchup_entries
-                            )
-                            entries_to_check = (
-                                catchup_entries
-                                if not entries_to_check
-                                else entries_to_check
-                            )
-
-                        for entry in entries_to_check:
-                            entry_key = f"{username}:{entry.id}"
-                            if entry_key not in self.posted_entries:
-                                new_entries.append((entry, username))
-                                if len(new_entries) >= self.max_entries_per_sync:
-                                    break
-
-                    finally:
-                        loop.close()
-
-                except Exception as e:
-                    self.logger.error(
-                        f"Error syncing feed {feed_url} for user {username}: {e}"
-                    )
-
-                if len(new_entries) >= self.max_entries_per_sync:
-                    break
-
-        # Post new entries to Zulip with rate limiting
-        if new_entries:
-            posted_count = 0
-
-            for i, (entry, username) in enumerate(new_entries):
-                self._post_entry_to_zulip(entry, bot_handler, username)
-                self.posted_entries.add(f"{username}:{entry.id}")
-                posted_count += 1
-
-                # Rate limiting: pause after configured number of messages
-                if (
-                    posted_count % self.posts_per_batch == 0
-                    and i < len(new_entries) - 1
-                ):
-                    time.sleep(self.rate_limit_delay)
-
-            self._save_posted_entries(bot_handler)
-
-        # Update last sync time
-        self.last_sync_time = time.time()
-
-        return [entry for entry, _ in new_entries]
-
-    def _post_entry_to_zulip(
-        self, entry: AtomEntry, bot_handler: BotHandler, username: str
-    ) -> None:
-        """Post a single entry to the configured Zulip stream/topic or debug user DM."""
-        try:
-            # Get current Zulip server from environment
-            zulip_site_url = os.getenv("THICKET_ZULIP_SITE_URL", "")
-            server_url = zulip_site_url.replace("https://", "").replace("http://", "")
-
-            # Build author/date info consistently
-            mention_info = ""
-            if server_url and self.git_store:
-                user = self.git_store.get_user(username)
-                if user:
-                    zulip_user_id = user.get_zulip_mention(server_url)
-                    if zulip_user_id:
-                        # Look up the actual Zulip full name for proper @mention
-                        _, zulip_full_name = self._lookup_zulip_user_info(
-                            bot_handler, zulip_user_id
-                        )
-                        display_name = zulip_full_name or user.display_name or username
-
-                        # Check if author is different from the user - avoid redundancy
-                        author_name = entry.author and entry.author.get("name")
-                        if author_name and author_name.lower() != display_name.lower():
-                            author_info = f" (by {author_name})"
-                        else:
-                            author_info = ""
-
-                        published_info = ""
-                        if entry.published:
-                            published_info = (
-                                f" • {entry.published.strftime('%Y-%m-%d')}"
-                            )
-
-                        mention_info = f"@**{display_name}** posted{author_info}{published_info}:\n\n"
-
-            # If no Zulip user found, use consistent format without @mention
-            if not mention_info:
-                user = self.git_store.get_user(username) if self.git_store else None
-                display_name = user.display_name if user else username
-
-                author_name = entry.author and entry.author.get("name")
-                if author_name and author_name.lower() != display_name.lower():
-                    author_info = f" (by {author_name})"
-                else:
-                    author_info = ""
-
-                published_info = ""
-                if entry.published:
-                    published_info = f" • {entry.published.strftime('%Y-%m-%d')}"
-
-                mention_info = (
-                    f"**{display_name}** posted{author_info}{published_info}:\n\n"
-                )
-
-            # Format the message with HTML processing
-            message_lines = [
-                f"**{entry.title}**",
-                f"🔗 {entry.link}",
-            ]
-
-            if entry.summary:
-                # Process HTML in summary and truncate if needed
-                processed_summary = self._process_html_content(entry.summary)
-                if len(processed_summary) > 400:
-                    processed_summary = processed_summary[:397] + "..."
-                message_lines.append(f"\n{processed_summary}")
-
-            message_content = mention_info + "\n".join(message_lines)
-
-            # Choose destination based on mode
-            if self.debug_user and self.debug_zulip_user_id:
-                # Debug mode: send DM
-                debug_message = f"🐛 **DEBUG:** New article from thicket user `{username}`:\n\n{message_content}"
-
-                # Ensure we have the numeric user ID
-                user_id_to_use = self.debug_zulip_user_id
-                if not user_id_to_use.isdigit():
-                    # Need to look up the numeric ID
-                    resolved_id = self._lookup_zulip_user_id(
-                        bot_handler, user_id_to_use
-                    )
-                    if resolved_id:
-                        user_id_to_use = resolved_id
-                        self.logger.debug(
-                            f"Resolved {self.debug_zulip_user_id} to user ID {user_id_to_use}"
-                        )
-                    else:
-                        self.logger.error(
-                            f"Could not resolve user ID for {self.debug_zulip_user_id}"
-                        )
-                        return
-
-                try:
-                    # For private messages, user_id needs to be an integer, not string
-                    user_id_int = int(user_id_to_use)
-                    bot_handler.send_message(
-                        {
-                            "type": "private",
- "to": [user_id_int], # Use integer user ID 1001 - "content": debug_message, 1002 - } 1003 - ) 1004 - except ValueError: 1005 - # If conversion to int fails, user_id_to_use might be an email 1006 - try: 1007 - bot_handler.send_message( 1008 - { 1009 - "type": "private", 1010 - "to": [user_id_to_use], # Try as string (email) 1011 - "content": debug_message, 1012 - } 1013 - ) 1014 - except Exception as e2: 1015 - self.logger.error( 1016 - f"Failed to send DM to {self.debug_user} (tried both int and string): {e2}" 1017 - ) 1018 - return 1019 - except Exception as e: 1020 - self.logger.error( 1021 - f"Failed to send DM to {self.debug_user} ({user_id_to_use}): {e}" 1022 - ) 1023 - return 1024 - self.logger.info( 1025 - f"Posted entry to debug user {self.debug_user}: {entry.title}" 1026 - ) 1027 - else: 1028 - # Normal mode: send to stream/topic 1029 - bot_handler.send_message( 1030 - { 1031 - "type": "stream", 1032 - "to": self.stream_name, 1033 - "subject": self.topic_name, 1034 - "content": message_content, 1035 - } 1036 - ) 1037 - self.logger.info( 1038 - f"Posted entry to stream: {entry.title} (user: {username})" 1039 - ) 1040 - 1041 - except Exception as e: 1042 - self.logger.error(f"Error posting entry to Zulip: {e}") 1043 - 1044 - def _process_html_content(self, html_content: str) -> str: 1045 - """Process HTML content from feeds to clean Zulip-compatible markdown.""" 1046 - if not html_content: 1047 - return "" 1048 - 1049 - try: 1050 - # Try to use markdownify for proper HTML to Markdown conversion 1051 - from markdownify import markdownify as md 1052 - 1053 - # Convert HTML to Markdown with compact settings for summaries 1054 - markdown = md( 1055 - html_content, 1056 - heading_style="ATX", # Use # for headings (but we'll post-process these) 1057 - bullets="-", # Use - for bullets 1058 - convert=[ 1059 - "a", 1060 - "b", 1061 - "strong", 1062 - "i", 1063 - "em", 1064 - "code", 1065 - "pre", 1066 - "p", 1067 - "br", 1068 - "ul", 1069 - "ol", 1070 - "li", 1071 
- "h1", 1072 - "h2", 1073 - "h3", 1074 - "h4", 1075 - "h5", 1076 - "h6", 1077 - ], 1078 - ).strip() 1079 - 1080 - # Post-process to convert headings to bold for compact summaries 1081 - import re 1082 - 1083 - # Convert markdown headers to bold with period 1084 - markdown = re.sub( 1085 - r"^#{1,6}\s*(.+)$", r"**\1.**", markdown, flags=re.MULTILINE 1086 - ) 1087 - 1088 - # Clean up excessive newlines and make more compact 1089 - markdown = re.sub( 1090 - r"\n\s*\n\s*\n+", " ", markdown 1091 - ) # Multiple newlines become space 1092 - markdown = re.sub( 1093 - r"\n\s*\n", ". ", markdown 1094 - ) # Double newlines become sentence breaks 1095 - markdown = re.sub(r"\n", " ", markdown) # Single newlines become spaces 1096 - 1097 - # Clean up double periods and excessive whitespace 1098 - markdown = re.sub(r"\.\.+", ".", markdown) 1099 - markdown = re.sub(r"\s+", " ", markdown) 1100 - return markdown.strip() 1101 - 1102 - except ImportError: 1103 - # Fallback: manual HTML processing 1104 - import re 1105 - 1106 - content = html_content 1107 - 1108 - # Convert headings to bold with periods for compact summaries 1109 - content = re.sub( 1110 - r"<h[1-6](?:\s[^>]*)?>([^<]*)</h[1-6]>", 1111 - r"**\1.** ", 1112 - content, 1113 - flags=re.IGNORECASE, 1114 - ) 1115 - 1116 - # Convert common HTML elements to Markdown 1117 - content = re.sub( 1118 - r"<(?:strong|b)(?:\s[^>]*)?>([^<]*)</(?:strong|b)>", 1119 - r"**\1**", 1120 - content, 1121 - flags=re.IGNORECASE, 1122 - ) 1123 - content = re.sub( 1124 - r"<(?:em|i)(?:\s[^>]*)?>([^<]*)</(?:em|i)>", 1125 - r"*\1*", 1126 - content, 1127 - flags=re.IGNORECASE, 1128 - ) 1129 - content = re.sub( 1130 - r"<code(?:\s[^>]*)?>([^<]*)</code>", 1131 - r"`\1`", 1132 - content, 1133 - flags=re.IGNORECASE, 1134 - ) 1135 - content = re.sub( 1136 - r'<a(?:\s[^>]*?)?\s*href=["\']([^"\']*)["\'](?:\s[^>]*)?>([^<]*)</a>', 1137 - r"[\2](\1)", 1138 - content, 1139 - flags=re.IGNORECASE, 1140 - ) 1141 - 1142 - # Convert block elements to spaces instead 
of newlines for compactness 1143 - content = re.sub(r"<br\s*/?>", " ", content, flags=re.IGNORECASE) 1144 - content = re.sub(r"</p>\s*<p>", ". ", content, flags=re.IGNORECASE) 1145 - content = re.sub( 1146 - r"</?(?:p|div)(?:\s[^>]*)?>", " ", content, flags=re.IGNORECASE 1147 - ) 1148 - 1149 - # Remove remaining HTML tags 1150 - content = re.sub(r"<[^>]+>", "", content) 1151 - 1152 - # Clean up whitespace and make compact 1153 - content = re.sub( 1154 - r"\s+", " ", content 1155 - ) # Multiple whitespace becomes single space 1156 - content = re.sub( 1157 - r"\.\.+", ".", content 1158 - ) # Multiple periods become single period 1159 - return content.strip() 1160 - 1161 - except Exception as e: 1162 - self.logger.error(f"Error processing HTML content: {e}") 1163 - # Last resort: just strip HTML tags 1164 - import re 1165 - 1166 - return re.sub(r"<[^>]+>", "", html_content).strip() 1167 - 1168 - def _get_schedule_info(self) -> str: 1169 - """Get schedule information string.""" 1170 - lines = [] 1171 - 1172 - if self.last_sync_time: 1173 - import datetime 1174 - 1175 - last_sync = datetime.datetime.fromtimestamp(self.last_sync_time) 1176 - next_sync = last_sync + datetime.timedelta(seconds=self.sync_interval) 1177 - now = datetime.datetime.now() 1178 - 1179 - # Calculate time until next sync 1180 - time_until_next = next_sync - now 1181 - 1182 - if time_until_next.total_seconds() > 0: 1183 - minutes, seconds = divmod(int(time_until_next.total_seconds()), 60) 1184 - hours, minutes = divmod(minutes, 60) 1185 - 1186 - if hours > 0: 1187 - time_str = f"{hours}h {minutes}m {seconds}s" 1188 - elif minutes > 0: 1189 - time_str = f"{minutes}m {seconds}s" 1190 - else: 1191 - time_str = f"{seconds}s" 1192 - 1193 - lines.extend( 1194 - [ 1195 - f"🕐 **Last Sync:** {last_sync.strftime('%H:%M:%S')}", 1196 - f"⏰ **Next Sync:** {next_sync.strftime('%H:%M:%S')} (in {time_str})", 1197 - ] 1198 - ) 1199 - else: 1200 - lines.extend( 1201 - [ 1202 - f"🕐 **Last Sync:** 
{last_sync.strftime('%H:%M:%S')}", 1203 - f"⏰ **Next Sync:** Due now (running every {self.sync_interval}s)", 1204 - ] 1205 - ) 1206 - else: 1207 - lines.append("🕐 **Last Sync:** Never (bot starting up)") 1208 - 1209 - # Add sync frequency info 1210 - if self.sync_interval >= 3600: 1211 - frequency_str = ( 1212 - f"{self.sync_interval // 3600}h {(self.sync_interval % 3600) // 60}m" 1213 - ) 1214 - elif self.sync_interval >= 60: 1215 - frequency_str = f"{self.sync_interval // 60}m {self.sync_interval % 60}s" 1216 - else: 1217 - frequency_str = f"{self.sync_interval}s" 1218 - 1219 - lines.append(f"🔄 **Sync Frequency:** Every {frequency_str}") 1220 - 1221 - return "\n".join(lines) 1222 - 1223 - def _send_config_change_notification( 1224 - self, 1225 - bot_handler: BotHandler, 1226 - changer: str, 1227 - setting: str, 1228 - old_value: Optional[str], 1229 - new_value: str, 1230 - ) -> None: 1231 - """Send configuration change notification if enabled.""" 1232 - if not self.config_change_notifications or self.debug_user: 1233 - return 1234 - 1235 - # Don't send notification if stream/topic aren't configured yet 1236 - if not self.stream_name or not self.topic_name: 1237 - return 1238 - 1239 - try: 1240 - old_display = old_value if old_value else "(not set)" 1241 - notification_msg = ( 1242 - f"⚙️ **{changer}** changed {setting}: `{old_display}` → `{new_value}`" 1243 - ) 1244 - 1245 - bot_handler.send_message( 1246 - { 1247 - "type": "stream", 1248 - "to": self.stream_name, 1249 - "subject": self.topic_name, 1250 - "content": notification_msg, 1251 - } 1252 - ) 1253 - except Exception as e: 1254 - self.logger.error(f"Failed to send config change notification: {e}") 1255 - 1256 - 1257 - handler_class = ThicketBotHandler
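Review note on the removed `_process_html_content` fallback: the regex path it used when `markdownify` was not importable can be condensed into an ordered rule table. The following is a minimal sketch of that idea, not the removed implementation itself; the names `_RULES` and `html_to_compact_markdown` are hypothetical, and the sketch applies `re.IGNORECASE` to every rule for simplicity, which the original did only for the tag rules.

```python
import re

# Ordered (pattern, replacement) pairs, modelled on the removed regex fallback.
# Order matters: inline tags are converted before the catch-all tag stripper runs.
_RULES = [
    (r"<h[1-6](?:\s[^>]*)?>([^<]*)</h[1-6]>", r"**\1.** "),        # headings -> bold + period
    (r"<(?:strong|b)(?:\s[^>]*)?>([^<]*)</(?:strong|b)>", r"**\1**"),
    (r"<(?:em|i)(?:\s[^>]*)?>([^<]*)</(?:em|i)>", r"*\1*"),
    (r"<code(?:\s[^>]*)?>([^<]*)</code>", r"`\1`"),
    (r'<a(?:\s[^>]*?)?\s*href=["\']([^"\']*)["\'](?:\s[^>]*)?>([^<]*)</a>', r"[\2](\1)"),
    (r"<br\s*/?>", " "),                                            # line breaks -> space
    (r"</p>\s*<p>", ". "),                                          # paragraph joins -> sentence break
    (r"</?(?:p|div)(?:\s[^>]*)?>", " "),
    (r"<[^>]+>", ""),                                               # strip any remaining tags
    (r"\s+", " "),                                                  # collapse whitespace
    (r"\.\.+", "."),                                                # collapse repeated periods
]


def html_to_compact_markdown(html: str) -> str:
    """Convert a feed summary to compact single-line Markdown (hypothetical helper)."""
    content = html
    for pattern, replacement in _RULES:
        content = re.sub(pattern, replacement, content, flags=re.IGNORECASE)
    return content.strip()


sample = '<h2>Title</h2><p>Hello <b>world</b>.</p><p>See <a href="https://example.com">this</a></p>'
print(html_to_compact_markdown(sample))
```

Two limits carry over from the original patterns: because the capture groups use `[^<]*`, nested inline tags such as `<b><i>x</i></b>` are not converted and fall through to the tag stripper, and the final rule also collapses a literal ellipsis (`...`) into a single period.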

+2 -24

src/thicket/cli/commands/__init__.py

···
1 1 """CLI commands for thicket."""
2 2
3 3 # Import all commands to register them with the main app
4 - from . import (
5 -     add,
6 -     bot,
7 -     duplicates,
8 -     info_cmd,
9 -     init,
10 -     list_cmd,
11 -     search,
12 -     sync,
13 -     upload,
14 -     zulip,
15 - )
4 + from . import add, duplicates, generate, index_cmd, info_cmd, init, links_cmd, list_cmd, sync
16 5
17 - __all__ = [
18 -     "add",
19 -     "bot",
20 -     "duplicates",
21 -     "info_cmd",
22 -     "init",
23 -     "list_cmd",
24 -     "search",
25 -     "sync",
26 -     "upload",
27 -     "zulip",
28 - ]
6 + __all__ = ["add", "duplicates", "generate", "index_cmd", "info_cmd", "init", "links_cmd", "list_cmd", "sync"]

+46 -196

src/thicket/cli/commands/add.py

··· 1 1 """Add command for thicket.""" 2 2 3 - import asyncio 4 3 from pathlib import Path 5 4 from typing import Optional 6 5 7 6 import typer 8 - from pydantic import HttpUrl, ValidationError 7 + from pydantic import ValidationError 9 8 10 - from ...core.feed_parser import FeedParser 11 - from ...core.git_store import GitStore 12 - from ..main import app 13 - from ..utils import ( 14 - create_progress, 15 - load_config, 16 - print_error, 17 - print_info, 18 - print_success, 19 - ) 9 + from ..main import app, console, load_thicket 20 10 21 11 22 12 @app.command("add") 23 - def add_command( 24 - subcommand: str = typer.Argument(..., help="Subcommand: 'user' or 'feed'"), 13 + def add_user( 25 14 username: str = typer.Argument(..., help="Username"), 26 - feed_url: Optional[str] = typer.Argument( 27 - None, help="Feed URL (required for 'user' command)" 28 - ), 15 + feeds: list[str] = typer.Argument(..., help="Feed URLs"), 29 16 email: Optional[str] = typer.Option(None, "--email", "-e", help="User email"), 30 - homepage: Optional[str] = typer.Option( 31 - None, "--homepage", "-h", help="User homepage" 32 - ), 17 + homepage: Optional[str] = typer.Option(None, "--homepage", "-h", help="User homepage"), 33 18 icon: Optional[str] = typer.Option(None, "--icon", "-i", help="User icon URL"), 34 - display_name: Optional[str] = typer.Option( 35 - None, "--display-name", "-d", help="User display name" 36 - ), 19 + display_name: Optional[str] = typer.Option(None, "--display-name", "-d", help="User display name"), 37 20 config_file: Optional[Path] = typer.Option( 38 - Path("thicket.yaml"), "--config", help="Configuration file path" 39 - ), 40 - auto_discover: bool = typer.Option( 41 - True, 42 - "--auto-discover/--no-auto-discover", 43 - help="Auto-discover user metadata from feed", 21 + None, "--config", help="Configuration file path" 44 22 ), 45 23 ) -> None: 46 - """Add a user or feed to thicket.""" 47 - 48 - if subcommand == "user": 49 - add_user( 50 - username, 51 - feed_url, 
52 - email, 53 - homepage, 54 - icon, 55 - display_name, 56 - config_file, 57 - auto_discover, 58 - ) 59 - elif subcommand == "feed": 60 - add_feed(username, feed_url, config_file) 61 - else: 62 - print_error(f"Unknown subcommand: {subcommand}") 63 - print_error("Use 'user' or 'feed'") 64 - raise typer.Exit(1) 65 - 66 - 67 - def add_user( 68 - username: str, 69 - feed_url: Optional[str], 70 - email: Optional[str], 71 - homepage: Optional[str], 72 - icon: Optional[str], 73 - display_name: Optional[str], 74 - config_file: Path, 75 - auto_discover: bool, 76 - ) -> None: 77 - """Add a new user with feed.""" 78 - 79 - if not feed_url: 80 - print_error("Feed URL is required when adding a user") 81 - raise typer.Exit(1) 82 - 83 - # Validate feed URL 84 - try: 85 - validated_feed_url = HttpUrl(feed_url) 86 - except ValidationError: 87 - print_error(f"Invalid feed URL: {feed_url}") 88 - raise typer.Exit(1) from None 89 - 90 - # Load configuration 91 - config = load_config(config_file) 92 - 93 - # Initialize Git store 94 - git_store = GitStore(config.git_store) 95 - 96 - # Check if user already exists 97 - existing_user = git_store.get_user(username) 98 - if existing_user: 99 - print_error(f"User '{username}' already exists") 100 - print_error("Use 'thicket add feed' to add additional feeds") 101 - raise typer.Exit(1) 102 - 103 - # Auto-discover metadata if enabled 104 - discovered_metadata = None 105 - if auto_discover: 106 - discovered_metadata = asyncio.run(discover_feed_metadata(validated_feed_url)) 107 - 108 - # Prepare user data with manual overrides taking precedence 109 - user_display_name = display_name or ( 110 - discovered_metadata.author_name or discovered_metadata.title 111 - if discovered_metadata 112 - else None 113 - ) 114 - user_email = email or ( 115 - discovered_metadata.author_email if discovered_metadata else None 116 - ) 117 - user_homepage = homepage or ( 118 - str(discovered_metadata.author_uri or discovered_metadata.link) 119 - if discovered_metadata 
120 - else None 121 - ) 122 - user_icon = icon or ( 123 - str( 124 - discovered_metadata.logo 125 - or discovered_metadata.icon 126 - or discovered_metadata.image_url 127 - ) 128 - if discovered_metadata 129 - else None 130 - ) 131 - 132 - # Add user to Git store 133 - git_store.add_user( 134 - username=username, 135 - display_name=user_display_name, 136 - email=user_email, 137 - homepage=user_homepage, 138 - icon=user_icon, 139 - feeds=[str(validated_feed_url)], 140 - ) 141 - 142 - # Commit changes 143 - git_store.commit_changes(f"Add user: {username}") 144 - 145 - print_success(f"Added user '{username}' with feed: {feed_url}") 146 - 147 - if discovered_metadata and auto_discover: 148 - print_info("Auto-discovered metadata:") 149 - if user_display_name: 150 - print_info(f" Display name: {user_display_name}") 151 - if user_email: 152 - print_info(f" Email: {user_email}") 153 - if user_homepage: 154 - print_info(f" Homepage: {user_homepage}") 155 - if user_icon: 156 - print_info(f" Icon: {user_icon}") 157 - 158 - 159 - def add_feed(username: str, feed_url: Optional[str], config_file: Path) -> None: 160 - """Add a feed to an existing user.""" 161 - 162 - if not feed_url: 163 - print_error("Feed URL is required") 164 - raise typer.Exit(1) 165 - 166 - # Validate feed URL 24 + """Add a user with their feeds to thicket.""" 25 + 167 26 try: 168 - validated_feed_url = HttpUrl(feed_url) 169 - except ValidationError: 170 - print_error(f"Invalid feed URL: {feed_url}") 171 - raise typer.Exit(1) from None 172 - 173 - # Load configuration 174 - config = load_config(config_file) 175 - 176 - # Initialize Git store 177 - git_store = GitStore(config.git_store) 178 - 179 - # Check if user exists 180 - user = git_store.get_user(username) 181 - if not user: 182 - print_error(f"User '{username}' not found") 183 - print_error("Use 'thicket add user' to add a new user") 184 - raise typer.Exit(1) 185 - 186 - # Check if feed already exists 187 - if str(validated_feed_url) in user.feeds: 188 
- print_error(f"Feed already exists for user '{username}': {feed_url}") 27 + # Load Thicket instance 28 + thicket = load_thicket(config_file) 29 + 30 + # Prepare user data 31 + user_data = {} 32 + if email: 33 + user_data['email'] = email 34 + if homepage: 35 + user_data['homepage'] = homepage 36 + if icon: 37 + user_data['icon'] = icon 38 + if display_name: 39 + user_data['display_name'] = display_name 40 + 41 + # Add the user 42 + user_config = thicket.add_user(username, feeds, **user_data) 43 + 44 + console.print(f"[green]✓[/green] Added user: {username}") 45 + console.print(f" • Display name: {user_config.display_name or 'None'}") 46 + console.print(f" • Email: {user_config.email or 'None'}") 47 + console.print(f" • Homepage: {user_config.homepage or 'None'}") 48 + console.print(f" • Feeds: {len(user_config.feeds)}") 49 + 50 + for feed in user_config.feeds: 51 + console.print(f" - {feed}") 52 + 53 + # Commit the addition 54 + commit_message = f"Add user {username} with {len(feeds)} feed(s)" 55 + if thicket.commit_changes(commit_message): 56 + console.print(f"[green]✓[/green] Committed: {commit_message}") 57 + else: 58 + console.print("[yellow]Warning:[/yellow] Failed to commit changes") 59 + 60 + except ValidationError as e: 61 + console.print(f"[red]Validation Error:[/red] {str(e)}") 189 62 raise typer.Exit(1) 190 - 191 - # Add feed to user 192 - updated_feeds = user.feeds + [str(validated_feed_url)] 193 - if git_store.update_user(username, feeds=updated_feeds): 194 - git_store.commit_changes(f"Add feed to user {username}: {feed_url}") 195 - print_success(f"Added feed to user '{username}': {feed_url}") 196 - else: 197 - print_error(f"Failed to add feed to user '{username}'") 63 + except Exception as e: 64 + console.print(f"[red]Error:[/red] {str(e)}") 198 65 raise typer.Exit(1) 199 66 200 - 201 - async def discover_feed_metadata(feed_url: HttpUrl): 202 - """Discover metadata from a feed URL.""" 203 - try: 204 - with create_progress() as progress: 205 - task = 
progress.add_task("Discovering feed metadata...", total=None) 206 - 207 - parser = FeedParser() 208 - content = await parser.fetch_feed(feed_url) 209 - metadata, _ = parser.parse_feed(content, feed_url) 210 - 211 - progress.update(task, completed=True) 212 - return metadata 213 - 214 - except Exception as e: 215 - print_error(f"Failed to discover feed metadata: {e}") 216 - return None
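Review note on the new `add_user` command: the four `if` statements that populate `user_data` amount to "forward only the options the caller actually set". A minimal sketch of that pattern follows, assuming the keyword names match what `thicket.add_user` accepts; the helper `build_user_data` is hypothetical and does not appear in the diff.

```python
from typing import Optional


def build_user_data(
    email: Optional[str] = None,
    homepage: Optional[str] = None,
    icon: Optional[str] = None,
    display_name: Optional[str] = None,
) -> dict[str, str]:
    """Collect only the options that were provided (hypothetical helper)."""
    candidates = {
        "email": email,
        "homepage": homepage,
        "icon": icon,
        "display_name": display_name,
    }
    # Same truthiness test as the diff's `if email:` checks: None and "" are both dropped.
    return {key: value for key, value in candidates.items() if value}


print(build_user_data(email="a@example.org", display_name="Ada"))
```

The result could then be splatted into the call the same way the diff does, for example `thicket.add_user(username, feeds, **build_user_data(...))`.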

-247

src/thicket/cli/commands/bot.py

··· 1 - """Bot management commands for thicket.""" 2 - 3 - import subprocess 4 - import sys 5 - from pathlib import Path 6 - 7 - import typer 8 - from rich.console import Console 9 - 10 - from ..main import app 11 - from ..utils import print_error, print_info, print_success 12 - 13 - console = Console() 14 - 15 - 16 - @app.command() 17 - def bot( 18 - action: str = typer.Argument(..., help="Action: run, test, or status"), 19 - config_file: Path = typer.Option( 20 - Path("bot-config/zuliprc"), 21 - "--config", 22 - "-c", 23 - help="Zulip bot configuration file", 24 - ), 25 - thicket_config: Path = typer.Option( 26 - Path("thicket.yaml"), 27 - "--thicket-config", 28 - help="Path to thicket configuration file", 29 - ), 30 - daemon: bool = typer.Option( 31 - False, 32 - "--daemon", 33 - "-d", 34 - help="Run bot in daemon mode (background)", 35 - ), 36 - debug_user: str = typer.Option( 37 - None, 38 - "--debug-user", 39 - help="Debug mode: send DMs to this thicket username instead of posting to streams", 40 - ), 41 - ) -> None: 42 - """Manage the Thicket Zulip bot. 
43 - 44 - Actions: 45 - - run: Start the Zulip bot 46 - - test: Test bot functionality 47 - - status: Show bot status 48 - """ 49 - 50 - if action == "run": 51 - _run_bot(config_file, thicket_config, daemon, debug_user) 52 - elif action == "test": 53 - _test_bot() 54 - elif action == "status": 55 - _bot_status(config_file) 56 - else: 57 - print_error(f"Unknown action: {action}") 58 - print_info("Available actions: run, test, status") 59 - raise typer.Exit(1) 60 - 61 - 62 - def _run_bot( 63 - config_file: Path, thicket_config: Path, daemon: bool, debug_user: str = None 64 - ) -> None: 65 - """Run the Zulip bot.""" 66 - if not config_file.exists(): 67 - print_error(f"Configuration file not found: {config_file}") 68 - print_info( 69 - f"Copy bot-config/zuliprc.template to {config_file} and configure it" 70 - ) 71 - print_info("See bot-config/README.md for setup instructions") 72 - raise typer.Exit(1) 73 - 74 - if not thicket_config.exists(): 75 - print_error(f"Thicket configuration file not found: {thicket_config}") 76 - print_info("Run `thicket init` to create a thicket.yaml file") 77 - raise typer.Exit(1) 78 - 79 - # Parse zuliprc to extract server URL 80 - zulip_site_url = _parse_zulip_config(config_file) 81 - 82 - print_info(f"Starting Thicket Zulip bot with config: {config_file}") 83 - print_info(f"Using thicket config: {thicket_config}") 84 - 85 - if debug_user: 86 - print_info( 87 - f"🐛 DEBUG MODE: Will send DMs to thicket user '{debug_user}' instead of posting to streams" 88 - ) 89 - 90 - if daemon: 91 - print_info("Running in daemon mode...") 92 - else: 93 - print_info("Bot will be available as @thicket in your Zulip chat") 94 - print_info("Press Ctrl+C to stop the bot") 95 - 96 - try: 97 - # Build the command 98 - cmd = [ 99 - sys.executable, 100 - "-m", 101 - "zulip_bots.run", 102 - "src/thicket/bots/thicket_bot.py", 103 - "--config-file", 104 - str(config_file), 105 - ] 106 - 107 - # Add environment variables for bot configuration 108 - import os 109 - 110 
- env = os.environ.copy() 111 - 112 - # Always pass thicket config path 113 - env["THICKET_CONFIG_PATH"] = str(thicket_config.absolute()) 114 - 115 - # Add debug user if specified 116 - if debug_user: 117 - env["THICKET_DEBUG_USER"] = debug_user 118 - 119 - # Pass Zulip server URL to bot 120 - if zulip_site_url: 121 - env["THICKET_ZULIP_SITE_URL"] = zulip_site_url 122 - 123 - if daemon: 124 - # Run in background 125 - process = subprocess.Popen( 126 - cmd, 127 - stdout=subprocess.DEVNULL, 128 - stderr=subprocess.DEVNULL, 129 - start_new_session=True, 130 - env=env, 131 - ) 132 - print_success(f"Bot started in background with PID {process.pid}") 133 - else: 134 - # Run in foreground 135 - subprocess.run(cmd, check=True, env=env) 136 - 137 - except subprocess.CalledProcessError as e: 138 - print_error(f"Failed to start bot: {e}") 139 - raise typer.Exit(1) from e 140 - except KeyboardInterrupt: 141 - print_info("Bot stopped by user") 142 - 143 - 144 - def _parse_zulip_config(config_file: Path) -> str: 145 - """Parse zuliprc file to extract the site URL.""" 146 - try: 147 - import configparser 148 - 149 - config = configparser.ConfigParser() 150 - config.read(config_file) 151 - 152 - if "api" in config and "site" in config["api"]: 153 - site_url = config["api"]["site"] 154 - print_info(f"Detected Zulip server: {site_url}") 155 - return site_url 156 - else: 157 - print_error("Could not find 'site' in zuliprc [api] section") 158 - return "" 159 - 160 - except Exception as e: 161 - print_error(f"Error parsing zuliprc: {e}") 162 - return "" 163 - 164 - 165 - def _test_bot() -> None: 166 - """Test bot functionality.""" 167 - print_info("Testing Thicket Zulip bot...") 168 - 169 - try: 170 - from ...bots.test_bot import BotTester 171 - 172 - # Create bot tester 173 - tester = BotTester() 174 - 175 - # Test basic functionality 176 - console.print("✓ Testing help command...", style="green") 177 - responses = tester.send_command("help") 178 - assert len(responses) == 1 179 - 
assert "Thicket Feed Bot" in tester.get_last_response_content() 180 - 181 - console.print("✓ Testing status command...", style="green") 182 - responses = tester.send_command("status") 183 - assert len(responses) == 1 184 - assert "Status" in tester.get_last_response_content() 185 - 186 - console.print("✓ Testing config commands...", style="green") 187 - responses = tester.send_command("config stream test-stream") 188 - tester.assert_response_contains("Stream set to") 189 - 190 - responses = tester.send_command("config topic test-topic") 191 - tester.assert_response_contains("Topic set to") 192 - 193 - responses = tester.send_command("config interval 300") 194 - tester.assert_response_contains("Sync interval set to") 195 - 196 - print_success("All bot tests passed!") 197 - 198 - except Exception as e: 199 - print_error(f"Bot test failed: {e}") 200 - raise typer.Exit(1) from e 201 - 202 - 203 - def _bot_status(config_file: Path) -> None: 204 - """Show bot status.""" 205 - console.print("Thicket Zulip Bot Status", style="bold blue") 206 - console.print() 207 - 208 - # Check config file 209 - if config_file.exists(): 210 - console.print(f"✓ Config file: {config_file}", style="green") 211 - else: 212 - console.print(f"✗ Config file not found: {config_file}", style="red") 213 - console.print( 214 - " Copy bot-config/zuliprc.template and configure it", style="yellow" 215 - ) 216 - console.print( 217 - " See bot-config/README.md for setup instructions", style="yellow" 218 - ) 219 - 220 - # Check dependencies 221 - try: 222 - import zulip_bots 223 - 224 - version = getattr(zulip_bots, "__version__", "unknown") 225 - console.print(f"✓ zulip-bots version: {version}", style="green") 226 - except ImportError: 227 - console.print("✗ zulip-bots not installed", style="red") 228 - 229 - try: 230 - from ...bots.thicket_bot import ThicketBotHandler # noqa: F401 231 - 232 - console.print("✓ ThicketBotHandler available", style="green") 233 - except ImportError as e: 234 - 
console.print(f"✗ Bot handler not available: {e}", style="red") 235 - 236 - # Check bot file 237 - bot_file = Path("src/thicket/bots/thicket_bot.py") 238 - if bot_file.exists(): 239 - console.print(f"✓ Bot file: {bot_file}", style="green") 240 - else: 241 - console.print(f"✗ Bot file not found: {bot_file}", style="red") 242 - 243 - console.print() 244 - console.print("To run the bot:", style="bold") 245 - console.print(f" thicket bot run --config {config_file}") 246 - console.print() 247 - console.print("For help setting up the bot, see: docs/ZULIP_BOT.md", style="dim")
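Review note on the removed `_parse_zulip_config`: it read the `site` key from the `[api]` section of a zuliprc file, and the bot later stripped the URL scheme before matching users. A minimal sketch of both steps using only the standard library is shown below; it parses a string rather than a file so that it is self-contained, and the function names and sample credentials are hypothetical.

```python
import configparser


def parse_zulip_site(zuliprc_text: str) -> str:
    """Return the [api] site value from zuliprc-style text, or "" if absent."""
    config = configparser.ConfigParser()
    config.read_string(zuliprc_text)
    if config.has_option("api", "site"):
        return config.get("api", "site")
    return ""


def site_to_server(site_url: str) -> str:
    """Strip the URL scheme, as the removed bot did before looking up Zulip users."""
    return site_url.replace("https://", "").replace("http://", "")


# Made-up sample configuration, for illustration only.
SAMPLE = """\
[api]
email = thicket-bot@example.com
key = not-a-real-key
site = https://chat.example.com
"""

site = parse_zulip_site(SAMPLE)
print(site, site_to_server(site))
```

Note that `ConfigParser` applies `%` interpolation by default, so a key containing a literal `%` would need `interpolation=None`; the removed code used the default as well.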

+3 -7

src/thicket/cli/commands/duplicates.py

··· 10 10 from ..main import app 11 11 from ..utils import ( 12 12 console, 13 - get_tsv_mode, 14 13 load_config, 15 14 print_error, 16 15 print_info, 17 16 print_success, 17 + get_tsv_mode, 18 18 ) 19 19 20 20 ··· 75 75 print_info(f"Total duplicates: {len(duplicates.duplicates)}") 76 76 77 77 78 - def add_duplicate( 79 - git_store: GitStore, duplicate_id: Optional[str], canonical_id: Optional[str] 80 - ) -> None: 78 + def add_duplicate(git_store: GitStore, duplicate_id: Optional[str], canonical_id: Optional[str]) -> None: 81 79 """Add a duplicate mapping.""" 82 80 if not duplicate_id: 83 81 print_error("Duplicate ID is required") ··· 126 124 # Remove the mapping 127 125 if git_store.remove_duplicate(duplicate_id): 128 126 # Commit changes 129 - git_store.commit_changes( 130 - f"Remove duplicate mapping: {duplicate_id} -> {canonical_id}" 131 - ) 127 + git_store.commit_changes(f"Remove duplicate mapping: {duplicate_id} -> {canonical_id}") 132 128 print_success(f"Removed duplicate mapping: {duplicate_id} -> {canonical_id}") 133 129 else: 134 130 print_error(f"Failed to remove duplicate mapping: {duplicate_id}")

+59

src/thicket/cli/commands/generate.py

··· 1 + """Generate static HTML website from thicket data.""" 2 + 3 + from pathlib import Path 4 + from typing import Optional 5 + 6 + import typer 7 + 8 + from ..main import app, console, load_thicket 9 + 10 + 11 + 12 + 13 + @app.command() 14 + def generate( 15 + output: Path = typer.Option( 16 + Path("./thicket-site"), 17 + "--output", 18 + "-o", 19 + help="Output directory for the generated website", 20 + ), 21 + template_dir: Optional[Path] = typer.Option( 22 + None, "--templates", help="Custom template directory" 23 + ), 24 + config_file: Optional[Path] = typer.Option( 25 + None, "--config", help="Configuration file path" 26 + ), 27 + ) -> None: 28 + """Generate a static HTML website from thicket data.""" 29 + 30 + try: 31 + # Load Thicket instance 32 + thicket = load_thicket(config_file) 33 + 34 + console.print(f"[blue]Generating static site to:[/blue] {output}") 35 + 36 + # Generate the complete site 37 + if thicket.generate_site(output, template_dir): 38 + console.print(f"[green]✓[/green] Successfully generated site at {output}") 39 + 40 + # Show what was generated 41 + stats = thicket.get_stats() 42 + console.print(f" • {stats.get('total_entries', 0)} entries") 43 + console.print(f" • {stats.get('total_users', 0)} users") 44 + console.print(f" • {stats.get('unique_urls', 0)} unique links") 45 + 46 + # List generated files 47 + if output.exists(): 48 + html_files = list(output.glob("*.html")) 49 + if html_files: 50 + console.print(" • Generated pages:") 51 + for html_file in sorted(html_files): 52 + console.print(f" - {html_file.name}") 53 + else: 54 + console.print("[red]✗[/red] Failed to generate site") 55 + raise typer.Exit(1) 56 + 57 + except Exception as e: 58 + console.print(f"[red]Error:[/red] {str(e)}") 59 + raise typer.Exit(1)
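Review note on the new `generate` command: `raise typer.Exit(1)` sits inside a `try` whose last handler is `except Exception`. Assuming `typer.Exit` derives from `RuntimeError` (as click's `Exit` does), the broad handler would also catch the exit request and print a second error line before exiting. The sketch below illustrates the effect with a stand-in class instead of Typer itself; `Exit`, `broad_handler`, and `guarded_handler` are hypothetical names.

```python
class Exit(RuntimeError):
    """Stand-in for typer.Exit; assumed here to be a RuntimeError subclass."""

    def __init__(self, code: int = 0) -> None:
        self.exit_code = code


def broad_handler(fail: bool) -> list[str]:
    """Mirrors the shape of the diff: the exit request is swallowed by `except Exception`."""
    messages: list[str] = []
    try:
        if fail:
            messages.append("failed")
            raise Exit(1)
    except Exception:
        messages.append("error")  # also runs for Exit, producing a second message
    return messages


def guarded_handler(fail: bool) -> list[str]:
    """Re-raises the exit request first, so only unexpected errors reach the broad handler."""
    messages: list[str] = []
    try:
        if fail:
            messages.append("failed")
            raise Exit(1)
    except Exit:
        raise
    except Exception:
        messages.append("error")
    return messages


print(broad_handler(True))
```

If the assumption holds, one option would be an `except typer.Exit: raise` clause ahead of the generic handler in `generate`; the same shape appears around `raise typer.Exit(0)` in `index_cmd.py`.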

+427

src/thicket/cli/commands/index_cmd.py

··· 1 + """CLI command for building reference index from blog entries.""" 2 + 3 + import json 4 + from pathlib import Path 5 + from typing import Optional 6 + 7 + import typer 8 + from rich.console import Console 9 + from rich.progress import ( 10 + BarColumn, 11 + Progress, 12 + SpinnerColumn, 13 + TaskProgressColumn, 14 + TextColumn, 15 + ) 16 + from rich.table import Table 17 + 18 + from ...core.git_store import GitStore 19 + from ...core.reference_parser import ReferenceIndex, ReferenceParser 20 + from ..main import app 21 + from ..utils import get_tsv_mode, load_config 22 + 23 + console = Console() 24 + 25 + 26 + @app.command() 27 + def index( 28 + config_file: Optional[Path] = typer.Option( 29 + None, 30 + "--config", 31 + "-c", 32 + help="Path to configuration file", 33 + ), 34 + output_file: Optional[Path] = typer.Option( 35 + None, 36 + "--output", 37 + "-o", 38 + help="Path to output index file (default: updates links.json in git store)", 39 + ), 40 + verbose: bool = typer.Option( 41 + False, 42 + "--verbose", 43 + "-v", 44 + help="Show detailed progress information", 45 + ), 46 + ) -> None: 47 + """Build a reference index showing which blog entries reference others. 48 + 49 + This command analyzes all blog entries to detect cross-references between 50 + different blogs, creating an index that can be used to build threaded 51 + views of related content. 52 + 53 + Updates the unified links.json file with reference data. 
54 + """ 55 + try: 56 + # Load configuration 57 + config = load_config(config_file) 58 + 59 + # Initialize Git store 60 + git_store = GitStore(config.git_store) 61 + 62 + # Initialize reference parser 63 + parser = ReferenceParser() 64 + 65 + # Build user domain mapping 66 + if verbose: 67 + console.print("Building user domain mapping...") 68 + user_domains = parser.build_user_domain_mapping(git_store) 69 + 70 + if verbose: 71 + console.print(f"Found {len(user_domains)} users with {sum(len(d) for d in user_domains.values())} total domains") 72 + 73 + # Initialize reference index 74 + ref_index = ReferenceIndex() 75 + ref_index.user_domains = user_domains 76 + 77 + # Get all users 78 + index = git_store._load_index() 79 + users = list(index.users.keys()) 80 + 81 + if not users: 82 + console.print("[yellow]No users found in Git store[/yellow]") 83 + raise typer.Exit(0) 84 + 85 + # Process all entries 86 + total_entries = 0 87 + total_references = 0 88 + all_references = [] 89 + 90 + with Progress( 91 + SpinnerColumn(), 92 + TextColumn("[progress.description]{task.description}"), 93 + BarColumn(), 94 + TaskProgressColumn(), 95 + console=console, 96 + ) as progress: 97 + 98 + # Count total entries first 99 + counting_task = progress.add_task("Counting entries...", total=len(users)) 100 + entry_counts = {} 101 + for username in users: 102 + entries = git_store.list_entries(username) 103 + entry_counts[username] = len(entries) 104 + total_entries += len(entries) 105 + progress.advance(counting_task) 106 + 107 + progress.remove_task(counting_task) 108 + 109 + # Process entries - extract references 110 + processing_task = progress.add_task( 111 + f"Extracting references from {total_entries} entries...", 112 + total=total_entries 113 + ) 114 + 115 + for username in users: 116 + entries = git_store.list_entries(username) 117 + 118 + for entry in entries: 119 + # Extract references from this entry 120 + references = parser.extract_references(entry, username, user_domains) 121 
+ all_references.extend(references) 122 + 123 + progress.advance(processing_task) 124 + 125 + if verbose and references: 126 + console.print(f" Found {len(references)} references in {username}:{entry.title[:50]}...") 127 + 128 + progress.remove_task(processing_task) 129 + 130 + # Resolve target_entry_ids for references 131 + if all_references: 132 + resolve_task = progress.add_task( 133 + f"Resolving {len(all_references)} references...", 134 + total=len(all_references) 135 + ) 136 + 137 + if verbose: 138 + console.print(f"Resolving target entry IDs for {len(all_references)} references...") 139 + 140 + resolved_references = parser.resolve_target_entry_ids(all_references, git_store) 141 + 142 + # Count resolved references 143 + resolved_count = sum(1 for ref in resolved_references if ref.target_entry_id is not None) 144 + if verbose: 145 + console.print(f"Resolved {resolved_count} out of {len(all_references)} references") 146 + 147 + # Add resolved references to index 148 + for ref in resolved_references: 149 + ref_index.add_reference(ref) 150 + total_references += 1 151 + progress.advance(resolve_task) 152 + 153 + progress.remove_task(resolve_task) 154 + 155 + # Determine output path 156 + if output_file: 157 + output_path = output_file 158 + else: 159 + output_path = config.git_store / "links.json" 160 + 161 + # Load existing links data or create new structure 162 + if output_path.exists() and not output_file: 163 + # Load existing unified structure 164 + with open(output_path) as f: 165 + existing_data = json.load(f) 166 + else: 167 + # Create new structure 168 + existing_data = { 169 + "links": {}, 170 + "reverse_mapping": {}, 171 + "user_domains": {} 172 + } 173 + 174 + # Update with reference data 175 + existing_data["references"] = ref_index.to_dict()["references"] 176 + existing_data["user_domains"] = {k: list(v) for k, v in user_domains.items()} 177 + 178 + # Save updated structure 179 + with open(output_path, "w") as f: 180 + json.dump(existing_data, f, 
indent=2, default=str) 181 + 182 + # Show summary 183 + if not get_tsv_mode(): 184 + console.print("\n[green]✓ Reference index built successfully[/green]") 185 + 186 + # Create summary table or TSV output 187 + if get_tsv_mode(): 188 + print("Metric\tCount") 189 + print(f"Total Users\t{len(users)}") 190 + print(f"Total Entries\t{total_entries}") 191 + print(f"Total References\t{total_references}") 192 + print(f"Outbound Refs\t{len(ref_index.outbound_refs)}") 193 + print(f"Inbound Refs\t{len(ref_index.inbound_refs)}") 194 + print(f"Output File\t{output_path}") 195 + else: 196 + table = Table(title="Reference Index Summary") 197 + table.add_column("Metric", style="cyan") 198 + table.add_column("Count", style="green") 199 + 200 + table.add_row("Total Users", str(len(users))) 201 + table.add_row("Total Entries", str(total_entries)) 202 + table.add_row("Total References", str(total_references)) 203 + table.add_row("Outbound Refs", str(len(ref_index.outbound_refs))) 204 + table.add_row("Inbound Refs", str(len(ref_index.inbound_refs))) 205 + table.add_row("Output File", str(output_path)) 206 + 207 + console.print(table) 208 + 209 + # Show some interesting statistics 210 + if total_references > 0: 211 + if not get_tsv_mode(): 212 + console.print("\n[bold]Reference Statistics:[/bold]") 213 + 214 + # Most referenced users 215 + target_counts = {} 216 + unresolved_domains = set() 217 + 218 + for ref in ref_index.references: 219 + if ref.target_username: 220 + target_counts[ref.target_username] = target_counts.get(ref.target_username, 0) + 1 221 + else: 222 + # Track unresolved domains 223 + from urllib.parse import urlparse 224 + domain = urlparse(ref.target_url).netloc.lower() 225 + unresolved_domains.add(domain) 226 + 227 + if target_counts: 228 + if get_tsv_mode(): 229 + print("Referenced User\tReference Count") 230 + for username, count in sorted(target_counts.items(), key=lambda x: x[1], reverse=True)[:5]: 231 + print(f"{username}\t{count}") 232 + else: 233 + 
console.print("\nMost referenced users:") 234 + for username, count in sorted(target_counts.items(), key=lambda x: x[1], reverse=True)[:5]: 235 + console.print(f" {username}: {count} references") 236 + 237 + if unresolved_domains and verbose: 238 + if get_tsv_mode(): 239 + print("Unresolved Domain\tCount") 240 + for domain in sorted(list(unresolved_domains)[:10]): 241 + print(f"{domain}\t1") 242 + if len(unresolved_domains) > 10: 243 + print(f"... and {len(unresolved_domains) - 10} more\t...") 244 + else: 245 + console.print(f"\nUnresolved domains: {len(unresolved_domains)}") 246 + for domain in sorted(list(unresolved_domains)[:10]): 247 + console.print(f" {domain}") 248 + if len(unresolved_domains) > 10: 249 + console.print(f" ... and {len(unresolved_domains) - 10} more") 250 + 251 + except Exception as e: 252 + console.print(f"[red]Error building reference index: {e}[/red]") 253 + if verbose: 254 + console.print_exception() 255 + raise typer.Exit(1) 256 + 257 + 258 + @app.command() 259 + def threads( 260 + config_file: Optional[Path] = typer.Option( 261 + None, 262 + "--config", 263 + "-c", 264 + help="Path to configuration file", 265 + ), 266 + index_file: Optional[Path] = typer.Option( 267 + None, 268 + "--index", 269 + "-i", 270 + help="Path to reference index file (default: links.json in git store)", 271 + ), 272 + username: Optional[str] = typer.Option( 273 + None, 274 + "--username", 275 + "-u", 276 + help="Show threads for specific username only", 277 + ), 278 + entry_id: Optional[str] = typer.Option( 279 + None, 280 + "--entry", 281 + "-e", 282 + help="Show thread for specific entry ID", 283 + ), 284 + min_size: int = typer.Option( 285 + 2, 286 + "--min-size", 287 + "-m", 288 + help="Minimum thread size to display", 289 + ), 290 + ) -> None: 291 + """Show threaded view of related blog entries. 
292 + 293 + This command uses the reference index to show which blog entries 294 + are connected through cross-references, creating an email-style 295 + threaded view of the conversation. 296 + 297 + Reads reference data from the unified links.json file. 298 + """ 299 + try: 300 + # Load configuration 301 + config = load_config(config_file) 302 + 303 + # Determine index file path 304 + if index_file: 305 + index_path = index_file 306 + else: 307 + index_path = config.git_store / "links.json" 308 + 309 + if not index_path.exists(): 310 + console.print(f"[red]Links file not found: {index_path}[/red]") 311 + console.print("Run 'thicket links' and 'thicket index' first to build the reference index") 312 + raise typer.Exit(1) 313 + 314 + # Load unified data 315 + with open(index_path) as f: 316 + unified_data = json.load(f) 317 + 318 + # Check if references exist in the unified structure 319 + if "references" not in unified_data: 320 + console.print(f"[red]No references found in {index_path}[/red]") 321 + console.print("Run 'thicket index' first to build the reference index") 322 + raise typer.Exit(1) 323 + 324 + # Extract reference data and reconstruct ReferenceIndex 325 + ref_index = ReferenceIndex.from_dict({ 326 + "references": unified_data["references"], 327 + "user_domains": unified_data.get("user_domains", {}) 328 + }) 329 + 330 + # Initialize Git store to get entry details 331 + git_store = GitStore(config.git_store) 332 + 333 + if entry_id and username: 334 + # Show specific thread 335 + thread_members = ref_index.get_thread_members(username, entry_id) 336 + _display_thread(thread_members, ref_index, git_store, f"Thread for {username}:{entry_id}") 337 + 338 + elif username: 339 + # Show all threads involving this user 340 + user_index = git_store._load_index() 341 + user = user_index.get_user(username) 342 + if not user: 343 + console.print(f"[red]User not found: {username}[/red]") 344 + raise typer.Exit(1) 345 + 346 + entries = git_store.list_entries(username) 
347 + threads_found = set() 348 + 349 + console.print(f"[bold]Threads involving {username}:[/bold]\n") 350 + 351 + for entry in entries: 352 + thread_members = ref_index.get_thread_members(username, entry.id) 353 + if len(thread_members) >= min_size: 354 + thread_key = tuple(sorted(thread_members)) 355 + if thread_key not in threads_found: 356 + threads_found.add(thread_key) 357 + _display_thread(thread_members, ref_index, git_store, f"Thread #{len(threads_found)}") 358 + 359 + else: 360 + # Show all threads 361 + console.print("[bold]All conversation threads:[/bold]\n") 362 + 363 + all_threads = set() 364 + processed_entries = set() 365 + 366 + # Get all entries 367 + user_index = git_store._load_index() 368 + for username in user_index.users.keys(): 369 + entries = git_store.list_entries(username) 370 + for entry in entries: 371 + entry_key = (username, entry.id) 372 + if entry_key in processed_entries: 373 + continue 374 + 375 + thread_members = ref_index.get_thread_members(username, entry.id) 376 + if len(thread_members) >= min_size: 377 + thread_key = tuple(sorted(thread_members)) 378 + if thread_key not in all_threads: 379 + all_threads.add(thread_key) 380 + _display_thread(thread_members, ref_index, git_store, f"Thread #{len(all_threads)}") 381 + 382 + # Mark all members as processed 383 + for member in thread_members: 384 + processed_entries.add(member) 385 + 386 + if not all_threads: 387 + console.print("[yellow]No conversation threads found[/yellow]") 388 + console.print(f"(minimum thread size: {min_size})") 389 + 390 + except Exception as e: 391 + console.print(f"[red]Error showing threads: {e}[/red]") 392 + raise typer.Exit(1) 393 + 394 + 395 + def _display_thread(thread_members, ref_index, git_store, title): 396 + """Display a single conversation thread.""" 397 + console.print(f"[bold cyan]{title}[/bold cyan]") 398 + console.print(f"Thread size: {len(thread_members)} entries") 399 + 400 + # Get entry details for each member 401 + thread_entries = [] 
402 + for username, entry_id in thread_members: 403 + entry = git_store.get_entry(username, entry_id) 404 + if entry: 405 + thread_entries.append((username, entry)) 406 + 407 + # Sort by publication date 408 + thread_entries.sort(key=lambda x: x[1].published or x[1].updated) 409 + 410 + # Display entries 411 + for i, (username, entry) in enumerate(thread_entries): 412 + prefix = "├─" if i < len(thread_entries) - 1 else "└─" 413 + 414 + # Get references for this entry 415 + outbound = ref_index.get_outbound_refs(username, entry.id) 416 + inbound = ref_index.get_inbound_refs(username, entry.id) 417 + 418 + ref_info = "" 419 + if outbound or inbound: 420 + ref_info = f" ({len(outbound)} out, {len(inbound)} in)" 421 + 422 + console.print(f" {prefix} [{username}] {entry.title[:60]}...{ref_info}") 423 + 424 + if entry.published: 425 + console.print(f" Published: {entry.published.strftime('%Y-%m-%d')}") 426 + 427 + console.print() # Empty line after each thread
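
The `threads` command above relies on `ReferenceIndex.get_thread_members`, whose body is not part of this diff. As a rough sketch only, and assuming a thread is the set of entries connected by references in either direction, the grouping could be computed as a connected-component search. The function name `thread_members` and the `(source, target)` pair representation are illustrative assumptions, not the project's actual API.

```python
from collections import defaultdict, deque

# An entry is identified by (username, entry_id), matching the tuples
# that _display_thread unpacks in the diff above.
EntryKey = tuple[str, str]


def thread_members(references, start):
    """Return every entry reachable from `start` via references.

    Hypothetical helper: `references` is an iterable of (source, target)
    EntryKey pairs. Direction is ignored, so a reply and the post it
    cites end up in the same thread.
    """
    neighbours = defaultdict(set)
    for source, target in references:
        neighbours[source].add(target)
        neighbours[target].add(source)

    # Breadth-first search over the undirected reference graph.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen


refs = [
    (("alice", "a1"), ("bob", "b1")),
    (("bob", "b1"), ("carol", "c1")),
    (("dave", "d1"), ("erin", "e1")),
]
thread = thread_members(refs, ("alice", "a1"))
# Sorting the members gives a stable key, which is how the command
# avoids displaying the same thread twice.
thread_key = tuple(sorted(thread))
```

An entry with no references forms a thread of size one, which the command's default `--min-size` of 2 would filter out.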

+119 -106

src/thicket/cli/commands/info_cmd.py

··· 1 1 """CLI command for displaying detailed information about a specific atom entry.""" 2 2 3 + import json 3 4 from pathlib import Path 4 5 from typing import Optional 5 6 ··· 7 8 from rich.console import Console 8 9 from rich.panel import Panel 9 10 from rich.table import Table 11 + from rich.text import Text 10 12 11 13 from ...core.git_store import GitStore 14 + from ...core.reference_parser import ReferenceIndex 12 15 from ..main import app 13 - from ..utils import get_tsv_mode, load_config 16 + from ..utils import load_config, get_tsv_mode 14 17 15 18 console = Console() 16 19 ··· 18 21 @app.command() 19 22 def info( 20 23 identifier: str = typer.Argument( 21 - ..., help="The atom ID or URL of the entry to display information about" 24 + ..., 25 + help="The atom ID or URL of the entry to display information about" 22 26 ), 23 27 username: Optional[str] = typer.Option( 24 28 None, 25 29 "--username", 26 30 "-u", 27 - help="Username to search for the entry (if not provided, searches all users)", 31 + help="Username to search for the entry (if not provided, searches all users)" 28 32 ), 29 33 config_file: Optional[Path] = typer.Option( 30 34 Path("thicket.yaml"), ··· 33 37 help="Path to configuration file", 34 38 ), 35 39 show_content: bool = typer.Option( 36 - False, "--content", help="Include the full content of the entry in the output" 40 + False, 41 + "--content", 42 + help="Include the full content of the entry in the output" 37 43 ), 38 44 ) -> None: 39 45 """Display detailed information about a specific atom entry. 40 - 46 + 41 47 You can specify the entry using either its atom ID or URL. 42 48 Shows all metadata for the given entry, including title, dates, categories, 43 49 and summarizes all inbound and outbound links to/from other posts. 
··· 45 51 try: 46 52 # Load configuration 47 53 config = load_config(config_file) 48 - 54 + 49 55 # Initialize Git store 50 56 git_store = GitStore(config.git_store) 51 - 57 + 52 58 # Find the entry 53 59 entry = None 54 60 found_username = None 55 - 61 + 56 62 # Check if identifier looks like a URL 57 - is_url = identifier.startswith(("http://", "https://")) 58 - 63 + is_url = identifier.startswith(('http://', 'https://')) 64 + 59 65 if username: 60 66 # Search specific username 61 67 if is_url: ··· 91 97 if entry: 92 98 found_username = user 93 99 break 94 - 100 + 95 101 if not entry or not found_username: 96 102 if username: 97 - console.print( 98 - f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found for user '{username}'[/red]" 99 - ) 103 + console.print(f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found for user '{username}'[/red]") 100 104 else: 101 - console.print( 102 - f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found in any user's entries[/red]" 103 - ) 105 + console.print(f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found in any user's entries[/red]") 104 106 raise typer.Exit(1) 105 - 107 + 108 + # Load reference index if available 109 + links_path = config.git_store / "links.json" 110 + ref_index = None 111 + if links_path.exists(): 112 + with open(links_path) as f: 113 + unified_data = json.load(f) 114 + 115 + # Check if references exist in the unified structure 116 + if "references" in unified_data: 117 + ref_index = ReferenceIndex.from_dict({ 118 + "references": unified_data["references"], 119 + "user_domains": unified_data.get("user_domains", {}) 120 + }) 121 + 106 122 # Display information 107 123 if get_tsv_mode(): 108 - _display_entry_info_tsv(entry, found_username, show_content) 124 + _display_entry_info_tsv(entry, found_username, ref_index, show_content) 109 125 else: 110 126 _display_entry_info(entry, found_username) 111 - 112 - # Display links 
and backlinks from entry fields 113 - _display_link_info(entry, found_username, git_store) 114 - 127 + 128 + if ref_index: 129 + _display_link_info(entry, found_username, ref_index) 130 + else: 131 + console.print("\n[yellow]No reference index found. Run 'thicket links' and 'thicket index' to build cross-reference data.[/yellow]") 132 + 115 133 # Optionally display content 116 134 if show_content and entry.content: 117 135 _display_content(entry.content) 118 - 136 + 119 137 except Exception as e: 120 138 console.print(f"[red]Error displaying entry info: {e}[/red]") 121 - raise typer.Exit(1) from e 139 + raise typer.Exit(1) 122 140 123 141 124 142 def _display_entry_info(entry, username: str) -> None: 125 143 """Display basic entry information in a structured format.""" 126 - 144 + 127 145 # Create main info panel 128 146 info_table = Table.grid(padding=(0, 2)) 129 147 info_table.add_column("Field", style="cyan bold", width=15) 130 148 info_table.add_column("Value", style="white") 131 - 149 + 132 150 info_table.add_row("User", f"[green]{username}[/green]") 133 151 info_table.add_row("Atom ID", f"[blue]{entry.id}[/blue]") 134 152 info_table.add_row("Title", entry.title) 135 153 info_table.add_row("Link", str(entry.link)) 136 - 154 + 137 155 if entry.published: 138 - info_table.add_row( 139 - "Published", entry.published.strftime("%Y-%m-%d %H:%M:%S UTC") 140 - ) 141 - 156 + info_table.add_row("Published", entry.published.strftime("%Y-%m-%d %H:%M:%S UTC")) 157 + 142 158 info_table.add_row("Updated", entry.updated.strftime("%Y-%m-%d %H:%M:%S UTC")) 143 - 159 + 144 160 if entry.summary: 145 161 # Truncate long summaries 146 - summary = ( 147 - entry.summary[:200] + "..." if len(entry.summary) > 200 else entry.summary 148 - ) 162 + summary = entry.summary[:200] + "..." 
if len(entry.summary) > 200 else entry.summary 149 163 info_table.add_row("Summary", summary) 150 - 164 + 151 165 if entry.categories: 152 166 categories_text = ", ".join(entry.categories) 153 167 info_table.add_row("Categories", categories_text) 154 - 168 + 155 169 if entry.author: 156 170 author_info = [] 157 171 if "name" in entry.author: ··· 160 174 author_info.append(f"<{entry.author['email']}>") 161 175 if author_info: 162 176 info_table.add_row("Author", " ".join(author_info)) 163 - 177 + 164 178 if entry.content_type: 165 179 info_table.add_row("Content Type", entry.content_type) 166 - 180 + 167 181 if entry.rights: 168 182 info_table.add_row("Rights", entry.rights) 169 - 183 + 170 184 if entry.source: 171 185 info_table.add_row("Source Feed", entry.source) 172 - 186 + 173 187 panel = Panel( 174 - info_table, title="[bold]Entry Information[/bold]", border_style="blue" 188 + info_table, 189 + title=f"[bold]Entry Information[/bold]", 190 + border_style="blue" 175 191 ) 176 - 192 + 177 193 console.print(panel) 178 194 179 195 180 - def _display_link_info(entry, username: str, git_store: GitStore) -> None: 196 + def _display_link_info(entry, username: str, ref_index: ReferenceIndex) -> None: 181 197 """Display inbound and outbound link information.""" 182 - 183 - # Get links from entry fields 184 - outbound_links = getattr(entry, "links", []) 185 - backlinks = getattr(entry, "backlinks", []) 186 - 187 - if not outbound_links and not backlinks: 198 + 199 + # Get links 200 + outbound_refs = ref_index.get_outbound_refs(username, entry.id) 201 + inbound_refs = ref_index.get_inbound_refs(username, entry.id) 202 + 203 + if not outbound_refs and not inbound_refs: 188 204 console.print("\n[dim]No cross-references found for this entry.[/dim]") 189 205 return 190 - 206 + 191 207 # Create links table 192 208 links_table = Table(title="Cross-References") 193 209 links_table.add_column("Direction", style="cyan", width=10) 194 - links_table.add_column("Target/Source", 
style="green", width=30) 195 - links_table.add_column("URL/ID", style="blue", width=60) 196 - 197 - # Add outbound links 198 - for link in outbound_links: 199 - links_table.add_row("→ Out", "External/Other", link) 200 - 201 - # Add backlinks (inbound references) 202 - for backlink_id in backlinks: 203 - # Try to find which user this entry belongs to 204 - source_info = backlink_id 205 - # Could enhance this by looking up the actual entry to get username 206 - links_table.add_row("← In", "Entry", source_info) 207 - 210 + links_table.add_column("Target/Source", style="green", width=20) 211 + links_table.add_column("URL", style="blue", width=50) 212 + 213 + # Add outbound references 214 + for ref in outbound_refs: 215 + target_info = f"{ref.target_username}:{ref.target_entry_id}" if ref.target_username and ref.target_entry_id else "External" 216 + links_table.add_row("→ Out", target_info, ref.target_url) 217 + 218 + # Add inbound references 219 + for ref in inbound_refs: 220 + source_info = f"{ref.source_username}:{ref.source_entry_id}" 221 + links_table.add_row("← In", source_info, ref.target_url) 222 + 208 223 console.print() 209 224 console.print(links_table) 210 - 225 + 211 226 # Summary 212 - console.print( 213 - f"\n[bold]Summary:[/bold] {len(outbound_links)} outbound links, {len(backlinks)} inbound backlinks" 214 - ) 227 + console.print(f"\n[bold]Summary:[/bold] {len(outbound_refs)} outbound, {len(inbound_refs)} inbound references") 215 228 216 229 217 230 def _display_content(content: str) -> None: 218 231 """Display the full content of the entry.""" 219 - 232 + 220 233 # Truncate very long content 221 234 display_content = content 222 235 if len(content) > 5000: 223 236 display_content = content[:5000] + "\n\n[... 
content truncated ...]" 224 - 237 + 225 238 panel = Panel( 226 239 display_content, 227 240 title="[bold]Entry Content[/bold]", 228 241 border_style="green", 229 - expand=False, 242 + expand=False 230 243 ) 231 - 244 + 232 245 console.print() 233 246 console.print(panel) 234 247 235 248 236 - def _display_entry_info_tsv(entry, username: str, show_content: bool) -> None: 249 + def _display_entry_info_tsv(entry, username: str, ref_index: Optional[ReferenceIndex], show_content: bool) -> None: 237 250 """Display entry information in TSV format.""" 238 - 251 + 239 252 # Basic info 240 253 print("Field\tValue") 241 254 print(f"User\t{username}") 242 255 print(f"Atom ID\t{entry.id}") 243 - print( 244 - f"Title\t{entry.title.replace(chr(9), ' ').replace(chr(10), ' ').replace(chr(13), ' ')}" 245 - ) 256 + print(f"Title\t{entry.title.replace(chr(9), ' ').replace(chr(10), ' ').replace(chr(13), ' ')}") 246 257 print(f"Link\t{entry.link}") 247 - 258 + 248 259 if entry.published: 249 260 print(f"Published\t{entry.published.strftime('%Y-%m-%d %H:%M:%S UTC')}") 250 - 261 + 251 262 print(f"Updated\t{entry.updated.strftime('%Y-%m-%d %H:%M:%S UTC')}") 252 - 263 + 253 264 if entry.summary: 254 265 # Escape tabs and newlines in summary 255 - summary = entry.summary.replace("\t", " ").replace("\n", " ").replace("\r", " ") 266 + summary = entry.summary.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ') 256 267 print(f"Summary\t{summary}") 257 - 268 + 258 269 if entry.categories: 259 270 print(f"Categories\t{', '.join(entry.categories)}") 260 - 271 + 261 272 if entry.author: 262 273 author_info = [] 263 274 if "name" in entry.author: ··· 266 277 author_info.append(f"<{entry.author['email']}>") 267 278 if author_info: 268 279 print(f"Author\t{' '.join(author_info)}") 269 - 280 + 270 281 if entry.content_type: 271 282 print(f"Content Type\t{entry.content_type}") 272 - 283 + 273 284 if entry.rights: 274 285 print(f"Rights\t{entry.rights}") 275 - 286 + 276 287 if entry.source: 277 288 
print(f"Source Feed\t{entry.source}") 278 - 279 - # Add links info from entry fields 280 - outbound_links = getattr(entry, "links", []) 281 - backlinks = getattr(entry, "backlinks", []) 282 - 283 - if outbound_links or backlinks: 284 - print(f"Outbound Links\t{len(outbound_links)}") 285 - print(f"Backlinks\t{len(backlinks)}") 286 - 287 - # Show each link 288 - for link in outbound_links: 289 - print(f"→ Link\t{link}") 290 - 291 - for backlink_id in backlinks: 292 - print(f"← Backlink\t{backlink_id}") 293 - 289 + 290 + # Add reference info if available 291 + if ref_index: 292 + outbound_refs = ref_index.get_outbound_refs(username, entry.id) 293 + inbound_refs = ref_index.get_inbound_refs(username, entry.id) 294 + 295 + print(f"Outbound References\t{len(outbound_refs)}") 296 + print(f"Inbound References\t{len(inbound_refs)}") 297 + 298 + # Show each reference 299 + for ref in outbound_refs: 300 + target_info = f"{ref.target_username}:{ref.target_entry_id}" if ref.target_username and ref.target_entry_id else "External" 301 + print(f"Outbound Reference\t{target_info}\t{ref.target_url}") 302 + 303 + for ref in inbound_refs: 304 + source_info = f"{ref.source_username}:{ref.source_entry_id}" 305 + print(f"Inbound Reference\t{source_info}\t{ref.target_url}") 306 + 294 307 # Show content if requested 295 308 if show_content and entry.content: 296 309 # Escape tabs and newlines in content 297 - content = entry.content.replace("\t", " ").replace("\n", " ").replace("\r", " ") 298 - print(f"Content\t{content}") 310 + content = entry.content.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ') 311 + print(f"Content\t{content}")
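
Both `info` and `threads` read the `references` section out of the unified `links.json` and fall back gracefully when it is missing. A minimal sketch of that guard, pulled into one helper, might look like the following. The name `load_reference_section` is hypothetical, and the returned dictionary is simply the shape the diff passes to `ReferenceIndex.from_dict`.

```python
import json
from pathlib import Path
from typing import Optional


def load_reference_section(links_path: Path) -> Optional[dict]:
    """Return the reference data from a unified links file, or None.

    Hypothetical helper mirroring the checks in the diff: the file may
    not exist yet, or may exist without a "references" key if only the
    link-extraction step has been run.
    """
    if not links_path.exists():
        return None

    with open(links_path) as f:
        unified_data = json.load(f)

    if "references" not in unified_data:
        return None

    return {
        "references": unified_data["references"],
        # user_domains is optional, so default to an empty mapping.
        "user_domains": unified_data.get("user_domains", {}),
    }
```

A caller would treat `None` as "no reference index available" and print the hint about running the indexing commands, as `info` does above.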

+51 -39

src/thicket/cli/commands/init.py

··· 1 1 """Initialize command for thicket.""" 2 2 3 + import yaml 3 4 from pathlib import Path 4 5 from typing import Optional 5 6 6 7 import typer 7 - from pydantic import ValidationError 8 8 9 - from ...core.git_store import GitStore 9 + from ..main import app, console, get_config_path 10 10 from ...models import ThicketConfig 11 - from ..main import app 12 - from ..utils import print_error, print_success, save_config 11 + from ... import Thicket 13 12 14 13 15 14 @app.command() 16 15 def init( 17 - git_store: Path = typer.Argument( 18 - ..., help="Path to Git repository for storing feeds" 19 - ), 16 + git_store: Path = typer.Argument(..., help="Path to Git repository for storing feeds"), 20 17 cache_dir: Optional[Path] = typer.Option( 21 18 None, "--cache-dir", "-c", help="Cache directory (default: ~/.cache/thicket)" 22 19 ), 23 20 config_file: Optional[Path] = typer.Option( 24 - None, "--config", help="Configuration file path (default: thicket.yaml)" 21 + None, "--config", help="Configuration file path (default: ~/.config/thicket/config.yaml)" 25 22 ), 26 23 force: bool = typer.Option( 27 24 False, "--force", "-f", help="Overwrite existing configuration" ··· 31 28 32 29 # Set default paths 33 30 if cache_dir is None: 34 - from platformdirs import user_cache_dir 35 - 36 - cache_dir = Path(user_cache_dir("thicket")) 31 + cache_dir = Path.home() / ".cache" / "thicket" 37 32 38 33 if config_file is None: 39 - config_file = Path("thicket.yaml") 34 + config_file = get_config_path() 40 35 41 36 # Check if config already exists 42 37 if config_file.exists() and not force: 43 - print_error(f"Configuration file already exists: {config_file}") 44 - print_error("Use --force to overwrite") 38 + console.print(f"[red]Configuration file already exists:[/red] {config_file}") 39 + console.print("Use --force to overwrite") 45 40 raise typer.Exit(1) 46 41 47 - # Create cache directory 48 - cache_dir.mkdir(parents=True, exist_ok=True) 49 - 50 - # Create Git store 51 42 try: 52 - 
GitStore(git_store) 53 - print_success(f"Initialized Git store at: {git_store}") 54 - except Exception as e: 55 - print_error(f"Failed to initialize Git store: {e}") 56 - raise typer.Exit(1) from e 43 + # Create directories 44 + git_store.mkdir(parents=True, exist_ok=True) 45 + cache_dir.mkdir(parents=True, exist_ok=True) 46 + config_file.parent.mkdir(parents=True, exist_ok=True) 57 47 58 - # Create configuration 59 - try: 60 - config = ThicketConfig(git_store=git_store, cache_dir=cache_dir, users=[]) 48 + # Create Thicket instance with minimal config 49 + thicket = Thicket.create(git_store, cache_dir) 50 + 51 + # Initialize the repository 52 + if thicket.init_repository(): 53 + console.print(f"[green]✓[/green] Initialized Git store at: {git_store}") 54 + else: 55 + console.print(f"[red]✗[/red] Failed to initialize Git store") 56 + raise typer.Exit(1) 61 57 62 - save_config(config, config_file) 63 - print_success(f"Created configuration file: {config_file}") 58 + # Save configuration 59 + config_data = { 60 + 'git_store': str(git_store), 61 + 'cache_dir': str(cache_dir), 62 + 'users': [] 63 + } 64 + 65 + with open(config_file, 'w') as f: 66 + yaml.dump(config_data, f, default_flow_style=False) 67 + 68 + console.print(f"[green]✓[/green] Created configuration file: {config_file}") 69 + 70 + # Create initial commit 71 + if thicket.commit_changes("Initialize thicket repository"): 72 + console.print("[green]✓[/green] Created initial commit") 73 + 74 + console.print("\n[green]Thicket initialized successfully![/green]") 75 + console.print(f" • Git store: {git_store}") 76 + console.print(f" • Cache directory: {cache_dir}") 77 + console.print(f" • Configuration: {config_file}") 78 + console.print("\n[blue]Next steps:[/blue]") 79 + console.print(" 1. Add your first user and feed:") 80 + console.print(f" [cyan]thicket add username https://example.com/feed.xml[/cyan]") 81 + console.print(" 2. 
Sync feeds:") 82 + console.print(f" [cyan]thicket sync[/cyan]") 83 + console.print(" 3. Generate a website:") 84 + console.print(f" [cyan]thicket generate[/cyan]") 64 85 65 - except ValidationError as e: 66 - print_error(f"Invalid configuration: {e}") 67 - raise typer.Exit(1) from e 68 86 except Exception as e: 69 - print_error(f"Failed to create configuration: {e}") 70 - raise typer.Exit(1) from e 71 - 72 - print_success("Thicket initialized successfully!") 73 - print_success(f"Git store: {git_store}") 74 - print_success(f"Cache directory: {cache_dir}") 75 - print_success(f"Configuration: {config_file}") 76 - print_success("Run 'thicket add user' to add your first user and feed.") 87 + console.print(f"[red]Error:[/red] {str(e)}") 88 + raise typer.Exit(1)
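
The rewritten `init` command resolves default paths, refuses to overwrite an existing configuration unless `--force` is given, and then creates the directories it needs. The sketch below isolates just that sequence using the standard library. `prepare_init_paths` and its `home` parameter are hypothetical (the latter is there only to make the function easy to exercise), and the default config location is taken from the option's help text rather than from `get_config_path`, which this diff does not show.

```python
from pathlib import Path
from typing import Optional


def prepare_init_paths(
    git_store: Path,
    cache_dir: Optional[Path],
    config_file: Optional[Path],
    force: bool,
    home: Path,
) -> tuple[Path, Path]:
    """Resolve defaults, guard against overwrites, create directories.

    Hypothetical helper; returns the resolved (cache_dir, config_file).
    """
    if cache_dir is None:
        cache_dir = home / ".cache" / "thicket"
    if config_file is None:
        # Assumed default, based on the --config help text in the diff.
        config_file = home / ".config" / "thicket" / "config.yaml"

    # Fail before touching the filesystem if a config already exists.
    if config_file.exists() and not force:
        raise FileExistsError(f"Configuration file already exists: {config_file}")

    for directory in (git_store, cache_dir, config_file.parent):
        directory.mkdir(parents=True, exist_ok=True)

    return cache_dir, config_file
```

Checking for the existing file before creating any directory keeps a refused run free of side effects.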

+416

src/thicket/cli/commands/links_cmd.py

··· 1 + """CLI command for extracting and categorizing all outbound links from blog entries.""" 2 + 3 + import json 4 + import re 5 + from pathlib import Path 6 + from typing import Dict, List, Optional, Set 7 + from urllib.parse import urljoin, urlparse 8 + 9 + import typer 10 + from rich.console import Console 11 + from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskProgressColumn 12 + from rich.table import Table 13 + 14 + from ...core.git_store import GitStore 15 + from ..main import app 16 + from ..utils import load_config, get_tsv_mode 17 + 18 + console = Console() 19 + 20 + 21 + class LinkData: 22 + """Represents a link found in a blog entry.""" 23 + 24 + def __init__(self, url: str, entry_id: str, username: str): 25 + self.url = url 26 + self.entry_id = entry_id 27 + self.username = username 28 + 29 + def to_dict(self) -> dict: 30 + """Convert to dictionary for JSON serialization.""" 31 + return { 32 + "url": self.url, 33 + "entry_id": self.entry_id, 34 + "username": self.username 35 + } 36 + 37 + @classmethod 38 + def from_dict(cls, data: dict) -> "LinkData": 39 + """Create from dictionary.""" 40 + return cls( 41 + url=data["url"], 42 + entry_id=data["entry_id"], 43 + username=data["username"] 44 + ) 45 + 46 + 47 + class LinkCategorizer: 48 + """Categorizes links as internal, user, or unknown.""" 49 + 50 + def __init__(self, user_domains: Dict[str, Set[str]]): 51 + self.user_domains = user_domains 52 + # Create reverse mapping of domain -> username 53 + self.domain_to_user = {} 54 + for username, domains in user_domains.items(): 55 + for domain in domains: 56 + self.domain_to_user[domain] = username 57 + 58 + def categorize_url(self, url: str, source_username: str) -> tuple[str, Optional[str]]: 59 + """ 60 + Categorize a URL as 'internal', 'user', or 'unknown'. 61 + Returns (category, target_username). 
62 + """ 63 + try: 64 + parsed = urlparse(url) 65 + domain = parsed.netloc.lower() 66 + 67 + # Check if it's a link to the same user's domain (internal) 68 + if domain in self.user_domains.get(source_username, set()): 69 + return "internal", source_username 70 + 71 + # Check if it's a link to another user's domain 72 + if domain in self.domain_to_user: 73 + return "user", self.domain_to_user[domain] 74 + 75 + # Everything else is unknown 76 + return "unknown", None 77 + 78 + except Exception: 79 + return "unknown", None 80 + 81 + 82 + class LinkExtractor: 83 + """Extracts and resolves links from blog entries.""" 84 + 85 + def __init__(self): 86 + # Pattern for extracting links from HTML 87 + self.link_pattern = re.compile(r'<a[^>]+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL) 88 + self.url_pattern = re.compile(r'https?://[^\s<>"]+') 89 + 90 + def extract_links_from_html(self, html_content: str, base_url: str) -> List[tuple[str, str]]: 91 + """Extract all links from HTML content and resolve them against base URL.""" 92 + links = [] 93 + 94 + # Extract links from <a> tags 95 + for match in self.link_pattern.finditer(html_content): 96 + url = match.group(1) 97 + text = re.sub(r'<[^>]+>', '', match.group(2)).strip() # Remove HTML tags from link text 98 + 99 + # Resolve relative URLs against base URL 100 + resolved_url = urljoin(base_url, url) 101 + links.append((resolved_url, text)) 102 + 103 + return links 104 + 105 + 106 + def extract_links_from_entry(self, entry, username: str, base_url: str) -> List[LinkData]: 107 + """Extract all links from a blog entry.""" 108 + links = [] 109 + 110 + # Combine all text content for analysis 111 + content_to_search = [] 112 + if entry.content: 113 + content_to_search.append(entry.content) 114 + if entry.summary: 115 + content_to_search.append(entry.summary) 116 + 117 + for content in content_to_search: 118 + extracted_links = self.extract_links_from_html(content, base_url) 119 + 120 + for url, link_text in 
extracted_links: 121 + # Skip empty URLs 122 + if not url or url.startswith('#'): 123 + continue 124 + 125 + link_data = LinkData( 126 + url=url, 127 + entry_id=entry.id, 128 + username=username 129 + ) 130 + 131 + links.append(link_data) 132 + 133 + return links 134 + 135 + 136 + @app.command() 137 + def links( 138 + config_file: Optional[Path] = typer.Option( 139 + Path("thicket.yaml"), 140 + "--config", 141 + "-c", 142 + help="Path to configuration file", 143 + ), 144 + output_file: Optional[Path] = typer.Option( 145 + None, 146 + "--output", 147 + "-o", 148 + help="Path to output unified links file (default: links.json in git store)", 149 + ), 150 + verbose: bool = typer.Option( 151 + False, 152 + "--verbose", 153 + "-v", 154 + help="Show detailed progress information", 155 + ), 156 + ) -> None: 157 + """Extract and categorize all outbound links from blog entries. 158 + 159 + This command analyzes all blog entries to extract outbound links, 160 + resolve them properly with respect to the feed's base URL, and 161 + categorize them as internal, user, or unknown links. 162 + 163 + Creates a unified links.json file containing all link data. 
164 + """ 165 + try: 166 + # Load configuration 167 + config = load_config(config_file) 168 + 169 + # Initialize Git store 170 + git_store = GitStore(config.git_store) 171 + 172 + # Build user domain mapping 173 + if verbose: 174 + console.print("Building user domain mapping...") 175 + 176 + index = git_store._load_index() 177 + user_domains = {} 178 + 179 + for username, user_metadata in index.users.items(): 180 + domains = set() 181 + 182 + # Add domains from feeds 183 + for feed_url in user_metadata.feeds: 184 + domain = urlparse(feed_url).netloc.lower() 185 + if domain: 186 + domains.add(domain) 187 + 188 + # Add domain from homepage 189 + if user_metadata.homepage: 190 + domain = urlparse(str(user_metadata.homepage)).netloc.lower() 191 + if domain: 192 + domains.add(domain) 193 + 194 + user_domains[username] = domains 195 + 196 + if verbose: 197 + console.print(f"Found {len(user_domains)} users with {sum(len(d) for d in user_domains.values())} total domains") 198 + 199 + # Initialize components 200 + link_extractor = LinkExtractor() 201 + categorizer = LinkCategorizer(user_domains) 202 + 203 + # Get all users 204 + users = list(index.users.keys()) 205 + 206 + if not users: 207 + console.print("[yellow]No users found in Git store[/yellow]") 208 + raise typer.Exit(0) 209 + 210 + # Process all entries 211 + all_links = [] 212 + link_categories = {"internal": [], "user": [], "unknown": []} 213 + link_dict = {} # Dictionary with link URL as key, maps to list of atom IDs 214 + reverse_dict = {} # Dictionary with atom ID as key, maps to list of URLs 215 + 216 + with Progress( 217 + SpinnerColumn(), 218 + TextColumn("[progress.description]{task.description}"), 219 + BarColumn(), 220 + TaskProgressColumn(), 221 + console=console, 222 + ) as progress: 223 + 224 + # Count total entries first 225 + counting_task = progress.add_task("Counting entries...", total=len(users)) 226 + total_entries = 0 227 + 228 + for username in users: 229 + entries = 
git_store.list_entries(username) 230 + total_entries += len(entries) 231 + progress.advance(counting_task) 232 + 233 + progress.remove_task(counting_task) 234 + 235 + # Process entries 236 + processing_task = progress.add_task( 237 + f"Processing {total_entries} entries...", 238 + total=total_entries 239 + ) 240 + 241 + for username in users: 242 + entries = git_store.list_entries(username) 243 + user_metadata = index.users[username] 244 + 245 + # Get base URL for this user (use first feed URL) 246 + base_url = str(user_metadata.feeds[0]) if user_metadata.feeds else "https://example.com" 247 + 248 + for entry in entries: 249 + # Extract links from this entry 250 + entry_links = link_extractor.extract_links_from_entry(entry, username, base_url) 251 + 252 + # Track unique links per entry 253 + entry_urls_seen = set() 254 + 255 + # Categorize each link 256 + for link_data in entry_links: 257 + # Skip if we've already seen this URL in this entry 258 + if link_data.url in entry_urls_seen: 259 + continue 260 + entry_urls_seen.add(link_data.url) 261 + 262 + category, target_username = categorizer.categorize_url(link_data.url, username) 263 + 264 + # Add to link dictionary (URL as key, maps to list of atom IDs) 265 + if link_data.url not in link_dict: 266 + link_dict[link_data.url] = [] 267 + if link_data.entry_id not in link_dict[link_data.url]: 268 + link_dict[link_data.url].append(link_data.entry_id) 269 + 270 + # Also add to reverse mapping (atom ID -> list of URLs) 271 + if link_data.entry_id not in reverse_dict: 272 + reverse_dict[link_data.entry_id] = [] 273 + if link_data.url not in reverse_dict[link_data.entry_id]: 274 + reverse_dict[link_data.entry_id].append(link_data.url) 275 + 276 + # Add category info to link data for categories tracking 277 + link_info = link_data.to_dict() 278 + link_info["category"] = category 279 + link_info["target_username"] = target_username 280 + 281 + all_links.append(link_info) 282 + link_categories[category].append(link_info) 283 + 
284 + progress.advance(processing_task) 285 + 286 + if verbose and entry_links: 287 + console.print(f" Found {len(entry_links)} links in {username}:{entry.title[:50]}...") 288 + 289 + # Determine output path 290 + if output_file: 291 + output_path = output_file 292 + else: 293 + output_path = config.git_store / "links.json" 294 + 295 + # Save all extracted links (not just filtered ones) 296 + if verbose: 297 + console.print("Preparing output data...") 298 + 299 + # Build a set of all URLs that correspond to posts in the git database 300 + registered_urls = set() 301 + 302 + # Get all entries from all users and build URL mappings 303 + for username in users: 304 + entries = git_store.list_entries(username) 305 + user_metadata = index.users[username] 306 + 307 + for entry in entries: 308 + # Try to match entry URLs with extracted links 309 + if hasattr(entry, 'link') and entry.link: 310 + registered_urls.add(str(entry.link)) 311 + 312 + # Also check entry alternate links if they exist 313 + if hasattr(entry, 'links') and entry.links: 314 + for link in entry.links: 315 + if hasattr(link, 'href') and link.href: 316 + registered_urls.add(str(link.href)) 317 + 318 + # Build unified structure with metadata 319 + unified_links = {} 320 + reverse_mapping = {} 321 + 322 + for url, entry_ids in link_dict.items(): 323 + unified_links[url] = { 324 + "referencing_entries": entry_ids 325 + } 326 + 327 + # Find target username if this is a tracked post 328 + if url in registered_urls: 329 + for username in users: 330 + user_domains_set = {domain for domain in user_domains.get(username, [])} 331 + if any(domain in url for domain in user_domains_set): 332 + unified_links[url]["target_username"] = username 333 + break 334 + 335 + # Build reverse mapping 336 + for entry_id in entry_ids: 337 + if entry_id not in reverse_mapping: 338 + reverse_mapping[entry_id] = [] 339 + if url not in reverse_mapping[entry_id]: 340 + reverse_mapping[entry_id].append(url) 341 + 342 + # Create unified 
output data 343 + output_data = { 344 + "links": unified_links, 345 + "reverse_mapping": reverse_mapping, 346 + "user_domains": {k: list(v) for k, v in user_domains.items()} 347 + } 348 + 349 + if verbose: 350 + console.print(f"Found {len(registered_urls)} registered post URLs") 351 + console.print(f"Found {len(link_dict)} total links, {sum(1 for link in unified_links.values() if 'target_username' in link)} tracked posts") 352 + 353 + # Save unified data 354 + with open(output_path, "w") as f: 355 + json.dump(output_data, f, indent=2, default=str) 356 + 357 + # Show summary 358 + if not get_tsv_mode(): 359 + console.print("\n[green]✓ Links extraction completed successfully[/green]") 360 + 361 + # Create summary table or TSV output 362 + if get_tsv_mode(): 363 + print("Category\tCount\tDescription") 364 + print(f"Internal\t{len(link_categories['internal'])}\tLinks to same user's domain") 365 + print(f"User\t{len(link_categories['user'])}\tLinks to other tracked users") 366 + print(f"Unknown\t{len(link_categories['unknown'])}\tLinks to external sites") 367 + print(f"Total Extracted\t{len(all_links)}\tAll extracted links") 368 + print(f"Saved to Output\t{len(output_data['links'])}\tLinks saved to output file") 369 + print(f"Cross-references\t{sum(1 for link in unified_links.values() if 'target_username' in link)}\tLinks to registered posts only") 370 + else: 371 + table = Table(title="Links Summary") 372 + table.add_column("Category", style="cyan") 373 + table.add_column("Count", style="green") 374 + table.add_column("Description", style="white") 375 + 376 + table.add_row("Internal", str(len(link_categories["internal"])), "Links to same user's domain") 377 + table.add_row("User", str(len(link_categories["user"])), "Links to other tracked users") 378 + table.add_row("Unknown", str(len(link_categories["unknown"])), "Links to external sites") 379 + table.add_row("Total Extracted", str(len(all_links)), "All extracted links") 380 + table.add_row("Saved to Output", 
str(len(output_data['links'])), "Links saved to output file") 381 + table.add_row("Cross-references", str(sum(1 for link in unified_links.values() if 'target_username' in link)), "Links to registered posts only") 382 + 383 + console.print(table) 384 + 385 + # Show user links if verbose 386 + if verbose and link_categories["user"]: 387 + if get_tsv_mode(): 388 + print("User Link Source\tUser Link Target\tLink Count") 389 + user_link_counts = {} 390 + 391 + for link in link_categories["user"]: 392 + key = f"{link['username']} -> {link['target_username']}" 393 + user_link_counts[key] = user_link_counts.get(key, 0) + 1 394 + 395 + for link_pair, count in sorted(user_link_counts.items(), key=lambda x: x[1], reverse=True)[:10]: 396 + source, target = link_pair.split(" -> ") 397 + print(f"{source}\t{target}\t{count}") 398 + else: 399 + console.print("\n[bold]User-to-user links:[/bold]") 400 + user_link_counts = {} 401 + 402 + for link in link_categories["user"]: 403 + key = f"{link['username']} -> {link['target_username']}" 404 + user_link_counts[key] = user_link_counts.get(key, 0) + 1 405 + 406 + for link_pair, count in sorted(user_link_counts.items(), key=lambda x: x[1], reverse=True)[:10]: 407 + console.print(f" {link_pair}: {count} links") 408 + 409 + if not get_tsv_mode(): 410 + console.print(f"\nUnified links data saved to: {output_path}") 411 + 412 + except Exception as e: 413 + console.print(f"[red]Error extracting links: {e}[/red]") 414 + if verbose: 415 + console.print_exception() 416 + raise typer.Exit(1)
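Review note on `LinkExtractor`: the regex in the diff only matches `href` values wrapped in double quotes, and the `url.startswith('#')` check runs after `urljoin`, by which point a fragment-only href has already become an absolute URL. A minimal sketch of an alternative built on the standard library's `html.parser` follows. `AnchorCollector` and `extract_links` are hypothetical names for illustration, not part of thicket.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect resolved href values from <a> tags (hypothetical helper)."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            # Skip missing, empty, and fragment-only hrefs *before* resolving.
            if name == "href" and value and not value.startswith("#"):
                self.urls.append(urljoin(self.base_url, value))


def extract_links(html_content: str, base_url: str) -> list[str]:
    """Return every <a href> in html_content, resolved against base_url."""
    collector = AnchorCollector(base_url)
    collector.feed(html_content)
    collector.close()
    return collector.urls
```

This version accepts single-quoted and unquoted attributes as well. Unlike the diff, it does not keep the link text, since `links` never uses it.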

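Review note on `LinkCategorizer`: the diff compares `urlparse(url).netloc.lower()`, which still includes any port, and the later `target_username` lookup uses a substring test (`domain in url`) that can match a domain appearing anywhere in the URL, including the query string. The sketch below compares hostnames instead. Stripping a leading `www.` is an assumption of this sketch, and `normalize_host` and `categorize` are hypothetical names. The sets in `user_domains` would need the same normalization.

```python
from typing import Optional
from urllib.parse import urlparse


def normalize_host(url: str) -> str:
    """Lowercased hostname without port; a leading 'www.' is dropped (assumption)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def categorize(
    url: str, source_username: str, user_domains: dict[str, set[str]]
) -> tuple[str, Optional[str]]:
    """Return (category, target_username) using exact hostname matches only."""
    host = normalize_host(url)
    if not host:
        return "unknown", None
    # Same user's domain: internal link.
    if host in user_domains.get(source_username, set()):
        return "internal", source_username
    # Another tracked user's domain.
    for username, domains in user_domains.items():
        if host in domains:
            return "user", username
    return "unknown", None
```

With exact hostname matching, a URL such as `https://notbob.example.com/?ref=bob.example` is not attributed to a user whose domain is `bob.example`.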
+11 -11

src/thicket/cli/commands/list_cmd.py

···
 11  11   from ..main import app
 12  12   from ..utils import (
 13  13       console,
 14     -     get_tsv_mode,
 15  14       load_config,
 16     -     print_entries_tsv,
 17  15       print_error,
     16 +     print_feeds_table,
 18  17       print_feeds_table_from_git,
 19  18       print_info,
     19 +     print_users_table,
 20  20       print_users_table_from_git,
     21 +     print_entries_tsv,
     22 +     get_tsv_mode,
 21  23   )
 22  24
 23  25
···
 58  60       """List all users."""
 59  61       index = git_store._load_index()
 60  62       users = list(index.users.values())
 61     -
     63 +
 62  64       if not users:
 63  65           print_info("No users configured")
 64  66           return
···
 81  83       print_feeds_table_from_git(git_store, username)
 82  84
 83  85
 84     - def list_entries(
 85     -     git_store: GitStore, username: Optional[str] = None, limit: Optional[int] = None
 86     - ) -> None:
     86 + def list_entries(git_store: GitStore, username: Optional[str] = None, limit: Optional[int] = None) -> None:
 87  87       """List entries, optionally filtered by user."""
 88  88
 89  89       if username:
···
123 123       """Clean HTML content for display in table."""
124 124       if not content:
125 125           return ""
126     -
    126 +
127 127       # Remove HTML tags
128     -     clean_text = re.sub(r"<[^>]+>", " ", content)
    128 +     clean_text = re.sub(r'<[^>]+>', ' ', content)
129 129       # Replace multiple whitespace with single space
130     -     clean_text = re.sub(r"\s+", " ", clean_text)
    130 +     clean_text = re.sub(r'\s+', ' ', clean_text)
131 131       # Strip and limit length
132 132       clean_text = clean_text.strip()
133 133       if len(clean_text) > 100:
134 134           clean_text = clean_text[:97] + "..."
135     -
    135 +
136 136       return clean_text
137 137
138 138
···
141 141       if get_tsv_mode():
142 142           print_entries_tsv(entries_by_user, usernames)
143 143           return
144     -
    144 +
145 145       table = Table(title="Feed Entries")
146 146       table.add_column("User", style="cyan", no_wrap=True)
147 147       table.add_column("Title", style="bold")
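For reference, the HTML-cleaning hunk above (strip tags, collapse whitespace, truncate to 100 characters) can be sketched as a standalone function. The hunk does not show the function's name, so `clean_html_content` and the `max_len` parameter are assumptions. The `html.unescape` step is an addition that is not in the diff.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")
_WS_RE = re.compile(r"\s+")


def clean_html_content(content, max_len=100):
    """Clean HTML content for display in a table cell (hypothetical standalone version)."""
    if not content:
        return ""
    # Replace tags with a space so adjacent words do not run together.
    text = _TAG_RE.sub(" ", content)
    # Assumption, not in the diff: decode entities such as &amp; for display.
    text = html.unescape(text)
    # Collapse runs of whitespace and trim the ends.
    text = _WS_RE.sub(" ", text).strip()
    # Truncate with an ellipsis so the result never exceeds max_len characters.
    if len(text) > max_len:
        text = text[: max_len - 3] + "..."
    return text
```

Precompiling the two patterns is a small design choice that avoids repeated pattern lookups when the function runs once per table row.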

-301

src/thicket/cli/commands/search.py

··· 1 - """Search command for thicket CLI.""" 2 - 3 - import logging 4 - from pathlib import Path 5 - from typing import Optional 6 - 7 - import typer 8 - from rich.console import Console 9 - from rich.table import Table 10 - 11 - from ...core.typesense_client import TypesenseClient, TypesenseConfig 12 - from ..main import app 13 - 14 - console = Console() 15 - logger = logging.getLogger(__name__) 16 - 17 - 18 - def _load_typesense_config() -> tuple[Optional[str], Optional[str]]: 19 - """Load Typesense URL and API key from ~/.typesense directory.""" 20 - typesense_dir = Path.home() / ".typesense" 21 - url_file = typesense_dir / "url" 22 - key_file = typesense_dir / "api_key" 23 - 24 - url = None 25 - api_key = None 26 - 27 - try: 28 - if url_file.exists(): 29 - url = url_file.read_text().strip() 30 - except Exception as e: 31 - logger.debug(f"Could not read Typesense URL from {url_file}: {e}") 32 - 33 - try: 34 - if key_file.exists(): 35 - api_key = key_file.read_text().strip() 36 - except Exception as e: 37 - logger.debug(f"Could not read Typesense API key from {key_file}: {e}") 38 - 39 - return url, api_key 40 - 41 - 42 - @app.command("search") 43 - def search_command( 44 - query: str = typer.Argument(..., help="Search query"), 45 - typesense_url: Optional[str] = typer.Option( 46 - None, 47 - "--typesense-url", 48 - "-u", 49 - help="Typesense server URL (e.g., http://localhost:8108). Defaults to ~/.typesense/url", 50 - ), 51 - api_key: Optional[str] = typer.Option( 52 - None, 53 - "--api-key", 54 - "-k", 55 - help="Typesense API key. 
Defaults to ~/.typesense/api_key", 56 - hide_input=True, 57 - ), 58 - collection_name: str = typer.Option( 59 - "thicket", 60 - "--collection", 61 - "-c", 62 - help="Typesense collection name", 63 - ), 64 - config_path: Optional[str] = typer.Option( 65 - None, 66 - "--config", 67 - "-C", 68 - help="Path to thicket configuration file", 69 - ), 70 - limit: int = typer.Option( 71 - 20, 72 - "--limit", 73 - "-l", 74 - help="Maximum number of results to display", 75 - ), 76 - user: Optional[str] = typer.Option( 77 - None, 78 - "--user", 79 - help="Filter results by specific user", 80 - ), 81 - timeout: int = typer.Option( 82 - 10, 83 - "--timeout", 84 - "-t", 85 - help="Connection timeout in seconds", 86 - ), 87 - raw: bool = typer.Option( 88 - False, 89 - "--raw", 90 - help="Display raw JSON output instead of formatted table", 91 - ), 92 - ) -> None: 93 - """Search thicket entries using Typesense full-text and semantic search. 94 - 95 - This command searches through all entries in the Typesense collection 96 - using the provided query. The search covers entry titles, content, 97 - summaries, user information, and metadata. 
98 - 99 - Examples: 100 - 101 - # Basic search 102 - thicket search "machine learning" 103 - 104 - # Search with user filter 105 - thicket search "python programming" --user avsm 106 - 107 - # Limit results 108 - thicket search "web development" --limit 10 109 - 110 - # Get raw JSON output 111 - thicket search "database" --raw 112 - """ 113 - try: 114 - # Load Typesense configuration from defaults if not provided 115 - default_url, default_api_key = _load_typesense_config() 116 - 117 - # Use provided values or defaults 118 - final_url = typesense_url or default_url 119 - final_api_key = api_key or default_api_key 120 - 121 - # Check that we have required configuration 122 - if not final_url: 123 - console.print("[red]Error: Typesense URL is required[/red]") 124 - console.print( 125 - "Either provide --typesense-url or create ~/.typesense/url file" 126 - ) 127 - raise typer.Exit(1) 128 - 129 - if not final_api_key: 130 - console.print("[red]Error: Typesense API key is required[/red]") 131 - console.print( 132 - "Either provide --api-key or create ~/.typesense/api_key file" 133 - ) 134 - raise typer.Exit(1) 135 - 136 - # Create Typesense configuration 137 - typesense_config = TypesenseConfig.from_url( 138 - final_url, final_api_key, collection_name 139 - ) 140 - typesense_config.connection_timeout = timeout 141 - 142 - console.print("[bold blue]Searching thicket entries[/bold blue]") 143 - console.print(f"Query: [cyan]{query}[/cyan]") 144 - if user: 145 - console.print(f"User filter: [yellow]{user}[/yellow]") 146 - 147 - # Initialize Typesense client 148 - typesense_client = TypesenseClient(typesense_config) 149 - 150 - # Prepare search parameters 151 - search_params = { 152 - "per_page": limit, 153 - } 154 - 155 - # Add user filter if specified 156 - if user: 157 - search_params["filter_by"] = f"username:{user}" 158 - 159 - # Perform search 160 - try: 161 - results = typesense_client.search(query, search_params) 162 - 163 - if raw: 164 - import json 165 - 166 - 
console.print(json.dumps(results, indent=2)) 167 - return 168 - 169 - # Display results 170 - _display_search_results(results, query) 171 - 172 - except Exception as e: 173 - console.print(f"[red]❌ Search failed: {e}[/red]") 174 - raise typer.Exit(1) from e 175 - 176 - except Exception as e: 177 - logger.error(f"Search failed: {e}") 178 - console.print(f"[red]Error: {e}[/red]") 179 - raise typer.Exit(1) from e 180 - 181 - 182 - def _display_search_results(results: dict, query: str) -> None: 183 - """Display search results in a formatted table.""" 184 - hits = results.get("hits", []) 185 - found = results.get("found", 0) 186 - search_time = results.get("search_time_ms", 0) 187 - 188 - if not hits: 189 - console.print("\n[yellow]No results found.[/yellow]") 190 - return 191 - 192 - console.print(f"\n[green]Found {found} results in {search_time}ms[/green]") 193 - 194 - table = Table(title=f"Search Results for '{query}'", show_lines=True) 195 - table.add_column("Score", style="green", width=8, no_wrap=True) 196 - table.add_column("User", style="cyan", width=15, no_wrap=True) 197 - table.add_column("Title", style="bold", width=45) 198 - table.add_column("Updated", style="blue", width=12, no_wrap=True) 199 - table.add_column("Summary", style="dim", width=50) 200 - 201 - for hit in hits: 202 - doc = hit["document"] 203 - 204 - # Format score 205 - score = f"{hit.get('text_match', 0):.2f}" 206 - 207 - # Format user 208 - user_display = doc.get("user_display_name", doc.get("username", "Unknown")) 209 - if len(user_display) > 12: 210 - user_display = user_display[:9] + "..." 211 - 212 - # Format title 213 - title = doc.get("title", "Untitled") 214 - if len(title) > 40: 215 - title = title[:37] + "..." 
216 - 217 - # Format date 218 - updated_timestamp = doc.get("updated", 0) 219 - if updated_timestamp: 220 - from datetime import datetime 221 - 222 - updated_date = datetime.fromtimestamp(updated_timestamp) 223 - updated_str = updated_date.strftime("%Y-%m-%d") 224 - else: 225 - updated_str = "Unknown" 226 - 227 - # Format summary 228 - summary = doc.get("summary") or doc.get("content", "") 229 - if summary: 230 - # Remove HTML tags and truncate 231 - import re 232 - 233 - summary = re.sub(r"<[^>]+>", "", summary) 234 - summary = summary.strip() 235 - if len(summary) > 60: 236 - summary = summary[:57] + "..." 237 - else: 238 - summary = "" 239 - 240 - table.add_row(score, user_display, title, updated_str, summary) 241 - 242 - console.print(table) 243 - 244 - # Show additional info 245 - console.print(f"\n[dim]Showing {len(hits)} of {found} results[/dim]") 246 - if len(hits) < found: 247 - console.print( 248 - f"[dim]Use --limit to see more results (current limit: {len(hits)})[/dim]" 249 - ) 250 - 251 - 252 - def _display_compact_results(results: dict, query: str) -> None: 253 - """Display search results in a compact format.""" 254 - hits = results.get("hits", []) 255 - found = results.get("found", 0) 256 - 257 - if not hits: 258 - console.print("\n[yellow]No results found.[/yellow]") 259 - return 260 - 261 - console.print(f"\n[green]Found {found} results[/green]\n") 262 - 263 - for i, hit in enumerate(hits, 1): 264 - doc = hit["document"] 265 - score = hit.get("text_match", 0) 266 - 267 - # Header with score and user 268 - user = doc.get("user_display_name", doc.get("username", "Unknown")) 269 - console.print( 270 - f"[green]{i:2d}.[/green] [cyan]{user}[/cyan] [dim](score: {score:.2f})[/dim]" 271 - ) 272 - 273 - # Title 274 - title = doc.get("title", "Untitled") 275 - console.print(f" [bold]{title}[/bold]") 276 - 277 - # Date and link 278 - updated_timestamp = doc.get("updated", 0) 279 - if updated_timestamp: 280 - from datetime import datetime 281 - 282 - 
updated_date = datetime.fromtimestamp(updated_timestamp) 283 - updated_str = updated_date.strftime("%Y-%m-%d %H:%M") 284 - else: 285 - updated_str = "Unknown date" 286 - 287 - link = doc.get("link", "") 288 - console.print(f" [blue]{updated_str}[/blue] - [link={link}]{link}[/link]") 289 - 290 - # Summary 291 - summary = doc.get("summary") or doc.get("content", "") 292 - if summary: 293 - import re 294 - 295 - summary = re.sub(r"<[^>]+>", "", summary) 296 - summary = summary.strip() 297 - if len(summary) > 150: 298 - summary = summary[:147] + "..." 299 - console.print(f" [dim]{summary}[/dim]") 300 - 301 - console.print() # Empty line between results

+75 -151

src/thicket/cli/commands/sync.py

··· 5 5 from typing import Optional 6 6 7 7 import typer 8 - from pydantic import HttpUrl 9 - from rich.progress import track 8 + from rich.progress import Progress, SpinnerColumn, TextColumn 10 9 11 - from ...core.feed_parser import FeedParser 12 - from ...core.git_store import GitStore 13 - from ...core.opml_generator import OPMLGenerator 14 - from ..main import app 15 - from ..utils import ( 16 - load_config, 17 - print_error, 18 - print_info, 19 - print_success, 20 - ) 10 + from ..main import app, console, load_thicket 21 11 22 12 23 13 @app.command() 24 14 def sync( 25 - all_users: bool = typer.Option( 26 - False, "--all", "-a", help="Sync all users and feeds" 27 - ), 28 15 user: Optional[str] = typer.Option( 29 - None, "--user", "-u", help="Sync specific user only" 16 + None, "--user", "-u", help="Sync specific user only (default: all users)" 30 17 ), 31 18 config_file: Optional[Path] = typer.Option( 32 - Path("thicket.yaml"), "--config", help="Configuration file path" 19 + None, "--config", help="Configuration file path" 33 20 ), 34 - dry_run: bool = typer.Option( 35 - False, "--dry-run", help="Show what would be synced without making changes" 21 + commit: bool = typer.Option( 22 + True, "--commit/--no-commit", help="Commit changes after sync" 36 23 ), 37 24 ) -> None: 38 25 """Sync feeds and store entries in Git repository.""" 39 - 40 - # Load configuration 41 - config = load_config(config_file) 42 - 43 - # Initialize Git store 44 - git_store = GitStore(config.git_store) 45 - 46 - # Determine which users to sync from git repository 47 - users_to_sync = [] 48 - if all_users: 49 - index = git_store._load_index() 50 - users_to_sync = list(index.users.values()) 51 - elif user: 52 - user_metadata = git_store.get_user(user) 53 - if not user_metadata: 54 - print_error(f"User '{user}' not found in git repository") 55 - raise typer.Exit(1) 56 - users_to_sync = [user_metadata] 57 - else: 58 - print_error("Specify --all to sync all users or --user to sync a specific 
user") 59 - raise typer.Exit(1) 60 - 61 - if not users_to_sync: 62 - print_info("No users configured to sync") 63 - return 64 - 65 - # Sync each user 66 - total_new_entries = 0 67 - total_updated_entries = 0 68 - 69 - for user_metadata in users_to_sync: 70 - print_info(f"Syncing user: {user_metadata.username}") 71 - 72 - user_new_entries = 0 73 - user_updated_entries = 0 74 - 75 - # Sync each feed for the user 76 - for feed_url in track( 77 - user_metadata.feeds, description=f"Syncing {user_metadata.username}'s feeds" 78 - ): 79 - try: 80 - new_entries, updated_entries = asyncio.run( 81 - sync_feed(git_store, user_metadata.username, feed_url, dry_run) 82 - ) 83 - user_new_entries += new_entries 84 - user_updated_entries += updated_entries 85 - 86 - except Exception as e: 87 - print_error(f"Failed to sync feed {feed_url}: {e}") 88 - continue 89 - 90 - print_info( 91 - f"User {user_metadata.username}: {user_new_entries} new, {user_updated_entries} updated" 92 - ) 93 - total_new_entries += user_new_entries 94 - total_updated_entries += user_updated_entries 95 - 96 - # Commit changes if not dry run 97 - if not dry_run and (total_new_entries > 0 or total_updated_entries > 0): 98 - commit_message = f"Sync feeds: {total_new_entries} new entries, {total_updated_entries} updated" 99 - git_store.commit_changes(commit_message) 100 - print_success(f"Committed changes: {commit_message}") 101 - 102 - # Generate OPML file with all feeds 103 - if not dry_run: 104 - try: 105 - opml_generator = OPMLGenerator() 106 - index = git_store._load_index() 107 - opml_path = config.git_store / "index.opml" 108 - 109 - opml_generator.generate_opml( 110 - users=index.users, 111 - title="Thicket Feed Collection", 112 - output_path=opml_path, 113 - ) 114 - print_info(f"Generated OPML file: {opml_path}") 115 - 116 - except Exception as e: 117 - print_error(f"Failed to generate OPML file: {e}") 118 - 119 - # Summary 120 - if dry_run: 121 - print_info( 122 - f"Dry run complete: would sync 
{total_new_entries} new entries, {total_updated_entries} updated" 123 - ) 124 - else: 125 - print_success( 126 - f"Sync complete: {total_new_entries} new entries, {total_updated_entries} updated" 127 - ) 128 - 129 - 130 - async def sync_feed( 131 - git_store: GitStore, username: str, feed_url: str, dry_run: bool 132 - ) -> tuple[int, int]: 133 - """Sync a single feed for a user.""" 134 - 135 - parser = FeedParser() 136 - 26 + 137 27 try: 138 - # Fetch and parse feed 139 - validated_feed_url = HttpUrl(feed_url) 140 - content = await parser.fetch_feed(validated_feed_url) 141 - metadata, entries = parser.parse_feed(content, validated_feed_url) 142 - 143 - new_entries = 0 144 - updated_entries = 0 145 - 146 - # Process each entry 147 - for entry in entries: 148 - try: 149 - # Check if entry already exists 150 - existing_entry = git_store.get_entry(username, entry.id) 151 - 152 - if existing_entry: 153 - # Check if entry has been updated 154 - if existing_entry.updated != entry.updated: 155 - if not dry_run: 156 - git_store.store_entry(username, entry) 157 - updated_entries += 1 158 - else: 159 - # New entry 160 - if not dry_run: 161 - git_store.store_entry(username, entry) 162 - new_entries += 1 163 - 164 - except Exception as e: 165 - print_error(f"Failed to process entry {entry.id}: {e}") 166 - continue 167 - 168 - return new_entries, updated_entries 169 - 28 + # Load Thicket instance 29 + thicket = load_thicket(config_file) 30 + 31 + # Progress callback for tracking 32 + current_task = None 33 + 34 + def progress_callback(message: str, current: int = 0, total: int = 0): 35 + nonlocal current_task 36 + current_task = message 37 + if total > 0: 38 + console.print(f"[blue]Progress:[/blue] {message} ({current}/{total})") 39 + else: 40 + console.print(f"[blue]Info:[/blue] {message}") 41 + 42 + # Run sync with progress 43 + with Progress( 44 + SpinnerColumn(), 45 + TextColumn("[progress.description]{task.description}"), 46 + console=console, 47 + transient=True, 48 + ) as 
progress: 49 + task = progress.add_task("Syncing feeds...", total=None) 50 + 51 + # Perform sync 52 + results = asyncio.run(thicket.sync_feeds(user, progress_callback)) 53 + 54 + progress.remove_task(task) 55 + 56 + # Process results 57 + total_new = 0 58 + total_processed = 0 59 + errors = [] 60 + 61 + if isinstance(results, dict): 62 + for username, user_results in results.items(): 63 + if 'error' in user_results: 64 + errors.append(f"{username}: {user_results['error']}") 65 + continue 66 + 67 + total_new += user_results.get('new_entries', 0) 68 + total_processed += user_results.get('feeds_processed', 0) 69 + 70 + console.print(f"[green]✓[/green] {username}: {user_results.get('new_entries', 0)} new entries from {user_results.get('feeds_processed', 0)} feeds") 71 + 72 + # Show any feed-specific errors 73 + for error in user_results.get('errors', []): 74 + console.print(f" [yellow]Warning:[/yellow] {error}") 75 + 76 + # Show errors 77 + for error in errors: 78 + console.print(f"[red]Error:[/red] {error}") 79 + 80 + # Commit changes if requested 81 + if commit and total_new > 0: 82 + commit_message = f"Sync feeds: {total_new} new entries from {total_processed} feeds" 83 + if thicket.commit_changes(commit_message): 84 + console.print(f"[green]✓[/green] Committed: {commit_message}") 85 + else: 86 + console.print("[red]✗[/red] Failed to commit changes") 87 + 88 + # Summary 89 + if total_new > 0: 90 + console.print(f"\n[green]Sync complete:[/green] {total_new} new entries processed") 91 + else: 92 + console.print("\n[blue]Sync complete:[/blue] No new entries found") 93 + 170 94 except Exception as e: 171 - print_error(f"Failed to sync feed {feed_url}: {e}") 172 - return 0, 0 95 + console.print(f"[red]Error:[/red] {str(e)}") 96 + raise typer.Exit(1)
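Review note on the new `sync` command: the result handling could be pulled out into a pure function so that it can be tested without a terminal. The sketch below assumes the result shape implied by how the command reads it, a dict keyed by username whose values carry `new_entries`, `feeds_processed`, and optionally `error`. The actual return type of `thicket.sync_feeds` is not shown in this diff, and `summarize_sync_results` is a hypothetical name.

```python
from typing import Any


def summarize_sync_results(
    results: dict[str, dict[str, Any]],
) -> tuple[int, int, list[str]]:
    """Return (total_new_entries, total_feeds_processed, user_level_errors)."""
    total_new = 0
    total_feeds = 0
    errors: list[str] = []
    for username, user_results in results.items():
        # A user-level error means no counts were recorded for that user.
        if "error" in user_results:
            errors.append(f"{username}: {user_results['error']}")
            continue
        total_new += user_results.get("new_entries", 0)
        total_feeds += user_results.get("feeds_processed", 0)
    return total_new, total_feeds, errors
```

The command body would then only print and commit based on the returned totals. Note that the new code commits only when `total_new > 0`, whereas the removed version also committed when entries were updated.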

-323

src/thicket/cli/commands/upload.py

··· 1 - """Upload command for thicket CLI.""" 2 - 3 - import logging 4 - from pathlib import Path 5 - from typing import Optional 6 - 7 - import typer 8 - from rich.console import Console 9 - from rich.progress import Progress, SpinnerColumn, TextColumn 10 - 11 - from ...core.git_store import GitStore 12 - from ...core.typesense_client import TypesenseClient, TypesenseConfig 13 - from ...models.config import ThicketConfig 14 - from ..main import app 15 - from ..utils import load_config 16 - 17 - console = Console() 18 - logger = logging.getLogger(__name__) 19 - 20 - 21 - def _load_typesense_config() -> tuple[Optional[str], Optional[str]]: 22 - """Load Typesense URL and API key from ~/.typesense directory.""" 23 - typesense_dir = Path.home() / ".typesense" 24 - url_file = typesense_dir / "url" 25 - key_file = typesense_dir / "api_key" 26 - 27 - url = None 28 - api_key = None 29 - 30 - try: 31 - if url_file.exists(): 32 - url = url_file.read_text().strip() 33 - except Exception as e: 34 - logger.debug(f"Could not read Typesense URL from {url_file}: {e}") 35 - 36 - try: 37 - if key_file.exists(): 38 - api_key = key_file.read_text().strip() 39 - except Exception as e: 40 - logger.debug(f"Could not read Typesense API key from {key_file}: {e}") 41 - 42 - return url, api_key 43 - 44 - 45 - def _save_typesense_config( 46 - url: Optional[str] = None, api_key: Optional[str] = None 47 - ) -> None: 48 - """Save Typesense URL and API key to ~/.typesense directory.""" 49 - typesense_dir = Path.home() / ".typesense" 50 - typesense_dir.mkdir(exist_ok=True, mode=0o700) # Secure permissions 51 - 52 - if url: 53 - url_file = typesense_dir / "url" 54 - url_file.write_text(url) 55 - url_file.chmod(0o600) 56 - 57 - if api_key: 58 - key_file = typesense_dir / "api_key" 59 - key_file.write_text(api_key) 60 - key_file.chmod(0o600) # Keep API key secure 61 - 62 - 63 - @app.command("upload") 64 - def upload_command( 65 - typesense_url: Optional[str] = typer.Option( 66 - None, 67 - 
"--typesense-url", 68 - "-u", 69 - help="Typesense server URL (e.g., http://localhost:8108). Defaults to ~/.typesense/url", 70 - ), 71 - api_key: Optional[str] = typer.Option( 72 - None, 73 - "--api-key", 74 - "-k", 75 - help="Typesense API key. Defaults to ~/.typesense/api_key", 76 - hide_input=True, 77 - ), 78 - collection_name: str = typer.Option( 79 - "thicket_entries", 80 - "--collection", 81 - "-c", 82 - help="Typesense collection name", 83 - ), 84 - config_path: Optional[str] = typer.Option( 85 - None, 86 - "--config", 87 - "-C", 88 - help="Path to thicket configuration file", 89 - ), 90 - git_store_path: Optional[str] = typer.Option( 91 - None, 92 - "--git-store", 93 - "-g", 94 - help="Path to Git store (overrides config)", 95 - ), 96 - timeout: int = typer.Option( 97 - 10, 98 - "--timeout", 99 - "-t", 100 - help="Connection timeout in seconds", 101 - ), 102 - dry_run: bool = typer.Option( 103 - False, 104 - "--dry-run", 105 - help="Show what would be uploaded without actually uploading", 106 - ), 107 - ) -> None: 108 - """Upload thicket entries to a Typesense search engine. 109 - 110 - This command uploads all entries from the Git store to a Typesense server 111 - for full-text and semantic search capabilities. The uploaded data includes 112 - entry content, metadata, user information, and searchable text fields 113 - optimized for embedding-based queries. 
114 - 115 - Configuration defaults can be stored in ~/.typesense/ directory: 116 - - URL in ~/.typesense/url 117 - - API key in ~/.typesense/api_key 118 - 119 - Examples: 120 - 121 - # Upload using saved defaults (first run will save config) 122 - thicket upload -u http://localhost:8108 -k your-api-key 123 - 124 - # Subsequent runs can omit URL and key if saved 125 - thicket upload 126 - 127 - # Upload to remote server with custom collection name 128 - thicket upload -u https://search.example.com -k api-key -c my_blog_entries 129 - 130 - # Dry run to see what would be uploaded 131 - thicket upload --dry-run 132 - """ 133 - try: 134 - # Load Typesense configuration from defaults if not provided 135 - default_url, default_api_key = _load_typesense_config() 136 - 137 - # Use provided values or defaults 138 - final_url = typesense_url or default_url 139 - final_api_key = api_key or default_api_key 140 - 141 - # Check that we have required configuration 142 - if not final_url: 143 - console.print("[red]Error: Typesense URL is required[/red]") 144 - console.print( 145 - "Either provide --typesense-url or create ~/.typesense/url file" 146 - ) 147 - raise typer.Exit(1) 148 - 149 - if not final_api_key: 150 - console.print("[red]Error: Typesense API key is required[/red]") 151 - console.print( 152 - "Either provide --api-key or create ~/.typesense/api_key file" 153 - ) 154 - raise typer.Exit(1) 155 - 156 - # Save configuration if provided via command line (for future use) 157 - if typesense_url or api_key: 158 - _save_typesense_config(typesense_url, api_key) 159 - 160 - # Load thicket configuration 161 - config_path_obj = Path(config_path) if config_path else None 162 - config = load_config(config_path_obj) 163 - 164 - # Override git store path if provided 165 - if git_store_path: 166 - config.git_store = Path(git_store_path) 167 - 168 - console.print("[bold blue]Thicket Typesense Upload[/bold blue]") 169 - console.print(f"Git store: {config.git_store}") 170 - 
console.print(f"Typesense URL: {final_url}") 171 - 172 - # Show where config is loaded from 173 - if not typesense_url and default_url: 174 - console.print("[dim] (URL loaded from ~/.typesense/url)[/dim]") 175 - if not api_key and default_api_key: 176 - console.print("[dim] (API key loaded from ~/.typesense/api_key)[/dim]") 177 - 178 - console.print(f"Collection: {collection_name}") 179 - 180 - if dry_run: 181 - console.print("[yellow]DRY RUN MODE - No data will be uploaded[/yellow]") 182 - 183 - # Initialize Git store 184 - git_store = GitStore(config.git_store) 185 - if not git_store.repo or not config.git_store.exists(): 186 - console.print("[red]Error: Git store is not valid or not initialized[/red]") 187 - console.print("Run 'thicket init' first to set up the Git store.") 188 - raise typer.Exit(1) 189 - 190 - # Create Typesense configuration 191 - typesense_config = TypesenseConfig.from_url( 192 - final_url, final_api_key, collection_name 193 - ) 194 - typesense_config.connection_timeout = timeout 195 - 196 - if dry_run: 197 - _dry_run_upload(git_store, config, typesense_config) 198 - else: 199 - _perform_upload(git_store, config, typesense_config) 200 - 201 - except Exception as e: 202 - logger.error(f"Upload failed: {e}") 203 - console.print(f"[red]Error: {e}[/red]") 204 - raise typer.Exit(1) from e 205 - 206 - 207 - def _dry_run_upload( 208 - git_store: GitStore, config: ThicketConfig, typesense_config: TypesenseConfig 209 - ) -> None: 210 - """Perform a dry run showing what would be uploaded.""" 211 - console.print("\n[bold]Dry run analysis:[/bold]") 212 - 213 - index = git_store._load_index() 214 - total_entries = 0 215 - 216 - for username, user_metadata in index.users.items(): 217 - try: 218 - user_dir = git_store.repo_path / user_metadata.directory 219 - if not user_dir.exists(): 220 - console.print(f" ⚠️ User {username}: Directory not found") 221 - continue 222 - 223 - entry_files = list(user_dir.glob("*.json")) 224 - total_entries += len(entry_files) 
225 - console.print( 226 - f" ✅ User {username}: {len(entry_files)} entries would be uploaded" 227 - ) 228 - except Exception as e: 229 - console.print(f" ❌ User {username}: Error loading entries - {e}") 230 - 231 - console.print("\n[bold]Summary:[/bold]") 232 - console.print(f" • Total users: {len(index.users)}") 233 - console.print(f" • Total entries to upload: {total_entries}") 234 - console.print(f" • Target collection: {typesense_config.collection_name}") 235 - console.print( 236 - f" • Typesense server: {typesense_config.protocol}://{typesense_config.host}:{typesense_config.port}" 237 - ) 238 - 239 - if total_entries > 0: 240 - console.print("\n[green]Ready to upload! Remove --dry-run to proceed.[/green]") 241 - else: 242 - console.print("\n[yellow]No entries found to upload.[/yellow]") 243 - 244 - 245 - def _perform_upload( 246 - git_store: GitStore, config: ThicketConfig, typesense_config: TypesenseConfig 247 - ) -> None: 248 - """Perform the actual upload to Typesense.""" 249 - with Progress( 250 - SpinnerColumn(), 251 - TextColumn("[progress.description]{task.description}"), 252 - console=console, 253 - ) as progress: 254 - # Test connection 255 - progress.add_task("Testing Typesense connection...", total=None) 256 - 257 - try: 258 - typesense_client = TypesenseClient(typesense_config) 259 - # Test connection by attempting to list collections 260 - typesense_client.client.collections.retrieve() 261 - progress.stop() 262 - console.print("[green]✅ Connected to Typesense server[/green]") 263 - except Exception as e: 264 - progress.stop() 265 - console.print(f"[red]❌ Failed to connect to Typesense: {e}[/red]") 266 - raise typer.Exit(1) from e 267 - 268 - # Perform upload 269 - with Progress( 270 - SpinnerColumn(), 271 - TextColumn("[progress.description]{task.description}"), 272 - console=console, 273 - ) as upload_progress: 274 - upload_progress.add_task("Uploading entries to Typesense...", total=None) 275 - 276 - try: 277 - result = 
typesense_client.upload_from_git_store(git_store, config) 278 - upload_progress.stop() 279 - 280 - # Parse results if available 281 - if result: 282 - if isinstance(result, list): 283 - # Batch import results 284 - success_count = sum(1 for r in result if r.get("success")) 285 - total_count = len(result) 286 - console.print( 287 - f"[green]✅ Upload completed: {success_count}/{total_count} documents uploaded successfully[/green]" 288 - ) 289 - 290 - # Show any errors 291 - errors = [r for r in result if not r.get("success")] 292 - if errors: 293 - console.print( 294 - f"[yellow]⚠️ {len(errors)} documents had errors[/yellow]" 295 - ) 296 - for i, error in enumerate( 297 - errors[:5] 298 - ): # Show first 5 errors 299 - console.print(f" Error {i + 1}: {error}") 300 - if len(errors) > 5: 301 - console.print( 302 - f" ... and {len(errors) - 5} more errors" 303 - ) 304 - else: 305 - console.print("[green]✅ Upload completed successfully[/green]") 306 - else: 307 - console.print( 308 - "[yellow]⚠️ Upload completed but no result data available[/yellow]" 309 - ) 310 - 311 - console.print("\n[bold]Collection information:[/bold]") 312 - console.print( 313 - f" • Server: {typesense_config.protocol}://{typesense_config.host}:{typesense_config.port}" 314 - ) 315 - console.print(f" • Collection: {typesense_config.collection_name}") 316 - console.print( 317 - "\n[dim]You can now search your entries using the Typesense API or dashboard.[/dim]" 318 - ) 319 - 320 - except Exception as e: 321 - upload_progress.stop() 322 - console.print(f"[red]❌ Upload failed: {e}[/red]") 323 - raise typer.Exit(1) from e
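Review note on the removed `upload.py`: `_save_typesense_config` writes each file with `write_text` and only then calls `chmod(0o600)`, so the API key briefly exists with default permissions. The sketch below shows one way to create the file with restrictive permissions from the start. It is a minimal illustration, not the project's code: `save_secret`, `load_secret`, and `resolve_setting` are hypothetical names, and POSIX permission semantics are assumed.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_secret(path: Path, value: str) -> None:
    """Write value to path so the file is created with mode 0o600."""
    # Hypothetical helper: the parent directory is made private as well.
    path.parent.mkdir(parents=True, exist_ok=True, mode=0o700)
    # os.open applies the mode at creation time (subject to the umask),
    # so a new file never exists with broader permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    # A file that already existed keeps its old mode, so tighten it too.
    path.chmod(0o600)


def load_secret(path: Path) -> Optional[str]:
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def resolve_setting(cli_value: Optional[str], path: Path) -> Optional[str]:
    """Prefer the command-line value, then fall back to the saved file."""
    return cli_value or load_secret(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        key_file = Path(tmp) / ".typesense" / "api_key"
        save_secret(key_file, "example-key\n")
        print(resolve_setting(None, key_file))
```

The precedence in `resolve_setting` mirrors the `typesense_url or default_url` logic in the removed command. Catching `OSError` rather than `Exception` is a design choice of this sketch.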

-268

src/thicket/cli/commands/zulip.py

··· 1 - """Zulip association management commands for thicket.""" 2 - 3 - from pathlib import Path 4 - from typing import Optional 5 - 6 - import typer 7 - from rich.console import Console 8 - from rich.table import Table 9 - 10 - from ...core.git_store import GitStore 11 - from ..main import app 12 - from ..utils import load_config, print_error, print_info, print_success 13 - 14 - console = Console() 15 - 16 - 17 - @app.command() 18 - def zulip_add( 19 - username: str = typer.Argument(..., help="Username to associate with Zulip"), 20 - server: str = typer.Argument( 21 - ..., help="Zulip server (e.g., yourorg.zulipchat.com)" 22 - ), 23 - user_id: str = typer.Argument(..., help="Zulip user ID or email for @mentions"), 24 - config_file: Path = typer.Option( 25 - Path("thicket.yaml"), 26 - "--config", 27 - "-c", 28 - help="Path to thicket configuration file", 29 - ), 30 - ) -> None: 31 - """Add a Zulip association for a user. 32 - 33 - This associates a thicket user with their Zulip identity, enabling 34 - @mentions when the bot posts their articles. 
35 - 36 - Example: 37 - thicket zulip-add alice myorg.zulipchat.com alice@example.com 38 - """ 39 - try: 40 - config = load_config(config_file) 41 - git_store = GitStore(config.git_store) 42 - 43 - # Check if user exists 44 - user = git_store.get_user(username) 45 - if not user: 46 - print_error(f"User '{username}' not found") 47 - raise typer.Exit(1) 48 - 49 - # Add association 50 - if git_store.add_zulip_association(username, server, user_id): 51 - print_success(f"Added Zulip association for {username}: {user_id}@{server}") 52 - git_store.commit_changes(f"Add Zulip association for {username}") 53 - else: 54 - print_info(f"Association already exists for {username}: {user_id}@{server}") 55 - 56 - except Exception as e: 57 - print_error(f"Failed to add Zulip association: {e}") 58 - raise typer.Exit(1) from e 59 - 60 - 61 - @app.command() 62 - def zulip_remove( 63 - username: str = typer.Argument(..., help="Username to remove association from"), 64 - server: str = typer.Argument(..., help="Zulip server"), 65 - user_id: str = typer.Argument(..., help="Zulip user ID or email"), 66 - config_file: Path = typer.Option( 67 - Path("thicket.yaml"), 68 - "--config", 69 - "-c", 70 - help="Path to thicket configuration file", 71 - ), 72 - ) -> None: 73 - """Remove a Zulip association from a user. 
74 - 75 - Example: 76 - thicket zulip-remove alice myorg.zulipchat.com alice@example.com 77 - """ 78 - try: 79 - config = load_config(config_file) 80 - git_store = GitStore(config.git_store) 81 - 82 - # Check if user exists 83 - user = git_store.get_user(username) 84 - if not user: 85 - print_error(f"User '{username}' not found") 86 - raise typer.Exit(1) 87 - 88 - # Remove association 89 - if git_store.remove_zulip_association(username, server, user_id): 90 - print_success( 91 - f"Removed Zulip association for {username}: {user_id}@{server}" 92 - ) 93 - git_store.commit_changes(f"Remove Zulip association for {username}") 94 - else: 95 - print_error(f"Association not found for {username}: {user_id}@{server}") 96 - raise typer.Exit(1) 97 - 98 - except Exception as e: 99 - print_error(f"Failed to remove Zulip association: {e}") 100 - raise typer.Exit(1) from e 101 - 102 - 103 - @app.command() 104 - def zulip_list( 105 - username: Optional[str] = typer.Argument( 106 - None, help="Username to list associations for" 107 - ), 108 - config_file: Path = typer.Option( 109 - Path("thicket.yaml"), 110 - "--config", 111 - "-c", 112 - help="Path to thicket configuration file", 113 - ), 114 - ) -> None: 115 - """List Zulip associations for users. 116 - 117 - If no username is provided, lists associations for all users. 
118 - 119 - Examples: 120 - thicket zulip-list # List all associations 121 - thicket zulip-list alice # List associations for alice 122 - """ 123 - try: 124 - config = load_config(config_file) 125 - git_store = GitStore(config.git_store) 126 - 127 - # Create table 128 - table = Table(title="Zulip Associations") 129 - table.add_column("Username", style="cyan") 130 - table.add_column("Server", style="green") 131 - table.add_column("User ID", style="yellow") 132 - 133 - if username: 134 - # List for specific user 135 - user = git_store.get_user(username) 136 - if not user: 137 - print_error(f"User '{username}' not found") 138 - raise typer.Exit(1) 139 - 140 - if not user.zulip_associations: 141 - print_info(f"No Zulip associations for {username}") 142 - return 143 - 144 - for assoc in user.zulip_associations: 145 - table.add_row(username, assoc.server, assoc.user_id) 146 - else: 147 - # List for all users 148 - index = git_store._load_index() 149 - has_associations = False 150 - 151 - for username, user in index.users.items(): 152 - for assoc in user.zulip_associations: 153 - table.add_row(username, assoc.server, assoc.user_id) 154 - has_associations = True 155 - 156 - if not has_associations: 157 - print_info("No Zulip associations found") 158 - return 159 - 160 - console.print(table) 161 - 162 - except Exception as e: 163 - print_error(f"Failed to list Zulip associations: {e}") 164 - raise typer.Exit(1) from e 165 - 166 - 167 - @app.command() 168 - def zulip_import( 169 - csv_file: Path = typer.Argument(..., help="CSV file with username,server,user_id"), 170 - config_file: Path = typer.Option( 171 - Path("thicket.yaml"), 172 - "--config", 173 - "-c", 174 - help="Path to thicket configuration file", 175 - ), 176 - dry_run: bool = typer.Option( 177 - False, 178 - "--dry-run", 179 - help="Show what would be imported without making changes", 180 - ), 181 - ) -> None: 182 - """Import Zulip associations from a CSV file. 
183 - 184 - CSV format (no header): 185 - username,server,user_id 186 - alice,myorg.zulipchat.com,alice@example.com 187 - bob,myorg.zulipchat.com,bob.smith 188 - 189 - Example: 190 - thicket zulip-import associations.csv 191 - """ 192 - import csv 193 - 194 - try: 195 - config = load_config(config_file) 196 - git_store = GitStore(config.git_store) 197 - 198 - if not csv_file.exists(): 199 - print_error(f"CSV file not found: {csv_file}") 200 - raise typer.Exit(1) 201 - 202 - added = 0 203 - skipped = 0 204 - errors = 0 205 - 206 - with open(csv_file) as f: 207 - reader = csv.reader(f) 208 - for row_num, row in enumerate(reader, 1): 209 - if len(row) != 3: 210 - print_error(f"Line {row_num}: Invalid format (expected 3 columns)") 211 - errors += 1 212 - continue 213 - 214 - username, server, user_id = [col.strip() for col in row] 215 - 216 - # Skip empty lines 217 - if not username: 218 - continue 219 - 220 - # Check if user exists 221 - user = git_store.get_user(username) 222 - if not user: 223 - print_error(f"Line {row_num}: User '{username}' not found") 224 - errors += 1 225 - continue 226 - 227 - if dry_run: 228 - # Check if association would be added 229 - exists = any( 230 - a.server == server and a.user_id == user_id 231 - for a in user.zulip_associations 232 - ) 233 - if exists: 234 - print_info( 235 - f"Would skip existing: {username} -> {user_id}@{server}" 236 - ) 237 - skipped += 1 238 - else: 239 - print_info(f"Would add: {username} -> {user_id}@{server}") 240 - added += 1 241 - else: 242 - # Actually add association 243 - if git_store.add_zulip_association(username, server, user_id): 244 - print_success(f"Added: {username} -> {user_id}@{server}") 245 - added += 1 246 - else: 247 - print_info( 248 - f"Skipped existing: {username} -> {user_id}@{server}" 249 - ) 250 - skipped += 1 251 - 252 - # Summary 253 - console.print() 254 - if dry_run: 255 - console.print("[bold]Dry run summary:[/bold]") 256 - console.print(f" Would add: {added}") 257 - else: 258 - 
console.print("[bold]Import summary:[/bold]") 259 - console.print(f" Added: {added}") 260 - if not dry_run and added > 0: 261 - git_store.commit_changes(f"Import {added} Zulip associations from CSV") 262 - 263 - console.print(f" Skipped: {skipped}") 264 - console.print(f" Errors: {errors}") 265 - 266 - except Exception as e: 267 - print_error(f"Failed to import Zulip associations: {e}") 268 - raise typer.Exit(1) from e
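Review note on the removed `zulip_import`: the column-count check runs before the "Skip empty lines" check, and `csv.reader` yields an empty list for a blank line, so blank lines appear to be counted as format errors rather than skipped. The sketch below reorders the two checks. It is an illustration under assumptions: `Association` and `parse_associations` are hypothetical names, and treating any empty field as invalid is a choice made here, not behaviour taken from the diff.

```python
import csv
import io
from typing import NamedTuple


class Association(NamedTuple):
    """One CSV row: thicket username, Zulip server, Zulip user ID."""

    username: str
    server: str
    user_id: str


def parse_associations(text: str) -> tuple[list[Association], list[str]]:
    """Parse headerless username,server,user_id rows.

    Returns the valid rows plus one error message per malformed line.
    """
    rows: list[Association] = []
    errors: list[str] = []
    for row_num, row in enumerate(csv.reader(io.StringIO(text)), 1):
        cols = [col.strip() for col in row]
        # Skip blank lines first, before any column-count validation.
        if not any(cols):
            continue
        # Assumption of this sketch: every field must be non-empty.
        if len(cols) != 3 or not all(cols):
            errors.append(f"Line {row_num}: Invalid format (expected 3 columns)")
            continue
        rows.append(Association(*cols))
    return rows, errors
```

Separating parsing from the Git store calls also lets the dry-run and real import paths share one validation step.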

+36 -10

src/thicket/cli/main.py

··· 1 1 """Main CLI application using Typer.""" 2 2 3 + from pathlib import Path 4 + from typing import Optional 5 + 3 6 import typer 4 7 from rich.console import Console 5 8 6 - from .. import __version__ 9 + from .. import __version__, Thicket, ThicketConfig 7 10 8 11 app = typer.Typer( 9 12 name="thicket", ··· 25 28 raise typer.Exit() 26 29 27 30 31 + def load_thicket(config_path: Optional[Path] = None) -> Thicket: 32 + """Load Thicket instance from configuration.""" 33 + if config_path and config_path.exists(): 34 + return Thicket.from_config_file(config_path) 35 + 36 + # Try default locations 37 + default_paths = [ 38 + Path("thicket.yaml"), 39 + Path("thicket.yml"), 40 + Path("thicket.json"), 41 + Path.home() / ".config" / "thicket" / "config.yaml", 42 + Path.home() / ".thicket.yaml", 43 + ] 44 + 45 + for path in default_paths: 46 + if path.exists(): 47 + return Thicket.from_config_file(path) 48 + 49 + # No config found 50 + console.print("[red]Error:[/red] No configuration file found.") 51 + console.print("Use [bold]thicket init[/bold] to create a new configuration or specify --config") 52 + raise typer.Exit(1) 53 + 54 + 55 + def get_config_path() -> Path: 56 + """Get the default configuration path for new configs.""" 57 + config_dir = Path.home() / ".config" / "thicket" 58 + config_dir.mkdir(parents=True, exist_ok=True) 59 + return config_dir / "config.yaml" 60 + 61 + 28 62 @app.callback() 29 63 def main( 30 64 version: bool = typer.Option( ··· 47 81 48 82 49 83 # Import commands to register them 50 - from .commands import ( # noqa: F401, E402 51 - add, 52 - duplicates, 53 - info_cmd, 54 - init, 55 - list_cmd, 56 - sync, 57 - upload, 58 - ) 84 + from .commands import add, duplicates, generate, index_cmd, info_cmd, init, links_cmd, list_cmd, sync 59 85 60 86 if __name__ == "__main__": 61 87 app()
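Review note on the added `load_thicket`: when `--config` names a file that does not exist, the `config_path and config_path.exists()` test is false and the function silently falls through to the default locations. The sketch below separates path resolution from loading and raises in that case instead. `resolve_config_path` is a hypothetical helper, and failing loudly is a suggestion, not the behaviour in the diff.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def resolve_config_path(
    explicit: Optional[Path], candidates: Iterable[Path]
) -> Optional[Path]:
    """Pick the configuration file to load.

    An explicit path must exist; otherwise the first existing candidate wins.
    Returns None when nothing is found, leaving the error message to the caller.
    """
    if explicit is not None:
        if not explicit.exists():
            # Differs from the diff: do not fall back to the defaults here.
            raise FileNotFoundError(f"Configuration file not found: {explicit}")
        return explicit
    for path in candidates:
        if path.exists():
            return path
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        found = Path(tmp) / "thicket.yaml"
        found.write_text("git_store: ./store\n")
        print(resolve_config_path(None, [Path(tmp) / "thicket.yml", found]))
```

The caller would pass the same list of defaults as in the diff (`thicket.yaml`, `thicket.yml`, `thicket.json`, and the two home-directory paths) and print the "No configuration file found" message when the result is `None`.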

+20 -32

src/thicket/cli/utils.py

··· 8 8 from rich.progress import Progress, SpinnerColumn, TextColumn 9 9 from rich.table import Table 10 10 11 - from ..core.git_store import GitStore 12 11 from ..models import ThicketConfig, UserMetadata 12 + from ..core.git_store import GitStore 13 13 14 14 console = Console() 15 15 ··· 17 17 def get_tsv_mode() -> bool: 18 18 """Get the global TSV mode setting.""" 19 19 from .main import tsv_mode 20 - 21 20 return tsv_mode 22 21 23 22 ··· 38 37 default_config = Path("thicket.yaml") 39 38 if default_config.exists(): 40 39 import yaml 41 - 42 40 with open(default_config) as f: 43 41 config_data = yaml.safe_load(f) 44 42 return ThicketConfig(**config_data) 45 - 43 + 46 44 # Fall back to environment variables 47 45 return ThicketConfig() 48 46 except Exception as e: 49 47 console.print(f"[red]Error loading configuration: {e}[/red]") 50 - console.print( 51 - "[yellow]Run 'thicket init' to create a new configuration.[/yellow]" 52 - ) 48 + console.print("[yellow]Run 'thicket init' to create a new configuration.[/yellow]") 53 49 raise typer.Exit(1) from e 54 50 55 51 ··· 82 78 if get_tsv_mode(): 83 79 print_users_tsv(config) 84 80 return 85 - 81 + 86 82 table = Table(title="Users and Feeds") 87 83 table.add_column("Username", style="cyan", no_wrap=True) 88 84 table.add_column("Display Name", style="magenta") ··· 108 104 if get_tsv_mode(): 109 105 print_feeds_tsv(config, username) 110 106 return 111 - 107 + 112 108 table = Table(title=f"Feeds{f' for {username}' if username else ''}") 113 109 table.add_column("Username", style="cyan", no_wrap=True) 114 110 table.add_column("Feed URL", style="blue") ··· 158 154 if get_tsv_mode(): 159 155 print_users_tsv_from_git(users) 160 156 return 161 - 157 + 162 158 table = Table(title="Users and Feeds") 163 159 table.add_column("Username", style="cyan", no_wrap=True) 164 160 table.add_column("Display Name", style="magenta") ··· 179 175 console.print(table) 180 176 181 177 182 - def print_feeds_table_from_git( 183 - git_store: 
GitStore, username: Optional[str] = None 184 - ) -> None: 178 + def print_feeds_table_from_git(git_store: GitStore, username: Optional[str] = None) -> None: 185 179 """Print a table of feeds from git repository.""" 186 180 if get_tsv_mode(): 187 181 print_feeds_tsv_from_git(git_store, username) 188 182 return 189 - 183 + 190 184 table = Table(title=f"Feeds{f' for {username}' if username else ''}") 191 185 table.add_column("Username", style="cyan", no_wrap=True) 192 186 table.add_column("Feed URL", style="blue") ··· 215 209 print("Username\tDisplay Name\tEmail\tHomepage\tFeeds") 216 210 for user in config.users: 217 211 feeds_str = ",".join(str(feed) for feed in user.feeds) 218 - print( 219 - f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}" 220 - ) 212 + print(f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}") 221 213 222 214 223 215 def print_users_tsv_from_git(users: list[UserMetadata]) -> None: ··· 225 217 print("Username\tDisplay Name\tEmail\tHomepage\tFeeds") 226 218 for user in users: 227 219 feeds_str = ",".join(user.feeds) 228 - print( 229 - f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}" 230 - ) 220 + print(f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}") 231 221 232 222 233 223 def print_feeds_tsv(config: ThicketConfig, username: Optional[str] = None) -> None: ··· 235 225 print("Username\tFeed URL\tStatus") 236 226 users = [config.find_user(username)] if username else config.users 237 227 users = [u for u in users if u is not None] 238 - 228 + 239 229 for user in users: 240 230 for feed in user.feeds: 241 231 print(f"{user.username}\t{feed}\tActive") 242 232 243 233 244 - def print_feeds_tsv_from_git( 245 - git_store: GitStore, username: Optional[str] = None 246 - ) -> None: 234 + def print_feeds_tsv_from_git(git_store: GitStore, username: 
Optional[str] = None) -> None: 247 235 """Print feeds from git repository in TSV format.""" 248 236 print("Username\tFeed URL\tStatus") 249 - 237 + 250 238 if username: 251 239 user = git_store.get_user(username) 252 240 users = [user] if user else [] 253 241 else: 254 242 index = git_store._load_index() 255 243 users = list(index.users.values()) 256 - 244 + 257 245 for user in users: 258 246 for feed in user.feeds: 259 247 print(f"{user.username}\t{feed}\tActive") ··· 262 250 def print_entries_tsv(entries_by_user: list[list], usernames: list[str]) -> None: 263 251 """Print entries in TSV format.""" 264 252 print("User\tAtom ID\tTitle\tUpdated\tURL") 265 - 253 + 266 254 # Combine all entries with usernames 267 255 all_entries = [] 268 256 for entries, username in zip(entries_by_user, usernames): 269 257 for entry in entries: 270 258 all_entries.append((username, entry)) 271 - 259 + 272 260 # Sort by updated time (newest first) 273 261 all_entries.sort(key=lambda x: x[1].updated, reverse=True) 274 - 262 + 275 263 for username, entry in all_entries: 276 264 # Format updated time 277 265 updated_str = entry.updated.strftime("%Y-%m-%d %H:%M") 278 - 266 + 279 267 # Escape tabs and newlines in title to preserve TSV format 280 - title = entry.title.replace("\t", " ").replace("\n", " ").replace("\r", " ") 281 - 268 + title = entry.title.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ') 269 + 282 270 print(f"{username}\t{entry.id}\t{title}\t{updated_str}\t{entry.link}")
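Review note on the TSV helpers: only `print_entries_tsv` replaces tabs and newlines, and only in the title, while the user and feed printers interpolate fields directly. If a display name or URL ever contained a tab, the row would gain a column. The sketch below centralises the escaping. `tsv_field` and `tsv_row` are hypothetical helpers, not functions from the repository.

```python
def tsv_field(value: object) -> str:
    """Render one value as a TSV-safe field.

    None becomes an empty field; tabs, carriage returns, and newlines become
    spaces, matching the replacement used for entry titles in the diff.
    """
    text = "" if value is None else str(value)
    return text.replace("\t", " ").replace("\r", " ").replace("\n", " ")


def tsv_row(*fields: object) -> str:
    """Join already-escaped fields with tabs to form one output line."""
    return "\t".join(tsv_field(field) for field in fields)
```

With this in place, a call such as `print(tsv_row(user.username, user.display_name, user.email, user.homepage, feeds_str))` would replace the long f-strings and the repeated `or ''` fallbacks.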

+55 -84

src/thicket/core/feed_parser.py

··· 19 19 """Initialize the feed parser.""" 20 20 self.user_agent = user_agent 21 21 self.allowed_tags = [ 22 - "a", 23 - "abbr", 24 - "acronym", 25 - "b", 26 - "blockquote", 27 - "br", 28 - "code", 29 - "em", 30 - "i", 31 - "li", 32 - "ol", 33 - "p", 34 - "pre", 35 - "strong", 36 - "ul", 37 - "h1", 38 - "h2", 39 - "h3", 40 - "h4", 41 - "h5", 42 - "h6", 43 - "img", 44 - "div", 45 - "span", 22 + "a", "abbr", "acronym", "b", "blockquote", "br", "code", "em", 23 + "i", "li", "ol", "p", "pre", "strong", "ul", "h1", "h2", "h3", 24 + "h4", "h5", "h6", "img", "div", "span", 46 25 ] 47 26 self.allowed_attributes = { 48 27 "a": ["href", "title"], ··· 64 43 response.raise_for_status() 65 44 return response.text 66 45 67 - def parse_feed( 68 - self, content: str, source_url: Optional[HttpUrl] = None 69 - ) -> tuple[FeedMetadata, list[AtomEntry]]: 46 + def parse_feed(self, content: str, source_url: Optional[HttpUrl] = None) -> tuple[FeedMetadata, list[AtomEntry]]: 70 47 """Parse feed content and return metadata and entries.""" 71 48 parsed = feedparser.parse(content) 72 49 ··· 97 74 author_email = None 98 75 author_uri = None 99 76 100 - if hasattr(feed, "author_detail"): 101 - author_name = feed.author_detail.get("name") 102 - author_email = feed.author_detail.get("email") 103 - author_uri = feed.author_detail.get("href") 104 - elif hasattr(feed, "author"): 77 + if hasattr(feed, 'author_detail'): 78 + author_name = feed.author_detail.get('name') 79 + author_email = feed.author_detail.get('email') 80 + author_uri = feed.author_detail.get('href') 81 + elif hasattr(feed, 'author'): 105 82 author_name = feed.author 106 83 107 84 # Parse managing editor for RSS feeds 108 - if not author_email and hasattr(feed, "managingEditor"): 85 + if not author_email and hasattr(feed, 'managingEditor'): 109 86 author_email = feed.managingEditor 110 87 111 88 # Parse feed link 112 89 feed_link = None 113 - if hasattr(feed, "link"): 90 + if hasattr(feed, 'link'): 114 91 try: 115 92 feed_link = 
HttpUrl(feed.link) 116 93 except ValidationError: ··· 121 98 icon = None 122 99 image_url = None 123 100 124 - if hasattr(feed, "image"): 101 + if hasattr(feed, 'image'): 125 102 try: 126 - image_url = HttpUrl(feed.image.get("href", feed.image.get("url", ""))) 103 + image_url = HttpUrl(feed.image.get('href', feed.image.get('url', ''))) 127 104 except (ValidationError, AttributeError): 128 105 pass 129 106 130 - if hasattr(feed, "icon"): 107 + if hasattr(feed, 'icon'): 131 108 try: 132 109 icon = HttpUrl(feed.icon) 133 110 except ValidationError: 134 111 pass 135 112 136 - if hasattr(feed, "logo"): 113 + if hasattr(feed, 'logo'): 137 114 try: 138 115 logo = HttpUrl(feed.logo) 139 116 except ValidationError: 140 117 pass 141 118 142 119 return FeedMetadata( 143 - title=getattr(feed, "title", None), 120 + title=getattr(feed, 'title', None), 144 121 author_name=author_name, 145 122 author_email=author_email, 146 123 author_uri=HttpUrl(author_uri) if author_uri else None, ··· 148 125 logo=logo, 149 126 icon=icon, 150 127 image_url=image_url, 151 - description=getattr(feed, "description", None), 128 + description=getattr(feed, 'description', None), 152 129 ) 153 130 154 - def _normalize_entry( 155 - self, entry: feedparser.FeedParserDict, source_url: Optional[HttpUrl] = None 156 - ) -> AtomEntry: 131 + def _normalize_entry(self, entry: feedparser.FeedParserDict, source_url: Optional[HttpUrl] = None) -> AtomEntry: 157 132 """Normalize an entry to Atom format.""" 158 133 # Parse timestamps 159 - updated = self._parse_timestamp( 160 - entry.get("updated_parsed") or entry.get("published_parsed") 161 - ) 162 - published = self._parse_timestamp(entry.get("published_parsed")) 134 + updated = self._parse_timestamp(entry.get('updated_parsed') or entry.get('published_parsed')) 135 + published = self._parse_timestamp(entry.get('published_parsed')) 163 136 164 137 # Parse content 165 138 content = self._extract_content(entry) ··· 170 143 171 144 # Parse categories/tags 172 145 
categories = [] 173 - if hasattr(entry, "tags"): 174 - categories = [tag.get("term", "") for tag in entry.tags if tag.get("term")] 146 + if hasattr(entry, 'tags'): 147 + categories = [tag.get('term', '') for tag in entry.tags if tag.get('term')] 175 148 176 149 # Sanitize HTML content 177 150 if content: 178 151 content = self._sanitize_html(content) 179 152 180 - summary = entry.get("summary", "") 153 + summary = entry.get('summary', '') 181 154 if summary: 182 155 summary = self._sanitize_html(summary) 183 156 184 157 return AtomEntry( 185 - id=entry.get("id", entry.get("link", "")), 186 - title=entry.get("title", ""), 187 - link=HttpUrl(entry.get("link", "")), 158 + id=entry.get('id', entry.get('link', '')), 159 + title=entry.get('title', ''), 160 + link=HttpUrl(entry.get('link', '')), 188 161 updated=updated, 189 162 published=published, 190 163 summary=summary or None, ··· 192 165 content_type=content_type, 193 166 author=author, 194 167 categories=categories, 195 - rights=entry.get("rights", None), 168 + rights=entry.get('rights', None), 196 169 source=str(source_url) if source_url else None, 197 170 ) 198 171 ··· 205 178 def _extract_content(self, entry: feedparser.FeedParserDict) -> Optional[str]: 206 179 """Extract the best content from an entry.""" 207 180 # Prefer content over summary 208 - if hasattr(entry, "content") and entry.content: 181 + if hasattr(entry, 'content') and entry.content: 209 182 # Find the best content (prefer text/html, then text/plain) 210 183 for content_item in entry.content: 211 - if content_item.get("type") in ["text/html", "html"]: 212 - return content_item.get("value", "") 213 - elif content_item.get("type") in ["text/plain", "text"]: 214 - return content_item.get("value", "") 184 + if content_item.get('type') in ['text/html', 'html']: 185 + return content_item.get('value', '') 186 + elif content_item.get('type') in ['text/plain', 'text']: 187 + return content_item.get('value', '') 215 188 # Fallback to first content item 216 
- return entry.content[0].get("value", "") 189 + return entry.content[0].get('value', '') 217 190 218 191 # Fallback to summary 219 - return entry.get("summary", "") 192 + return entry.get('summary', '') 220 193 221 194 def _extract_content_type(self, entry: feedparser.FeedParserDict) -> str: 222 195 """Extract content type from entry.""" 223 - if hasattr(entry, "content") and entry.content: 224 - content_type = entry.content[0].get("type", "html") 196 + if hasattr(entry, 'content') and entry.content: 197 + content_type = entry.content[0].get('type', 'html') 225 198 # Normalize content type 226 - if content_type in ["text/html", "html"]: 227 - return "html" 228 - elif content_type in ["text/plain", "text"]: 229 - return "text" 230 - elif content_type == "xhtml": 231 - return "xhtml" 232 - return "html" 199 + if content_type in ['text/html', 'html']: 200 + return 'html' 201 + elif content_type in ['text/plain', 'text']: 202 + return 'text' 203 + elif content_type == 'xhtml': 204 + return 'xhtml' 205 + return 'html' 233 206 234 207 def _extract_author(self, entry: feedparser.FeedParserDict) -> Optional[dict]: 235 208 """Extract author information from entry.""" 236 209 author = {} 237 210 238 - if hasattr(entry, "author_detail"): 239 - author.update( 240 - { 241 - "name": entry.author_detail.get("name"), 242 - "email": entry.author_detail.get("email"), 243 - "uri": entry.author_detail.get("href"), 244 - } 245 - ) 246 - elif hasattr(entry, "author"): 247 - author["name"] = entry.author 211 + if hasattr(entry, 'author_detail'): 212 + author.update({ 213 + 'name': entry.author_detail.get('name'), 214 + 'email': entry.author_detail.get('email'), 215 + 'uri': entry.author_detail.get('href'), 216 + }) 217 + elif hasattr(entry, 'author'): 218 + author['name'] = entry.author 248 219 249 220 return author if author else None 250 221 ··· 265 236 # Start with the path component 266 237 if parsed.path: 267 238 # Remove leading slash and replace problematic characters 268 - 
safe_id = parsed.path.lstrip("/").replace("/", "_").replace("\\", "_") 239 + safe_id = parsed.path.lstrip('/').replace('/', '_').replace('\\', '_') 269 240 else: 270 241 # Use the entire ID as fallback 271 242 safe_id = entry_id ··· 273 244 # Replace problematic characters 274 245 safe_chars = [] 275 246 for char in safe_id: 276 - if char.isalnum() or char in "-_.": 247 + if char.isalnum() or char in '-_.': 277 248 safe_chars.append(char) 278 249 else: 279 - safe_chars.append("_") 250 + safe_chars.append('_') 280 251 281 - safe_id = "".join(safe_chars) 252 + safe_id = ''.join(safe_chars) 282 253 283 254 # Ensure it's not too long (max 200 chars) 284 255 if len(safe_id) > 200:
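Review note on `sanitize_entry_id`: the hunk shows only part of the function, so the sketch below is a reconstruction under stated assumptions, not the repository's code. It assumes `parsed` comes from `urllib.parse.urlparse(entry_id)` and that over-long IDs are simply cut to 200 characters (the truncation body is outside the hunk). It also falls back to the whole ID when the path is just `/`, a case where the visible logic would appear to produce an empty name.

```python
from urllib.parse import urlparse

MAX_ID_LENGTH = 200  # taken from the "max 200 chars" comment in the diff


def sanitize_entry_id(entry_id: str) -> str:
    """Turn a feed entry ID (often a URL or tag: URI) into a filename-safe string."""
    parsed = urlparse(entry_id)  # assumption: how `parsed` is produced
    # Prefer the path component without its leading slash.
    candidate = parsed.path.lstrip("/")
    # Deviation in this sketch: use the full ID if the path is empty or "/".
    raw = candidate or entry_id
    # Keep alphanumerics and "-", "_", "."; replace everything else with "_".
    safe_id = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in raw)
    # Assumption: plain truncation; the real handling is not visible in the hunk.
    return safe_id[:MAX_ID_LENGTH]
```

Because only the path is kept, two feeds on different hosts with identical paths map to the same name. In the store layout each user has their own directory, so this matters only when one user aggregates several feeds.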

+18 -107

src/thicket/core/git_store.py

··· 53 53 """Save the index to index.json.""" 54 54 index_path = self.repo_path / "index.json" 55 55 with open(index_path, "w") as f: 56 - json.dump( 57 - index.model_dump(mode="json", exclude_none=True), 58 - f, 59 - indent=2, 60 - default=str, 61 - ) 56 + json.dump(index.model_dump(mode="json", exclude_none=True), f, indent=2, default=str) 62 57 63 58 def _load_index(self) -> GitStoreIndex: 64 59 """Load the index from index.json.""" ··· 91 86 92 87 return DuplicateMap(**data) 93 88 94 - def add_user( 95 - self, 96 - username: str, 97 - display_name: Optional[str] = None, 98 - email: Optional[str] = None, 99 - homepage: Optional[str] = None, 100 - icon: Optional[str] = None, 101 - feeds: Optional[list[str]] = None, 102 - ) -> UserMetadata: 89 + def add_user(self, username: str, display_name: Optional[str] = None, 90 + email: Optional[str] = None, homepage: Optional[str] = None, 91 + icon: Optional[str] = None, feeds: Optional[list[str]] = None) -> UserMetadata: 103 92 """Add a new user to the Git store.""" 104 93 index = self._load_index() 105 94 ··· 119 108 created=datetime.now(), 120 109 last_updated=datetime.now(), 121 110 ) 111 + 122 112 123 113 # Update index 124 114 index.add_user(user_metadata) ··· 146 136 147 137 user.update_timestamp() 148 138 139 + 149 140 # Update index 150 141 index.add_user(user) 151 142 self._save_index(index) 152 143 153 144 return True 154 145 155 - def add_zulip_association(self, username: str, server: str, user_id: str) -> bool: 156 - """Add a Zulip association to a user.""" 157 - index = self._load_index() 158 - user = index.get_user(username) 159 - 160 - if not user: 161 - return False 162 - 163 - result = user.add_zulip_association(server, user_id) 164 - if result: 165 - index.add_user(user) 166 - self._save_index(index) 167 - 168 - return result 169 - 170 - def remove_zulip_association( 171 - self, username: str, server: str, user_id: str 172 - ) -> bool: 173 - """Remove a Zulip association from a user.""" 174 - index = 
self._load_index() 175 - user = index.get_user(username) 176 - 177 - if not user: 178 - return False 179 - 180 - result = user.remove_zulip_association(server, user_id) 181 - if result: 182 - index.add_user(user) 183 - self._save_index(index) 184 - 185 - return result 186 - 187 - def get_zulip_associations(self, username: str) -> list: 188 - """Get all Zulip associations for a user.""" 189 - user = self.get_user(username) 190 - if user: 191 - return user.zulip_associations 192 - return [] 193 - 194 146 def store_entry(self, username: str, entry: AtomEntry) -> bool: 195 147 """Store an entry in the user's directory.""" 196 148 user = self.get_user(username) ··· 199 151 200 152 # Sanitize entry ID for filename 201 153 from .feed_parser import FeedParser 202 - 203 154 parser = FeedParser() 204 155 safe_id = parser.sanitize_entry_id(entry.id) 205 156 ··· 212 163 213 164 # Save entry 214 165 with open(entry_path, "w") as f: 215 - json.dump( 216 - entry.model_dump(mode="json", exclude_none=True), 217 - f, 218 - indent=2, 219 - default=str, 220 - ) 166 + json.dump(entry.model_dump(mode="json", exclude_none=True), f, indent=2, default=str) 221 167 222 168 # Update user metadata if new entry 223 169 if not entry_exists: ··· 235 181 236 182 # Sanitize entry ID 237 183 from .feed_parser import FeedParser 238 - 239 184 parser = FeedParser() 240 185 safe_id = parser.sanitize_entry_id(entry_id) 241 186 ··· 248 193 249 194 return AtomEntry(**data) 250 195 251 - def list_entries( 252 - self, username: str, limit: Optional[int] = None 253 - ) -> list[AtomEntry]: 196 + def list_entries(self, username: str, limit: Optional[int] = None) -> list[AtomEntry]: 254 197 """List entries for a user.""" 255 198 user = self.get_user(username) 256 199 if not user: ··· 261 204 return [] 262 205 263 206 entries = [] 264 - entry_files = sorted( 265 - user_dir.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True 266 - ) 207 + entry_files = sorted(user_dir.glob("*.json"), key=lambda p: 
p.stat().st_mtime, reverse=True) 208 + 267 209 268 210 if limit: 269 211 entry_files = entry_files[:limit] ··· 318 260 "total_entries": index.total_entries, 319 261 "total_duplicates": len(duplicates.duplicates), 320 262 "last_updated": index.last_updated, 321 - "repository_size": sum( 322 - f.stat().st_size for f in self.repo_path.rglob("*") if f.is_file() 323 - ), 263 + "repository_size": sum(f.stat().st_size for f in self.repo_path.rglob("*") if f.is_file()), 324 264 } 325 265 326 - def search_entries( 327 - self, query: str, username: Optional[str] = None, limit: Optional[int] = None 328 - ) -> list[tuple[str, AtomEntry]]: 266 + def search_entries(self, query: str, username: Optional[str] = None, 267 + limit: Optional[int] = None) -> list[tuple[str, AtomEntry]]: 329 268 """Search entries by content.""" 330 269 results = [] 331 270 ··· 349 288 entry = AtomEntry(**data) 350 289 351 290 # Simple text search in title, summary, and content 352 - searchable_text = " ".join( 353 - filter( 354 - None, 355 - [ 356 - entry.title, 357 - entry.summary or "", 358 - entry.content or "", 359 - ], 360 - ) 361 - ).lower() 291 + searchable_text = " ".join(filter(None, [ 292 + entry.title, 293 + entry.summary or "", 294 + entry.content or "", 295 + ])).lower() 362 296 363 297 if query.lower() in searchable_text: 364 298 results.append((user.username, entry)) ··· 374 308 results.sort(key=lambda x: x[1].updated, reverse=True) 375 309 376 310 return results[:limit] if limit else results 377 - 378 - def list_users(self) -> list[str]: 379 - """Get list of all usernames in the git store.""" 380 - index = self._load_index() 381 - return list(index.users.keys()) 382 - 383 - def get_user_feeds(self, username: str) -> list[str]: 384 - """Get list of feed URLs for a specific user from their metadata.""" 385 - user = self.get_user(username) 386 - if not user: 387 - return [] 388 - 389 - # Feed URLs are stored in the user metadata 390 - return user.feeds 391 - 392 - def 
list_all_users_with_feeds(self) -> list[tuple[str, list[str]]]: 393 - """Get all users and their feed URLs.""" 394 - result = [] 395 - for username in self.list_users(): 396 - feeds = self.get_user_feeds(username) 397 - if feeds: # Only include users that have feeds configured 398 - result.append((username, feeds)) 399 - return result
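For readers following the `search_entries` hunk above: the reformatted expression is a plain case-insensitive substring match over three fields. The sketch below isolates that logic. `Entry` and `matches` are hypothetical stand-ins introduced for illustration, not names from thicket; the real method works on `AtomEntry` objects loaded from the Git store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for AtomEntry, holding only the searched fields."""

    title: str
    summary: Optional[str] = None
    content: Optional[str] = None


def matches(entry: Entry, query: str) -> bool:
    """Case-insensitive substring match over title, summary and content."""
    parts = [entry.title, entry.summary or "", entry.content or ""]
    # filter(None, ...) drops the empty strings so no stray spaces are joined in
    searchable_text = " ".join(filter(None, parts)).lower()
    return query.lower() in searchable_text


entries = [
    Entry(title="Hello OCaml", summary="Notes on effects"),
    Entry(title="Feeds", content="<p>Atom and RSS parsing</p>"),
]
hits = [e.title for e in entries if matches(e, "atom")]
print(hits)
```

Because the match runs over raw HTML in `content`, a query can also hit tag or attribute text; that is a property of this simple approach, not something the diff changes.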

-166

src/thicket/core/opml_generator.py

··· 1 - """OPML generation for thicket.""" 2 - 3 - import xml.etree.ElementTree as ET 4 - from datetime import datetime 5 - from pathlib import Path 6 - from typing import Optional 7 - from xml.dom import minidom 8 - 9 - from ..models import UserMetadata 10 - 11 - 12 - class OPMLGenerator: 13 - """Generates OPML files from feed collections.""" 14 - 15 - def __init__(self) -> None: 16 - """Initialize the OPML generator.""" 17 - pass 18 - 19 - def generate_opml( 20 - self, 21 - users: dict[str, UserMetadata], 22 - title: str = "Thicket Feeds", 23 - output_path: Optional[Path] = None, 24 - ) -> str: 25 - """Generate OPML XML content from user metadata. 26 - 27 - Args: 28 - users: Dictionary of username -> UserMetadata 29 - title: Title for the OPML file 30 - output_path: Optional path to write the OPML file 31 - 32 - Returns: 33 - OPML XML content as string 34 - """ 35 - # Create root OPML element 36 - opml = ET.Element("opml", version="2.0") 37 - 38 - # Create head section 39 - head = ET.SubElement(opml, "head") 40 - title_elem = ET.SubElement(head, "title") 41 - title_elem.text = title 42 - 43 - date_created = ET.SubElement(head, "dateCreated") 44 - date_created.text = datetime.now().strftime("%a, %d %b %Y %H:%M:%S %z") 45 - 46 - date_modified = ET.SubElement(head, "dateModified") 47 - date_modified.text = datetime.now().strftime("%a, %d %b %Y %H:%M:%S %z") 48 - 49 - # Create body section 50 - body = ET.SubElement(opml, "body") 51 - 52 - # Add each user as an outline with their feeds as sub-outlines 53 - for username, user_metadata in sorted(users.items()): 54 - user_outline = ET.SubElement(body, "outline") 55 - user_outline.set("text", user_metadata.display_name or username) 56 - user_outline.set("title", user_metadata.display_name or username) 57 - 58 - # Add user metadata as attributes if available 59 - if user_metadata.homepage: 60 - user_outline.set("htmlUrl", user_metadata.homepage) 61 - if user_metadata.email: 62 - user_outline.set("email", 
user_metadata.email) 63 - 64 - # Add each feed as a sub-outline 65 - for feed_url in sorted(user_metadata.feeds): 66 - feed_outline = ET.SubElement(user_outline, "outline") 67 - feed_outline.set("type", "rss") 68 - feed_outline.set("text", feed_url) 69 - feed_outline.set("title", feed_url) 70 - feed_outline.set("xmlUrl", feed_url) 71 - feed_outline.set("htmlUrl", feed_url) 72 - 73 - # Convert to pretty-printed XML string 74 - xml_str = self._prettify_xml(opml) 75 - 76 - # Write to file if path provided 77 - if output_path: 78 - output_path.write_text(xml_str, encoding="utf-8") 79 - 80 - return xml_str 81 - 82 - def _prettify_xml(self, elem: ET.Element) -> str: 83 - """Return a pretty-printed XML string for the Element.""" 84 - rough_string = ET.tostring(elem, encoding="unicode") 85 - reparsed = minidom.parseString(rough_string) 86 - return reparsed.toprettyxml(indent=" ") 87 - 88 - def generate_flat_opml( 89 - self, 90 - users: dict[str, UserMetadata], 91 - title: str = "Thicket Feeds (Flat)", 92 - output_path: Optional[Path] = None, 93 - ) -> str: 94 - """Generate a flat OPML file with all feeds at the top level. 95 - 96 - This format may be more compatible with some feed readers. 
97 - 98 - Args: 99 - users: Dictionary of username -> UserMetadata 100 - title: Title for the OPML file 101 - output_path: Optional path to write the OPML file 102 - 103 - Returns: 104 - OPML XML content as string 105 - """ 106 - # Create root OPML element 107 - opml = ET.Element("opml", version="2.0") 108 - 109 - # Create head section 110 - head = ET.SubElement(opml, "head") 111 - title_elem = ET.SubElement(head, "title") 112 - title_elem.text = title 113 - 114 - date_created = ET.SubElement(head, "dateCreated") 115 - date_created.text = datetime.now().strftime("%a, %d %b %Y %H:%M:%S %z") 116 - 117 - date_modified = ET.SubElement(head, "dateModified") 118 - date_modified.text = datetime.now().strftime("%a, %d %b %Y %H:%M:%S %z") 119 - 120 - # Create body section 121 - body = ET.SubElement(opml, "body") 122 - 123 - # Collect all feeds with their associated user info 124 - all_feeds = [] 125 - for username, user_metadata in users.items(): 126 - for feed_url in user_metadata.feeds: 127 - all_feeds.append( 128 - { 129 - "url": feed_url, 130 - "username": username, 131 - "display_name": user_metadata.display_name or username, 132 - "homepage": user_metadata.homepage, 133 - } 134 - ) 135 - 136 - # Sort feeds by URL for consistency 137 - all_feeds.sort(key=lambda f: f["url"] or "") 138 - 139 - # Add each feed as a top-level outline 140 - for feed_info in all_feeds: 141 - feed_outline = ET.SubElement(body, "outline") 142 - feed_outline.set("type", "rss") 143 - 144 - # Create a descriptive title that includes the user 145 - title_text = f"{feed_info['display_name']}: {feed_info['url']}" 146 - feed_outline.set("text", title_text) 147 - feed_outline.set("title", title_text) 148 - url = feed_info["url"] or "" 149 - feed_outline.set("xmlUrl", url) 150 - homepage_url = feed_info.get("homepage") or url 151 - feed_outline.set("htmlUrl", homepage_url or "") 152 - 153 - # Add custom attributes for user info 154 - feed_outline.set("thicketUser", feed_info["username"] or "") 155 - 
homepage = feed_info.get("homepage") 156 - if homepage: 157 - feed_outline.set("thicketHomepage", homepage) 158 - 159 - # Convert to pretty-printed XML string 160 - xml_str = self._prettify_xml(opml) 161 - 162 - # Write to file if path provided 163 - if output_path: 164 - output_path.write_text(xml_str, encoding="utf-8") 165 - 166 - return xml_str
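As a rough illustration of the nested layout that `generate_opml` produces (one outline per user, one child outline per feed), here is a minimal standalone sketch. `build_opml` and its `feeds_by_user` argument are assumptions made for this example and take plain strings rather than `UserMetadata`. One deliberate difference: the dates use a timezone-aware `datetime` with `email.utils.format_datetime`, since `strftime("%z")` on a naive `datetime.now()` formats the offset as an empty string.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_opml(feeds_by_user: dict[str, list[str]], title: str = "Thicket Feeds") -> str:
    """Build a nested OPML 2.0 document: users as outlines, feeds as children."""
    opml = ET.Element("opml", version="2.0")

    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    # RFC 822 style date with an explicit +0000 offset
    ET.SubElement(head, "dateCreated").text = format_datetime(datetime.now(timezone.utc))

    body = ET.SubElement(opml, "body")
    for username, feeds in sorted(feeds_by_user.items()):
        user_outline = ET.SubElement(body, "outline", text=username, title=username)
        for feed_url in sorted(feeds):
            ET.SubElement(
                user_outline,
                "outline",
                type="rss",
                text=feed_url,
                title=feed_url,
                xmlUrl=feed_url,
            )

    ET.indent(opml, space="  ")  # pretty-print in place (Python 3.9+)
    return ET.tostring(opml, encoding="unicode")


xml_text = build_opml(
    {
        "bob": ["https://b.example/feed.xml"],
        "alice": ["https://a.example/atom.xml"],
    }
)
print(xml_text)
```

Using `ET.indent` avoids the round trip through `minidom` that `_prettify_xml` performs, at the cost of requiring Python 3.9 or later.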

+438

src/thicket/core/reference_parser.py

··· 1 + """Reference detection and parsing for blog entries.""" 2 + 3 + import re 4 + from typing import Optional 5 + from urllib.parse import urlparse 6 + 7 + from ..models import AtomEntry 8 + 9 + 10 + class BlogReference: 11 + """Represents a reference from one blog entry to another.""" 12 + 13 + def __init__( 14 + self, 15 + source_entry_id: str, 16 + source_username: str, 17 + target_url: str, 18 + target_username: Optional[str] = None, 19 + target_entry_id: Optional[str] = None, 20 + ): 21 + self.source_entry_id = source_entry_id 22 + self.source_username = source_username 23 + self.target_url = target_url 24 + self.target_username = target_username 25 + self.target_entry_id = target_entry_id 26 + 27 + def to_dict(self) -> dict: 28 + """Convert to dictionary for JSON serialization.""" 29 + result = { 30 + "source_entry_id": self.source_entry_id, 31 + "source_username": self.source_username, 32 + "target_url": self.target_url, 33 + } 34 + 35 + # Only include optional fields if they are not None 36 + if self.target_username is not None: 37 + result["target_username"] = self.target_username 38 + if self.target_entry_id is not None: 39 + result["target_entry_id"] = self.target_entry_id 40 + 41 + return result 42 + 43 + @classmethod 44 + def from_dict(cls, data: dict) -> "BlogReference": 45 + """Create from dictionary.""" 46 + return cls( 47 + source_entry_id=data["source_entry_id"], 48 + source_username=data["source_username"], 49 + target_url=data["target_url"], 50 + target_username=data.get("target_username"), 51 + target_entry_id=data.get("target_entry_id"), 52 + ) 53 + 54 + 55 + class ReferenceIndex: 56 + """Index of blog-to-blog references for creating threaded views.""" 57 + 58 + def __init__(self): 59 + self.references: list[BlogReference] = [] 60 + self.outbound_refs: dict[ 61 + str, list[BlogReference] 62 + ] = {} # entry_id -> outbound refs 63 + self.inbound_refs: dict[ 64 + str, list[BlogReference] 65 + ] = {} # entry_id -> inbound refs 66 + 
self.user_domains: dict[str, set[str]] = {} # username -> set of domains 67 + 68 + def add_reference(self, ref: BlogReference) -> None: 69 + """Add a reference to the index.""" 70 + self.references.append(ref) 71 + 72 + # Update outbound references 73 + source_key = f"{ref.source_username}:{ref.source_entry_id}" 74 + if source_key not in self.outbound_refs: 75 + self.outbound_refs[source_key] = [] 76 + self.outbound_refs[source_key].append(ref) 77 + 78 + # Update inbound references if we can identify the target 79 + if ref.target_username and ref.target_entry_id: 80 + target_key = f"{ref.target_username}:{ref.target_entry_id}" 81 + if target_key not in self.inbound_refs: 82 + self.inbound_refs[target_key] = [] 83 + self.inbound_refs[target_key].append(ref) 84 + 85 + def get_outbound_refs(self, username: str, entry_id: str) -> list[BlogReference]: 86 + """Get all outbound references from an entry.""" 87 + key = f"{username}:{entry_id}" 88 + return self.outbound_refs.get(key, []) 89 + 90 + def get_inbound_refs(self, username: str, entry_id: str) -> list[BlogReference]: 91 + """Get all inbound references to an entry.""" 92 + key = f"{username}:{entry_id}" 93 + return self.inbound_refs.get(key, []) 94 + 95 + def get_thread_members(self, username: str, entry_id: str) -> set[tuple[str, str]]: 96 + """Get all entries that are part of the same thread.""" 97 + visited = set() 98 + to_visit = [(username, entry_id)] 99 + thread_members = set() 100 + 101 + while to_visit: 102 + current_user, current_entry = to_visit.pop() 103 + if (current_user, current_entry) in visited: 104 + continue 105 + 106 + visited.add((current_user, current_entry)) 107 + thread_members.add((current_user, current_entry)) 108 + 109 + # Add outbound references 110 + for ref in self.get_outbound_refs(current_user, current_entry): 111 + if ref.target_username and ref.target_entry_id: 112 + to_visit.append((ref.target_username, ref.target_entry_id)) 113 + 114 + # Add inbound references 115 + for ref in 
self.get_inbound_refs(current_user, current_entry): 116 + to_visit.append((ref.source_username, ref.source_entry_id)) 117 + 118 + return thread_members 119 + 120 + def to_dict(self) -> dict: 121 + """Convert to dictionary for JSON serialization.""" 122 + return { 123 + "references": [ref.to_dict() for ref in self.references], 124 + "user_domains": {k: list(v) for k, v in self.user_domains.items()}, 125 + } 126 + 127 + @classmethod 128 + def from_dict(cls, data: dict) -> "ReferenceIndex": 129 + """Create from dictionary.""" 130 + index = cls() 131 + for ref_data in data.get("references", []): 132 + ref = BlogReference.from_dict(ref_data) 133 + index.add_reference(ref) 134 + 135 + for username, domains in data.get("user_domains", {}).items(): 136 + index.user_domains[username] = set(domains) 137 + 138 + return index 139 + 140 + 141 + class ReferenceParser: 142 + """Parses blog entries to detect references to other blogs.""" 143 + 144 + def __init__(self): 145 + # Common blog platforms and patterns 146 + self.blog_patterns = [ 147 + r"https?://[^/]+\.(?:org|com|net|io|dev|me|co\.uk)/.*", # Common blog domains 148 + r"https?://[^/]+\.github\.io/.*", # GitHub Pages 149 + r"https?://[^/]+\.substack\.com/.*", # Substack 150 + r"https?://medium\.com/.*", # Medium 151 + r"https?://[^/]+\.wordpress\.com/.*", # WordPress.com 152 + r"https?://[^/]+\.blogspot\.com/.*", # Blogger 153 + ] 154 + 155 + # Compile regex patterns 156 + self.link_pattern = re.compile( 157 + r'<a[^>]+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL 158 + ) 159 + self.url_pattern = re.compile(r'https?://[^\s<>"]+') 160 + 161 + def extract_links_from_html(self, html_content: str) -> list[tuple[str, str]]: 162 + """Extract all links from HTML content.""" 163 + links = [] 164 + 165 + # Extract links from <a> tags 166 + for match in self.link_pattern.finditer(html_content): 167 + url = match.group(1) 168 + text = re.sub( 169 + r"<[^>]+>", "", match.group(2) 170 + ).strip() # Remove HTML tags from 
link text 171 + links.append((url, text)) 172 + 173 + return links 174 + 175 + def is_blog_url(self, url: str) -> bool: 176 + """Check if a URL likely points to a blog post.""" 177 + for pattern in self.blog_patterns: 178 + if re.match(pattern, url): 179 + return True 180 + return False 181 + 182 + def _is_likely_blog_post_url(self, url: str) -> bool: 183 + """Check if a same-domain URL likely points to a blog post (not CSS, images, etc.).""" 184 + parsed_url = urlparse(url) 185 + path = parsed_url.path.lower() 186 + 187 + # Skip obvious non-blog content 188 + if any(path.endswith(ext) for ext in ['.css', '.js', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.ico', '.pdf', '.xml', '.json']): 189 + return False 190 + 191 + # Skip common non-blog paths 192 + if any(segment in path for segment in ['/static/', '/assets/', '/css/', '/js/', '/images/', '/img/', '/media/', '/uploads/']): 193 + return False 194 + 195 + # Skip fragment-only links (same page anchors) 196 + if not path or path == '/': 197 + return False 198 + 199 + # Look for positive indicators of blog posts 200 + # Common blog post patterns: dates, slugs, post indicators 201 + blog_indicators = [ 202 + r'/\d{4}/', # Year in path 203 + r'/\d{4}/\d{2}/', # Year/month in path 204 + r'/blog/', 205 + r'/post/', 206 + r'/posts/', 207 + r'/articles?/', 208 + r'/notes?/', 209 + r'/entries/', 210 + r'/writing/', 211 + ] 212 + 213 + for pattern in blog_indicators: 214 + if re.search(pattern, path): 215 + return True 216 + 217 + # If it has a reasonable path depth and doesn't match exclusions, likely a blog post 218 + path_segments = [seg for seg in path.split('/') if seg] 219 + return len(path_segments) >= 1 # At least one meaningful path segment 220 + 221 + def resolve_target_user( 222 + self, url: str, user_domains: dict[str, set[str]] 223 + ) -> Optional[str]: 224 + """Try to resolve a URL to a known user based on domain mapping.""" 225 + parsed_url = urlparse(url) 226 + domain = parsed_url.netloc.lower() 227 + 228 + 
for username, domains in user_domains.items(): 229 + if domain in domains: 230 + return username 231 + 232 + return None 233 + 234 + def extract_references( 235 + self, entry: AtomEntry, username: str, user_domains: dict[str, set[str]] 236 + ) -> list[BlogReference]: 237 + """Extract all blog references from an entry.""" 238 + references = [] 239 + 240 + # Combine all text content for analysis 241 + content_to_search = [] 242 + if entry.content: 243 + content_to_search.append(entry.content) 244 + if entry.summary: 245 + content_to_search.append(entry.summary) 246 + 247 + for content in content_to_search: 248 + links = self.extract_links_from_html(content) 249 + 250 + for url, _link_text in links: 251 + entry_domain = ( 252 + urlparse(str(entry.link)).netloc.lower() if entry.link else "" 253 + ) 254 + link_domain = urlparse(url).netloc.lower() 255 + 256 + # Check if this looks like a blog URL 257 + if not self.is_blog_url(url): 258 + continue 259 + 260 + # For same-domain links, apply additional filtering to avoid non-blog content 261 + if link_domain == entry_domain: 262 + # Only include same-domain links that look like blog posts 263 + if not self._is_likely_blog_post_url(url): 264 + continue 265 + 266 + # Try to resolve to a known user 267 + if link_domain == entry_domain: 268 + # Same domain - target user is the same as source user 269 + target_username: Optional[str] = username 270 + else: 271 + # Different domain - try to resolve 272 + target_username = self.resolve_target_user(url, user_domains) 273 + 274 + ref = BlogReference( 275 + source_entry_id=entry.id, 276 + source_username=username, 277 + target_url=url, 278 + target_username=target_username, 279 + target_entry_id=None, # Will be resolved later if possible 280 + ) 281 + 282 + references.append(ref) 283 + 284 + return references 285 + 286 + def build_user_domain_mapping(self, git_store: "GitStore") -> dict[str, set[str]]: 287 + """Build mapping of usernames to their known domains.""" 288 + user_domains 
= {} 289 + index = git_store._load_index() 290 + 291 + for username, user_metadata in index.users.items(): 292 + domains = set() 293 + 294 + # Add domains from feeds 295 + for feed_url in user_metadata.feeds: 296 + domain = urlparse(feed_url).netloc.lower() 297 + if domain: 298 + domains.add(domain) 299 + 300 + # Add domain from homepage 301 + if user_metadata.homepage: 302 + domain = urlparse(str(user_metadata.homepage)).netloc.lower() 303 + if domain: 304 + domains.add(domain) 305 + 306 + user_domains[username] = domains 307 + 308 + return user_domains 309 + 310 + def _build_url_to_entry_mapping(self, git_store: "GitStore") -> dict[str, str]: 311 + """Build a comprehensive mapping from URLs to entry IDs using git store data. 312 + 313 + This creates a bidirectional mapping that handles: 314 + - Entry link URLs -> Entry IDs 315 + - URL variations (with/without www, http/https) 316 + - Multiple URLs pointing to the same entry 317 + """ 318 + url_to_entry: dict[str, str] = {} 319 + 320 + # Load index to get all users 321 + index = git_store._load_index() 322 + 323 + for username in index.users.keys(): 324 + entries = git_store.list_entries(username) 325 + 326 + for entry in entries: 327 + if entry.link: 328 + link_url = str(entry.link) 329 + entry_id = entry.id 330 + 331 + # Map the canonical link URL 332 + url_to_entry[link_url] = entry_id 333 + 334 + # Handle common URL variations 335 + parsed = urlparse(link_url) 336 + if parsed.netloc and parsed.path: 337 + # Add version without www 338 + if parsed.netloc.startswith('www.'): 339 + no_www_url = f"{parsed.scheme}://{parsed.netloc[4:]}{parsed.path}" 340 + if parsed.query: 341 + no_www_url += f"?{parsed.query}" 342 + if parsed.fragment: 343 + no_www_url += f"#{parsed.fragment}" 344 + url_to_entry[no_www_url] = entry_id 345 + 346 + # Add version with www if not present 347 + elif not parsed.netloc.startswith('www.'): 348 + www_url = f"{parsed.scheme}://www.{parsed.netloc}{parsed.path}" 349 + if parsed.query: 350 + 
www_url += f"?{parsed.query}" 351 + if parsed.fragment: 352 + www_url += f"#{parsed.fragment}" 353 + url_to_entry[www_url] = entry_id 354 + 355 + # Add http/https variations 356 + if parsed.scheme == 'https': 357 + http_url = link_url.replace('https://', 'http://', 1) 358 + url_to_entry[http_url] = entry_id 359 + elif parsed.scheme == 'http': 360 + https_url = link_url.replace('http://', 'https://', 1) 361 + url_to_entry[https_url] = entry_id 362 + 363 + return url_to_entry 364 + 365 + def _normalize_url(self, url: str) -> str: 366 + """Normalize URL for consistent matching. 367 + 368 + Handles common variations like trailing slashes, fragments, etc. 369 + """ 370 + parsed = urlparse(url) 371 + 372 + # Remove trailing slash from path 373 + path = parsed.path.rstrip('/') if parsed.path != '/' else parsed.path 374 + 375 + # Reconstruct without fragment for consistent matching 376 + normalized = f"{parsed.scheme}://{parsed.netloc}{path}" 377 + if parsed.query: 378 + normalized += f"?{parsed.query}" 379 + 380 + return normalized 381 + 382 + def resolve_target_entry_ids( 383 + self, references: list[BlogReference], git_store: "GitStore" 384 + ) -> list[BlogReference]: 385 + """Resolve target_entry_id for references using comprehensive URL mapping.""" 386 + resolved_refs = [] 387 + 388 + # Build comprehensive URL to entry ID mapping 389 + url_to_entry = self._build_url_to_entry_mapping(git_store) 390 + 391 + for ref in references: 392 + # If we already have a target_entry_id, keep the reference as-is 393 + if ref.target_entry_id is not None: 394 + resolved_refs.append(ref) 395 + continue 396 + 397 + # If we don't have a target_username, we can't resolve it 398 + if ref.target_username is None: 399 + resolved_refs.append(ref) 400 + continue 401 + 402 + # Try to resolve using URL mapping 403 + resolved_entry_id = None 404 + 405 + # First, try exact match 406 + if ref.target_url in url_to_entry: 407 + resolved_entry_id = url_to_entry[ref.target_url] 408 + else: 409 + # Try 
normalized URL matching 410 + normalized_target = self._normalize_url(ref.target_url) 411 + if normalized_target in url_to_entry: 412 + resolved_entry_id = url_to_entry[normalized_target] 413 + else: 414 + # Try URL variations 415 + for mapped_url, entry_id in url_to_entry.items(): 416 + if self._normalize_url(mapped_url) == normalized_target: 417 + resolved_entry_id = entry_id 418 + break 419 + 420 + # Verify the resolved entry belongs to the target username 421 + if resolved_entry_id: 422 + # Double-check by loading the actual entry 423 + entries = git_store.list_entries(ref.target_username) 424 + entry_found = any(entry.id == resolved_entry_id for entry in entries) 425 + if not entry_found: 426 + resolved_entry_id = None 427 + 428 + # Create a new reference with the resolved target_entry_id 429 + resolved_ref = BlogReference( 430 + source_entry_id=ref.source_entry_id, 431 + source_username=ref.source_username, 432 + target_url=ref.target_url, 433 + target_username=ref.target_username, 434 + target_entry_id=resolved_entry_id, 435 + ) 436 + resolved_refs.append(resolved_ref) 437 + 438 + return resolved_refs
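The URL resolution in `reference_parser.py` stores several spellings of each entry link (with and without `www.`, `http` and `https`) and falls back to a linear scan with `_normalize_url`. A possible alternative, sketched here under assumptions and not taken from the diff, is to reduce every URL to one canonical key before it goes into the mapping, so each lookup is a single dictionary access. `normalize_url` and `build_lookup` are hypothetical names; treating `http` and `https` as equivalent and dropping `www.` are choices made for this example.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a canonical key.

    Lowercases the host, drops a leading "www.", forces the https scheme,
    strips the fragment, and removes a trailing slash from the path.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    normalized = f"https://{host}{path}"
    if parts.query:
        normalized += f"?{parts.query}"
    return normalized


def build_lookup(links_to_ids: dict[str, str]) -> dict[str, str]:
    """Map canonical URL keys to entry IDs."""
    return {normalize_url(link): entry_id for link, entry_id in links_to_ids.items()}


lookup = build_lookup({"http://www.example.org/2024/post/": "entry-1"})
resolved = lookup.get(normalize_url("https://example.org/2024/post#comments"))
print(resolved)
```

If two users publish entries whose links collapse to the same key, the later one wins in this sketch, so an ownership check like the `list_entries` verification in `resolve_target_entry_ids` would still be needed.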

-428

src/thicket/core/typesense_client.py

··· 1 - """Typesense integration for thicket.""" 2 - 3 - import json 4 - import logging 5 - from datetime import datetime 6 - from typing import Any, Optional 7 - from urllib.parse import urlparse 8 - 9 - import typesense 10 - from pydantic import BaseModel, ConfigDict 11 - 12 - from ..models.config import ThicketConfig, UserConfig 13 - from ..models.feed import AtomEntry 14 - from ..models.user import UserMetadata 15 - from .git_store import GitStore 16 - 17 - logger = logging.getLogger(__name__) 18 - 19 - 20 - class TypesenseConfig(BaseModel): 21 - """Configuration for Typesense connection.""" 22 - 23 - model_config = ConfigDict(str_strip_whitespace=True) 24 - 25 - host: str 26 - port: int = 8108 27 - protocol: str = "http" 28 - api_key: str 29 - connection_timeout: int = 5 30 - collection_name: str = "thicket_entries" 31 - 32 - @classmethod 33 - def from_url( 34 - cls, url: str, api_key: str, collection_name: str = "thicket_entries" 35 - ) -> "TypesenseConfig": 36 - """Create config from Typesense URL.""" 37 - parsed = urlparse(url) 38 - return cls( 39 - host=parsed.hostname or "localhost", 40 - port=parsed.port or (443 if parsed.scheme == "https" else 8108), 41 - protocol=parsed.scheme or "http", 42 - api_key=api_key, 43 - collection_name=collection_name, 44 - ) 45 - 46 - 47 - class TypesenseDocument(BaseModel): 48 - """Document model for Typesense indexing.""" 49 - 50 - model_config = ConfigDict( 51 - json_encoders={datetime: lambda v: int(v.timestamp())}, 52 - str_strip_whitespace=True, 53 - ) 54 - 55 - # Primary fields from AtomEntry 56 - id: str # Sanitized entry ID 57 - original_id: str # Original Atom ID 58 - title: str 59 - link: str 60 - updated: int # Unix timestamp 61 - published: Optional[int] = None # Unix timestamp 62 - summary: Optional[str] = None 63 - content: Optional[str] = None 64 - content_type: str = "html" 65 - categories: list[str] = [] 66 - rights: Optional[str] = None 67 - source: Optional[str] = None 68 - 69 - # User/feed metadata 70 - 
username: str 71 - user_display_name: Optional[str] = None 72 - user_email: Optional[str] = None 73 - user_homepage: Optional[str] = None 74 - user_icon: Optional[str] = None 75 - 76 - # Author information from entry 77 - author_name: Optional[str] = None 78 - author_email: Optional[str] = None 79 - author_uri: Optional[str] = None 80 - 81 - # Searchable text fields for embedding/semantic search 82 - searchable_content: str # Combined title + summary + content 83 - searchable_metadata: str # Combined user info + categories + author 84 - 85 - @classmethod 86 - def from_atom_entry_with_metadata( 87 - cls, 88 - entry: AtomEntry, 89 - sanitized_id: str, 90 - user_metadata: "UserMetadata", # Import will be added at top 91 - ) -> "TypesenseDocument": 92 - """Create TypesenseDocument from AtomEntry and UserMetadata from git store.""" 93 - # Extract author information if available 94 - author_name = None 95 - author_email = None 96 - author_uri = None 97 - if entry.author: 98 - author_name = entry.author.get("name") 99 - author_email = entry.author.get("email") 100 - author_uri = entry.author.get("uri") 101 - 102 - # Create searchable content combining all text fields 103 - content_parts = [entry.title] 104 - if entry.summary: 105 - content_parts.append(entry.summary) 106 - if entry.content: 107 - content_parts.append(entry.content) 108 - searchable_content = " ".join(content_parts) 109 - 110 - # Create searchable metadata 111 - metadata_parts = [user_metadata.username] 112 - if user_metadata.display_name: 113 - metadata_parts.append(user_metadata.display_name) 114 - if author_name: 115 - metadata_parts.append(author_name) 116 - if entry.categories: 117 - metadata_parts.extend(entry.categories) 118 - searchable_metadata = " ".join(metadata_parts) 119 - 120 - return cls( 121 - id=sanitized_id, 122 - original_id=entry.id, 123 - title=entry.title, 124 - link=str(entry.link), 125 - updated=int(entry.updated.timestamp()), 126 - published=int(entry.published.timestamp()) if 
entry.published else None, 127 - summary=entry.summary, 128 - content=entry.content, 129 - content_type=entry.content_type or "html", 130 - categories=entry.categories, 131 - rights=entry.rights, 132 - source=entry.source, 133 - username=user_metadata.username, 134 - user_display_name=user_metadata.display_name, 135 - user_email=user_metadata.email, 136 - user_homepage=user_metadata.homepage, 137 - user_icon=user_metadata.icon if user_metadata.icon != "None" else None, 138 - author_name=author_name, 139 - author_email=author_email, 140 - author_uri=author_uri, 141 - searchable_content=searchable_content, 142 - searchable_metadata=searchable_metadata, 143 - ) 144 - 145 - @classmethod 146 - def from_atom_entry( 147 - cls, 148 - entry: AtomEntry, 149 - sanitized_id: str, 150 - user_config: UserConfig, 151 - ) -> "TypesenseDocument": 152 - """Create TypesenseDocument from AtomEntry and UserConfig.""" 153 - # Extract author information if available 154 - author_name = None 155 - author_email = None 156 - author_uri = None 157 - if entry.author: 158 - author_name = entry.author.get("name") 159 - author_email = entry.author.get("email") 160 - author_uri = entry.author.get("uri") 161 - 162 - # Create searchable content combining all text fields 163 - content_parts = [entry.title] 164 - if entry.summary: 165 - content_parts.append(entry.summary) 166 - if entry.content: 167 - content_parts.append(entry.content) 168 - searchable_content = " ".join(content_parts) 169 - 170 - # Create searchable metadata 171 - metadata_parts = [user_config.username] 172 - if user_config.display_name: 173 - metadata_parts.append(user_config.display_name) 174 - if author_name: 175 - metadata_parts.append(author_name) 176 - if entry.categories: 177 - metadata_parts.extend(entry.categories) 178 - searchable_metadata = " ".join(metadata_parts) 179 - 180 - return cls( 181 - id=sanitized_id, 182 - original_id=entry.id, 183 - title=entry.title, 184 - link=str(entry.link), 185 - 
updated=int(entry.updated.timestamp()), 186 - published=int(entry.published.timestamp()) if entry.published else None, 187 - summary=entry.summary, 188 - content=entry.content, 189 - content_type=entry.content_type or "html", 190 - categories=entry.categories, 191 - rights=entry.rights, 192 - source=entry.source, 193 - username=user_config.username, 194 - user_display_name=user_config.display_name, 195 - user_email=str(user_config.email) if user_config.email else None, 196 - user_homepage=str(user_config.homepage) if user_config.homepage else None, 197 - user_icon=str(user_config.icon) if user_config.icon else None, 198 - author_name=author_name, 199 - author_email=author_email, 200 - author_uri=author_uri, 201 - searchable_content=searchable_content, 202 - searchable_metadata=searchable_metadata, 203 - ) 204 - 205 - 206 - class TypesenseClient: 207 - """Client for interacting with Typesense search engine.""" 208 - 209 - def __init__(self, config: TypesenseConfig): 210 - """Initialize Typesense client.""" 211 - self.config = config 212 - self.client = typesense.Client( 213 - { 214 - "nodes": [ 215 - { 216 - "host": config.host, 217 - "port": config.port, 218 - "protocol": config.protocol, 219 - } 220 - ], 221 - "api_key": config.api_key, 222 - "connection_timeout_seconds": config.connection_timeout, 223 - } 224 - ) 225 - 226 - def get_collection_schema(self) -> dict[str, Any]: 227 - """Get the Typesense collection schema for thicket entries.""" 228 - return { 229 - "name": self.config.collection_name, 230 - "fields": [ 231 - # Primary identifiers 232 - {"name": "id", "type": "string", "facet": False}, 233 - {"name": "original_id", "type": "string", "facet": False}, 234 - # Content fields - optimized for search 235 - {"name": "title", "type": "string", "facet": False}, 236 - {"name": "summary", "type": "string", "optional": True, "facet": False}, 237 - {"name": "content", "type": "string", "optional": True, "facet": False}, 238 - {"name": "content_type", "type": 
"string", "facet": True}, 239 - # Searchable combined fields for embeddings/semantic search 240 - {"name": "searchable_content", "type": "string", "facet": False}, 241 - {"name": "searchable_metadata", "type": "string", "facet": False}, 242 - # Temporal fields 243 - {"name": "updated", "type": "int64", "facet": False, "sort": True}, 244 - { 245 - "name": "published", 246 - "type": "int64", 247 - "optional": True, 248 - "facet": False, 249 - "sort": True, 250 - }, 251 - # Link and source 252 - {"name": "link", "type": "string", "facet": False}, 253 - {"name": "source", "type": "string", "optional": True, "facet": False}, 254 - # Categories and classification 255 - { 256 - "name": "categories", 257 - "type": "string[]", 258 - "facet": True, 259 - "optional": True, 260 - }, 261 - {"name": "rights", "type": "string", "optional": True, "facet": False}, 262 - # User/feed metadata - facetable for filtering 263 - {"name": "username", "type": "string", "facet": True}, 264 - { 265 - "name": "user_display_name", 266 - "type": "string", 267 - "optional": True, 268 - "facet": True, 269 - }, 270 - { 271 - "name": "user_email", 272 - "type": "string", 273 - "optional": True, 274 - "facet": False, 275 - }, 276 - { 277 - "name": "user_homepage", 278 - "type": "string", 279 - "optional": True, 280 - "facet": False, 281 - }, 282 - { 283 - "name": "user_icon", 284 - "type": "string", 285 - "optional": True, 286 - "facet": False, 287 - }, 288 - # Author information from entries 289 - { 290 - "name": "author_name", 291 - "type": "string", 292 - "optional": True, 293 - "facet": True, 294 - }, 295 - { 296 - "name": "author_email", 297 - "type": "string", 298 - "optional": True, 299 - "facet": False, 300 - }, 301 - { 302 - "name": "author_uri", 303 - "type": "string", 304 - "optional": True, 305 - "facet": False, 306 - }, 307 - ], 308 - "default_sorting_field": "updated", 309 - } 310 - 311 - def create_collection(self) -> dict[str, Any]: 312 - """Create the Typesense collection with the 
appropriate schema.""" 313 - try: 314 - # Try to delete existing collection first 315 - try: 316 - self.client.collections[self.config.collection_name].delete() 317 - logger.info( 318 - f"Deleted existing collection: {self.config.collection_name}" 319 - ) 320 - except typesense.exceptions.ObjectNotFound: 321 - logger.info( 322 - f"Collection {self.config.collection_name} does not exist, creating new one" 323 - ) 324 - 325 - # Create new collection 326 - schema = self.get_collection_schema() 327 - result = self.client.collections.create(schema) 328 - logger.info(f"Created collection: {self.config.collection_name}") 329 - return result 330 - 331 - except Exception as e: 332 - logger.error(f"Failed to create collection: {e}") 333 - raise 334 - 335 - def index_documents(self, documents: list[TypesenseDocument]) -> dict[str, Any]: 336 - """Index a batch of documents in Typesense.""" 337 - try: 338 - # Convert documents to dict format for Typesense 339 - document_dicts = [doc.model_dump() for doc in documents] 340 - 341 - # Use import endpoint for batch indexing 342 - result = self.client.collections[ 343 - self.config.collection_name 344 - ].documents.import_( 345 - document_dicts, 346 - {"action": "upsert"}, # Update if exists, insert if not 347 - ) 348 - 349 - logger.info(f"Indexed {len(documents)} documents") 350 - return result 351 - 352 - except Exception as e: 353 - logger.error(f"Failed to index documents: {e}") 354 - raise 355 - 356 - def upload_from_git_store( 357 - self, git_store: GitStore, config: ThicketConfig 358 - ) -> dict[str, Any]: 359 - """Upload all entries from the Git store to Typesense.""" 360 - logger.info("Starting Typesense upload from Git store") 361 - 362 - # Create collection 363 - self.create_collection() 364 - 365 - documents = [] 366 - index = git_store._load_index() 367 - 368 - for username, user_metadata in index.users.items(): 369 - logger.info(f"Processing entries for user: {username}") 370 - 371 - # Load user entries from directory 
372 - try: 373 - user_dir = git_store.repo_path / user_metadata.directory 374 - if not user_dir.exists(): 375 - logger.warning( 376 - f"Directory not found for user {username}: {user_dir}" 377 - ) 378 - continue 379 - 380 - entry_files = list(user_dir.glob("*.json")) 381 - logger.info(f"Found {len(entry_files)} entry files for {username}") 382 - 383 - for entry_file in entry_files: 384 - try: 385 - with open(entry_file) as f: 386 - data = json.load(f) 387 - 388 - entry = AtomEntry(**data) 389 - sanitized_id = entry_file.stem # filename without extension 390 - 391 - doc = TypesenseDocument.from_atom_entry_with_metadata( 392 - entry, sanitized_id, user_metadata 393 - ) 394 - documents.append(doc) 395 - except Exception as e: 396 - logger.error( 397 - f"Failed to convert entry {entry_file} to document: {e}" 398 - ) 399 - 400 - except Exception as e: 401 - logger.error(f"Failed to load entries for user {username}: {e}") 402 - 403 - if documents: 404 - logger.info(f"Uploading {len(documents)} documents to Typesense") 405 - result = self.index_documents(documents) 406 - logger.info("Upload completed successfully") 407 - return result 408 - else: 409 - logger.warning("No documents to upload") 410 - return {} 411 - 412 - def search( 413 - self, query: str, search_parameters: Optional[dict[str, Any]] = None 414 - ) -> dict[str, Any]: 415 - """Search the collection.""" 416 - default_params = { 417 - "q": query, 418 - "query_by": "title,searchable_content,searchable_metadata", 419 - "sort_by": "updated:desc", 420 - "per_page": 20, 421 - } 422 - 423 - if search_parameters: 424 - default_params.update(search_parameters) 425 - 426 - return self.client.collections[self.config.collection_name].documents.search( 427 - default_params 428 - )
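
The removed `search` method layers caller-supplied parameters over a fixed set of defaults with `dict.update`. A minimal sketch of that merge, independent of the Typesense client, might look like the following. `build_search_params` is a hypothetical helper name, and the `filter_by` value and the username `alice` are illustrative assumptions rather than anything taken from thicket.

```python
from typing import Any, Optional


def build_search_params(
    query: str, overrides: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Merge caller overrides into the default search parameters."""
    params: dict[str, Any] = {
        "q": query,
        "query_by": "title,searchable_content,searchable_metadata",
        "sort_by": "updated:desc",
        "per_page": 20,
    }
    if overrides:
        # Later keys win, so a caller can replace any default, including "q".
        params.update(overrides)
    return params


# Hypothetical call: narrow the page size and filter on a facetable field.
params = build_search_params("ocaml", {"per_page": 5, "filter_by": "username:=alice"})
```

Because the overrides are applied last, a caller that passes its own `q` silently replaces the query argument; whether that is desirable is a design choice the original code leaves open.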

+1 -2

src/thicket/models/__init__.py

··· 2 2 3 3 from .config import ThicketConfig, UserConfig 4 4 from .feed import AtomEntry, DuplicateMap, FeedMetadata 5 - from .user import GitStoreIndex, UserMetadata, ZulipAssociation 5 + from .user import GitStoreIndex, UserMetadata 6 6 7 7 __all__ = [ 8 8 "ThicketConfig", ··· 12 12 "FeedMetadata", 13 13 "GitStoreIndex", 14 14 "UserMetadata", 15 - "ZulipAssociation", 16 15 ]

+29 -25

src/thicket/models/config.py

··· 1 1 """Configuration models for thicket.""" 2 2 3 + import json 4 + import yaml 3 5 from pathlib import Path 4 - from typing import Optional 6 + from typing import Optional, Union 5 7 6 - from pydantic import BaseModel, EmailStr, HttpUrl 8 + from pydantic import BaseModel, EmailStr, HttpUrl, ValidationError 7 9 from pydantic_settings import BaseSettings, SettingsConfigDict 8 10 9 11 ··· 32 34 cache_dir: Path 33 35 users: list[UserConfig] = [] 34 36 35 - def find_user(self, username: str) -> Optional[UserConfig]: 36 - """Find a user by username.""" 37 - for user in self.users: 38 - if user.username == username: 39 - return user 40 - return None 41 - 42 - def add_user(self, user: UserConfig) -> bool: 43 - """Add a user to the configuration. Returns True if added, False if already exists.""" 44 - if self.find_user(user.username) is not None: 45 - return False 46 - self.users.append(user) 47 - return True 48 - 49 - def add_feed_to_user(self, username: str, feed_url: HttpUrl) -> bool: 50 - """Add a feed to an existing user. 
Returns True if added, False if user not found or feed already exists.""" 51 - user = self.find_user(username) 52 - if user is None: 53 - return False 54 - if feed_url in user.feeds: 55 - return False 56 - user.feeds.append(feed_url) 57 - return True 37 + @classmethod 38 + def from_file(cls, config_path: Path) -> 'ThicketConfig': 39 + """Load configuration from a file.""" 40 + if not config_path.exists(): 41 + raise FileNotFoundError(f"Configuration file not found: {config_path}") 42 + 43 + content = config_path.read_text(encoding='utf-8') 44 + 45 + if config_path.suffix.lower() in ['.yaml', '.yml']: 46 + try: 47 + data = yaml.safe_load(content) 48 + except yaml.YAMLError as e: 49 + raise ValueError(f"Invalid YAML in {config_path}: {e}") 50 + elif config_path.suffix.lower() == '.json': 51 + try: 52 + data = json.loads(content) 53 + except json.JSONDecodeError as e: 54 + raise ValueError(f"Invalid JSON in {config_path}: {e}") 55 + else: 56 + raise ValueError(f"Unsupported configuration file format: {config_path.suffix}") 57 + 58 + try: 59 + return cls(**data) 60 + except ValidationError as e: 61 + raise ValueError(f"Configuration validation error: {e}")
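
The new `from_file` classmethod dispatches on the file suffix and converts parse failures into `ValueError`. The sketch below shows the same shape using only the standard library, so it covers the JSON branch and leaves out the YAML branch (PyYAML is a third-party dependency) and the pydantic validation step. `load_config_data` is a hypothetical name, and the keys in the sample file are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_config_data(config_path: Path) -> dict:
    """Read a config file into a plain dict, dispatching on the suffix."""
    if not config_path.exists():
        raise FileNotFoundError(f"Configuration file not found: {config_path}")

    content = config_path.read_text(encoding="utf-8")
    suffix = config_path.suffix.lower()

    if suffix == ".json":
        try:
            return json.loads(content)
        except json.JSONDecodeError as e:
            # "from e" keeps the original parse error in the traceback.
            raise ValueError(f"Invalid JSON in {config_path}: {e}") from e

    # The YAML branch is omitted here; anything else is rejected.
    raise ValueError(f"Unsupported configuration file format: {suffix}")


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "thicket.json"
    path.write_text('{"cache_dir": "/tmp/cache", "users": []}', encoding="utf-8")
    data = load_config_data(path)
```

Chaining with `from e` is one small difference from the diff, which re-raises without it; both produce the same message.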

+2 -2

src/thicket/models/feed.py

··· 1 1 """Feed and entry models for thicket.""" 2 2 3 3 from datetime import datetime 4 - from typing import TYPE_CHECKING, Any, Optional 4 + from typing import TYPE_CHECKING, Optional 5 5 6 6 from pydantic import BaseModel, ConfigDict, EmailStr, HttpUrl 7 7 ··· 25 25 summary: Optional[str] = None 26 26 content: Optional[str] = None # Full body content from Atom entry 27 27 content_type: Optional[str] = "html" # text, html, xhtml 28 - author: Optional[dict[str, Any]] = None 28 + author: Optional[dict] = None 29 29 categories: list[str] = [] 30 30 rights: Optional[str] = None # Copyright info 31 31 source: Optional[str] = None # Source feed URL

+4 -41

src/thicket/models/user.py

··· 3 3 from datetime import datetime 4 4 from typing import Optional 5 5 6 - from pydantic import BaseModel, ConfigDict, Field 7 - 8 - 9 - class ZulipAssociation(BaseModel): 10 - """Association between a user and their Zulip identity.""" 11 - 12 - server: str # Zulip server URL (e.g., "yourorg.zulipchat.com") 13 - user_id: str # Zulip user ID or email for @mentions 14 - 15 - def __hash__(self) -> int: 16 - """Make hashable for use in sets.""" 17 - return hash((self.server, self.user_id)) 6 + from pydantic import BaseModel, ConfigDict 18 7 19 8 20 9 class UserMetadata(BaseModel): ··· 31 20 homepage: Optional[str] = None 32 21 icon: Optional[str] = None 33 22 feeds: list[str] = [] 34 - zulip_associations: list[ZulipAssociation] = Field( 35 - default_factory=list 36 - ) # Zulip server/user pairs 37 23 directory: str # Directory name in Git store 38 24 created: datetime 39 25 last_updated: datetime ··· 48 34 self.entry_count += count 49 35 self.update_timestamp() 50 36 51 - def add_zulip_association(self, server: str, user_id: str) -> bool: 52 - """Add a Zulip association if it doesn't exist. Returns True if added.""" 53 - association = ZulipAssociation(server=server, user_id=user_id) 54 - if association not in self.zulip_associations: 55 - self.zulip_associations.append(association) 56 - self.update_timestamp() 57 - return True 58 - return False 59 - 60 - def remove_zulip_association(self, server: str, user_id: str) -> bool: 61 - """Remove a Zulip association. 
Returns True if removed.""" 62 - association = ZulipAssociation(server=server, user_id=user_id) 63 - if association in self.zulip_associations: 64 - self.zulip_associations.remove(association) 65 - self.update_timestamp() 66 - return True 67 - return False 68 - 69 - def get_zulip_mention(self, server: str) -> Optional[str]: 70 - """Get the Zulip user_id for @mentions on a specific server.""" 71 - for association in self.zulip_associations: 72 - if association.server == server: 73 - return association.user_id 74 - return None 75 - 76 37 77 38 class GitStoreIndex(BaseModel): 78 39 """Index of all users and their directories in the Git store.""" 79 40 80 - model_config = ConfigDict(json_encoders={datetime: lambda v: v.isoformat()}) 41 + model_config = ConfigDict( 42 + json_encoders={datetime: lambda v: v.isoformat()} 43 + ) 81 44 82 45 users: dict[str, UserMetadata] = {} # username -> UserMetadata 83 46 created: datetime

+1

src/thicket/subsystems/__init__.py

··· 1 + """Thicket subsystems for specialized operations."""

+227

src/thicket/subsystems/feeds.py

··· 1 + """Feed management subsystem.""" 2 + 3 + import asyncio 4 + import json 5 + from datetime import datetime 6 + from pathlib import Path 7 + from typing import Callable, Optional 8 + 9 + from pydantic import HttpUrl 10 + 11 + from ..core.feed_parser import FeedParser 12 + from ..core.git_store import GitStore 13 + from ..models import AtomEntry, ThicketConfig 14 + 15 + 16 + class FeedManager: 17 + """Manages feed operations and caching.""" 18 + 19 + def __init__(self, git_store: GitStore, feed_parser: FeedParser, config: ThicketConfig): 20 + """Initialize feed manager.""" 21 + self.git_store = git_store 22 + self.feed_parser = feed_parser 23 + self.config = config 24 + self._ensure_cache_dir() 25 + 26 + def _ensure_cache_dir(self): 27 + """Ensure cache directory exists.""" 28 + self.config.cache_dir.mkdir(parents=True, exist_ok=True) 29 + 30 + async def sync_feeds(self, username: Optional[str] = None, progress_callback: Optional[Callable] = None) -> dict: 31 + """Sync feeds for all users or specific user.""" 32 + if username: 33 + return await self.sync_user_feeds(username, progress_callback) 34 + 35 + # Sync all users 36 + results = {} 37 + total_users = len(self.config.users) 38 + 39 + for i, user_config in enumerate(self.config.users): 40 + if progress_callback: 41 + progress_callback(f"Syncing feeds for {user_config.username}", i, total_users) 42 + 43 + user_results = await self.sync_user_feeds(user_config.username, progress_callback) 44 + results[user_config.username] = user_results 45 + 46 + return results 47 + 48 + async def sync_user_feeds(self, username: str, progress_callback: Optional[Callable] = None) -> dict: 49 + """Sync feeds for a specific user.""" 50 + user_config = next((u for u in self.config.users if u.username == username), None) 51 + if not user_config: 52 + return {'error': f'User {username} not found in configuration'} 53 + 54 + # Ensure user exists in git store 55 + git_user = self.git_store.get_user(username) 56 + if not git_user: 57 
+ self.git_store.add_user( 58 + username=user_config.username, 59 + display_name=user_config.display_name, 60 + email=str(user_config.email) if user_config.email else None, 61 + homepage=str(user_config.homepage) if user_config.homepage else None, 62 + icon=str(user_config.icon) if user_config.icon else None, 63 + feeds=[str(feed) for feed in user_config.feeds] 64 + ) 65 + 66 + results = { 67 + 'username': username, 68 + 'feeds_processed': 0, 69 + 'new_entries': 0, 70 + 'errors': [], 71 + 'feeds': {} 72 + } 73 + 74 + total_feeds = len(user_config.feeds) 75 + 76 + for i, feed_url in enumerate(user_config.feeds): 77 + if progress_callback: 78 + progress_callback(f"Processing feed {i+1}/{total_feeds} for {username}", i, total_feeds) 79 + 80 + try: 81 + feed_result = await self._sync_single_feed(username, feed_url) 82 + results['feeds'][str(feed_url)] = feed_result 83 + results['feeds_processed'] += 1 84 + results['new_entries'] += feed_result.get('new_entries', 0) 85 + except Exception as e: 86 + error_msg = f"Error syncing {feed_url}: {str(e)}" 87 + results['errors'].append(error_msg) 88 + results['feeds'][str(feed_url)] = {'error': error_msg} 89 + 90 + return results 91 + 92 + async def _sync_single_feed(self, username: str, feed_url: HttpUrl) -> dict: 93 + """Sync a single feed for a user.""" 94 + cache_key = self._get_cache_key(username, feed_url) 95 + last_modified = self._get_last_modified(cache_key) 96 + 97 + try: 98 + # Fetch feed content 99 + content = await self.feed_parser.fetch_feed(feed_url) 100 + 101 + # Parse feed 102 + feed_meta, entries = self.feed_parser.parse_feed(content, feed_url) 103 + 104 + # Filter new entries 105 + new_entries = [] 106 + for entry in entries: 107 + existing_entry = self.git_store.get_entry(username, entry.id) 108 + if not existing_entry: 109 + new_entries.append(entry) 110 + 111 + # Store new entries 112 + stored_count = 0 113 + for entry in new_entries: 114 + if self.git_store.store_entry(username, entry): 115 + stored_count 
+= 1 116 + 117 + # Update cache 118 + self._update_cache(cache_key, { 119 + 'last_fetched': datetime.now().isoformat(), 120 + 'feed_meta': feed_meta.model_dump(exclude_none=True), 121 + 'entry_count': len(entries), 122 + 'new_entries': stored_count, 123 + 'feed_url': str(feed_url) 124 + }) 125 + 126 + return { 127 + 'success': True, 128 + 'total_entries': len(entries), 129 + 'new_entries': stored_count, 130 + 'feed_title': feed_meta.title, 131 + 'last_fetched': datetime.now().isoformat() 132 + } 133 + 134 + except Exception as e: 135 + return { 136 + 'success': False, 137 + 'error': str(e), 138 + 'feed_url': str(feed_url) 139 + } 140 + 141 + def get_entries(self, username: str, limit: Optional[int] = None) -> list[AtomEntry]: 142 + """Get entries for a user.""" 143 + return self.git_store.list_entries(username, limit) 144 + 145 + def get_entry(self, username: str, entry_id: str) -> Optional[AtomEntry]: 146 + """Get a specific entry.""" 147 + return self.git_store.get_entry(username, entry_id) 148 + 149 + def search_entries(self, query: str, username: Optional[str] = None, limit: Optional[int] = None) -> list[tuple[str, AtomEntry]]: 150 + """Search entries across users.""" 151 + return self.git_store.search_entries(query, username, limit) 152 + 153 + def get_stats(self) -> dict: 154 + """Get feed-related statistics.""" 155 + index = self.git_store._load_index() 156 + 157 + feed_stats = { 158 + 'total_feeds_configured': sum(len(user.feeds) for user in self.config.users), 159 + 'users_with_entries': len([u for u in index.users.values() if u.entry_count > 0]), 160 + 'cache_files': len(list(self.config.cache_dir.glob("*.json"))) if self.config.cache_dir.exists() else 0, 161 + } 162 + 163 + return feed_stats 164 + 165 + def _get_cache_key(self, username: str, feed_url: HttpUrl) -> str: 166 + """Generate cache key for feed.""" 167 + # Simple hash of username and feed URL 168 + import hashlib 169 + key_data = f"{username}:{str(feed_url)}" 170 + return 
hashlib.md5(key_data.encode()).hexdigest() 171 + 172 + def _get_last_modified(self, cache_key: str) -> Optional[datetime]: 173 + """Get last modified time from cache.""" 174 + cache_file = self.config.cache_dir / f"{cache_key}.json" 175 + if cache_file.exists(): 176 + try: 177 + with open(cache_file) as f: 178 + data = json.load(f) 179 + return datetime.fromisoformat(data.get('last_fetched', '')) 180 + except Exception: 181 + pass 182 + return None 183 + 184 + def _update_cache(self, cache_key: str, data: dict): 185 + """Update cache with feed data.""" 186 + cache_file = self.config.cache_dir / f"{cache_key}.json" 187 + try: 188 + with open(cache_file, 'w') as f: 189 + json.dump(data, f, indent=2) 190 + except Exception: 191 + # Cache update failure shouldn't break the sync 192 + pass 193 + 194 + def clear_cache(self, username: Optional[str] = None) -> bool: 195 + """Clear feed cache.""" 196 + try: 197 + if username: 198 + # Clear cache for specific user 199 + for user_config in self.config.users: 200 + if user_config.username == username: 201 + for feed_url in user_config.feeds: 202 + cache_key = self._get_cache_key(username, feed_url) 203 + cache_file = self.config.cache_dir / f"{cache_key}.json" 204 + if cache_file.exists(): 205 + cache_file.unlink() 206 + else: 207 + # Clear all cache 208 + if self.config.cache_dir.exists(): 209 + for cache_file in self.config.cache_dir.glob("*.json"): 210 + cache_file.unlink() 211 + return True 212 + except Exception: 213 + return False 214 + 215 + def get_feed_info(self, username: str, feed_url: str) -> Optional[dict]: 216 + """Get cached information about a specific feed.""" 217 + try: 218 + feed_url_obj = HttpUrl(feed_url) 219 + cache_key = self._get_cache_key(username, feed_url_obj) 220 + cache_file = self.config.cache_dir / f"{cache_key}.json" 221 + 222 + if cache_file.exists(): 223 + with open(cache_file) as f: 224 + return json.load(f) 225 + except Exception: 226 + pass 227 + return None
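
`FeedManager` names each cache file after an MD5 digest of `username:feed_url` and stores the fetch time as an ISO 8601 string. The round trip can be sketched as below; the function names, the username `alice`, and the example URL are hypothetical, and MD5 is used here only as a filename key, not for security.

```python
import hashlib
import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def cache_key(username: str, feed_url: str) -> str:
    """Hash the username and feed URL into a filename-safe key."""
    return hashlib.md5(f"{username}:{feed_url}".encode()).hexdigest()


def read_last_fetched(cache_dir: Path, key: str) -> Optional[datetime]:
    """Return the cached fetch time, or None if missing or unreadable."""
    cache_file = cache_dir / f"{key}.json"
    try:
        data = json.loads(cache_file.read_text(encoding="utf-8"))
        return datetime.fromisoformat(data["last_fetched"])
    except (OSError, ValueError, KeyError, TypeError):
        # A broken cache entry is treated the same as no cache entry.
        return None


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    key = cache_key("alice", "https://example.com/feed.xml")
    stamp = datetime(2024, 1, 2, 3, 4, 5)
    (cache_dir / f"{key}.json").write_text(
        json.dumps({"last_fetched": stamp.isoformat()}), encoding="utf-8"
    )
    loaded = read_last_fetched(cache_dir, key)
    missing = read_last_fetched(cache_dir, "no-such-key")
```

Catching a narrow tuple of exceptions, rather than a bare `Exception` as in the diff, is a choice made for the sketch; the fallback behaviour is the same.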

+304

src/thicket/subsystems/links.py

··· 1 + """Link processing subsystem.""" 2 + 3 + import json 4 + import re 5 + from collections import defaultdict 6 + from pathlib import Path 7 + from typing import Optional 8 + from urllib.parse import urljoin, urlparse 9 + 10 + from ..core.git_store import GitStore 11 + from ..models import AtomEntry, ThicketConfig 12 + 13 + 14 + class LinkProcessor: 15 + """Processes and manages links between entries.""" 16 + 17 + def __init__(self, git_store: GitStore, config: ThicketConfig): 18 + """Initialize link processor.""" 19 + self.git_store = git_store 20 + self.config = config 21 + self.links_file = self.git_store.repo_path / "links.json" 22 + 23 + def process_links(self, username: Optional[str] = None) -> dict: 24 + """Process and extract links from entries.""" 25 + if username: 26 + return self._process_user_links(username) 27 + 28 + # Process all users 29 + results = {} 30 + index = self.git_store._load_index() 31 + 32 + for user_metadata in index.users.values(): 33 + user_results = self._process_user_links(user_metadata.username) 34 + results[user_metadata.username] = user_results 35 + 36 + # Consolidate all links 37 + self._consolidate_links() 38 + 39 + return results 40 + 41 + def _process_user_links(self, username: str) -> dict: 42 + """Process links for a specific user.""" 43 + entries = self.git_store.list_entries(username) 44 + 45 + results = { 46 + 'username': username, 47 + 'entries_processed': 0, 48 + 'links_found': 0, 49 + 'external_links': 0, 50 + 'internal_links': 0, 51 + } 52 + 53 + links_data = self._load_links_data() 54 + 55 + for entry in entries: 56 + entry_links = self._extract_links_from_entry(entry) 57 + 58 + if entry_links: 59 + # Store links for this entry 60 + entry_key = f"{username}:{entry.id}" 61 + links_data[entry_key] = { 62 + 'entry_id': entry.id, 63 + 'username': username, 64 + 'title': entry.title, 65 + 'links': entry_links, 66 + 'processed_at': entry.updated.isoformat() if entry.updated else None, 67 + } 68 + 69 + 
results['links_found'] += len(entry_links) 70 + results['external_links'] += len([l for l in entry_links if self._is_external_link(l['url'])]) 71 + results['internal_links'] += len([l for l in entry_links if not self._is_external_link(l['url'])]) 72 + 73 + results['entries_processed'] += 1 74 + 75 + self._save_links_data(links_data) 76 + 77 + return results 78 + 79 + def _extract_links_from_entry(self, entry: AtomEntry) -> list[dict]: 80 + """Extract links from an entry's content.""" 81 + links = [] 82 + 83 + # Combine content and summary for link extraction 84 + text_content = "" 85 + if entry.content: 86 + text_content += entry.content 87 + if entry.summary: 88 + text_content += " " + entry.summary 89 + 90 + if not text_content: 91 + return links 92 + 93 + # Extract HTML links 94 + html_link_pattern = r'<a[^>]+href=["\']([^"\']+)["\'][^>]*>([^<]*)</a>' 95 + html_matches = re.findall(html_link_pattern, text_content, re.IGNORECASE) 96 + 97 + for url, text in html_matches: 98 + # Clean up the URL 99 + url = url.strip() 100 + text = text.strip() 101 + 102 + if url and url not in ['#', 'javascript:void(0)']: 103 + # Resolve relative URLs if possible 104 + if entry.link and url.startswith('/'): 105 + base_url = str(entry.link) 106 + parsed_base = urlparse(base_url) 107 + base_domain = f"{parsed_base.scheme}://{parsed_base.netloc}" 108 + url = urljoin(base_domain, url) 109 + 110 + links.append({ 111 + 'url': url, 112 + 'text': text or url, 113 + 'type': 'html' 114 + }) 115 + 116 + # Extract markdown links 117 + markdown_link_pattern = r'\[([^\]]*)\]\(([^)]+)\)' 118 + markdown_matches = re.findall(markdown_link_pattern, text_content) 119 + 120 + for text, url in markdown_matches: 121 + url = url.strip() 122 + text = text.strip() 123 + 124 + if url and url not in ['#']: 125 + links.append({ 126 + 'url': url, 127 + 'text': text or url, 128 + 'type': 'markdown' 129 + }) 130 + 131 + # Extract plain URLs 132 + url_pattern = r'https?://[^\s<>"]+[^\s<>".,;!?]' 133 + url_matches
= re.findall(url_pattern, text_content) 134 + 135 + for url in url_matches: 136 + # Skip if already found as HTML or markdown link 137 + if not any(link['url'] == url for link in links): 138 + links.append({ 139 + 'url': url, 140 + 'text': url, 141 + 'type': 'plain' 142 + }) 143 + 144 + return links 145 + 146 + def _is_external_link(self, url: str) -> bool: 147 + """Check if a link is external to the configured domains.""" 148 + try: 149 + parsed = urlparse(url) 150 + domain = parsed.netloc.lower() 151 + 152 + # Check against user domains from feeds 153 + for user_config in self.config.users: 154 + for feed_url in user_config.feeds: 155 + feed_domain = urlparse(str(feed_url)).netloc.lower() 156 + if domain == feed_domain or domain.endswith(f'.{feed_domain}'): 157 + return False 158 + 159 + # Check homepage domain 160 + if user_config.homepage: 161 + homepage_domain = urlparse(str(user_config.homepage)).netloc.lower() 162 + if domain == homepage_domain or domain.endswith(f'.{homepage_domain}'): 163 + return False 164 + 165 + return True 166 + except Exception: 167 + return True 168 + 169 + def _load_links_data(self) -> dict: 170 + """Load existing links data.""" 171 + if self.links_file.exists(): 172 + try: 173 + with open(self.links_file) as f: 174 + return json.load(f) 175 + except Exception: 176 + pass 177 + return {} 178 + 179 + def _save_links_data(self, links_data: dict): 180 + """Save links data to file.""" 181 + try: 182 + with open(self.links_file, 'w') as f: 183 + json.dump(links_data, f, indent=2, ensure_ascii=False) 184 + except Exception: 185 + # Link processing failure shouldn't break the main operation 186 + pass 187 + 188 + def _consolidate_links(self): 189 + """Consolidate and create reverse link mappings.""" 190 + links_data = self._load_links_data() 191 + 192 + # Create URL to entries mapping 193 + url_mapping = defaultdict(list) 194 + 195 + for entry_key, entry_data in links_data.items(): 196 + for link in entry_data.get('links', []): 197 + 
url_mapping[link['url']].append({ 198 + 'entry_key': entry_key, 199 + 'username': entry_data['username'], 200 + 'entry_id': entry_data['entry_id'], 201 + 'title': entry_data['title'], 202 + 'link_text': link['text'], 203 + 'link_type': link['type'], 204 + }) 205 + 206 + # Save URL mapping 207 + url_mapping_file = self.git_store.repo_path / "url_mapping.json" 208 + try: 209 + with open(url_mapping_file, 'w') as f: 210 + json.dump(dict(url_mapping), f, indent=2, ensure_ascii=False) 211 + except Exception: 212 + pass 213 + 214 + def get_links(self, username: Optional[str] = None) -> dict: 215 + """Get processed links.""" 216 + links_data = self._load_links_data() 217 + 218 + if username: 219 + user_links = {k: v for k, v in links_data.items() if v.get('username') == username} 220 + return user_links 221 + 222 + return links_data 223 + 224 + def find_references(self, url: str) -> list[tuple[str, AtomEntry]]: 225 + """Find entries that reference a URL.""" 226 + url_mapping_file = self.git_store.repo_path / "url_mapping.json" 227 + 228 + if not url_mapping_file.exists(): 229 + return [] 230 + 231 + try: 232 + with open(url_mapping_file) as f: 233 + url_mapping = json.load(f) 234 + 235 + references = url_mapping.get(url, []) 236 + results = [] 237 + 238 + for ref in references: 239 + entry = self.git_store.get_entry(ref['username'], ref['entry_id']) 240 + if entry: 241 + results.append((ref['username'], entry)) 242 + 243 + return results 244 + except Exception: 245 + return [] 246 + 247 + def get_stats(self) -> dict: 248 + """Get link processing statistics.""" 249 + links_data = self._load_links_data() 250 + 251 + total_entries_with_links = len(links_data) 252 + total_links = sum(len(entry_data.get('links', [])) for entry_data in links_data.values()) 253 + 254 + external_links = 0 255 + internal_links = 0 256 + 257 + for entry_data in links_data.values(): 258 + for link in entry_data.get('links', []): 259 + if self._is_external_link(link['url']): 260 + external_links += 1 
261 + else: 262 + internal_links += 1 263 + 264 + # Count unique URLs 265 + unique_urls = set() 266 + for entry_data in links_data.values(): 267 + for link in entry_data.get('links', []): 268 + unique_urls.add(link['url']) 269 + 270 + return { 271 + 'entries_with_links': total_entries_with_links, 272 + 'total_links': total_links, 273 + 'unique_urls': len(unique_urls), 274 + 'external_links': external_links, 275 + 'internal_links': internal_links, 276 + } 277 + 278 + def get_most_referenced_urls(self, limit: int = 10) -> list[dict]: 279 + """Get most frequently referenced URLs.""" 280 + url_mapping_file = self.git_store.repo_path / "url_mapping.json" 281 + 282 + if not url_mapping_file.exists(): 283 + return [] 284 + 285 + try: 286 + with open(url_mapping_file) as f: 287 + url_mapping = json.load(f) 288 + 289 + # Count references per URL 290 + url_counts = [(url, len(refs)) for url, refs in url_mapping.items()] 291 + url_counts.sort(key=lambda x: x[1], reverse=True) 292 + 293 + results = [] 294 + for url, count in url_counts[:limit]: 295 + results.append({ 296 + 'url': url, 297 + 'reference_count': count, 298 + 'is_external': self._is_external_link(url), 299 + 'references': url_mapping[url] 300 + }) 301 + 302 + return results 303 + except Exception: 304 + return []
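
`LinkProcessor._extract_links_from_entry` runs three passes over an entry's text: HTML anchors, markdown links, then bare URLs not already collected. A compact sketch of those passes follows. `extract_links` and the sample text are hypothetical, and the markdown pattern is assumed to be the conventional `[text](url)` form.

```python
import re
from urllib.parse import urljoin, urlparse

HTML_LINK = re.compile(r'<a[^>]+href=["\']([^"\']+)["\'][^>]*>([^<]*)</a>', re.IGNORECASE)
MARKDOWN_LINK = re.compile(r'\[([^\]]*)\]\(([^)]+)\)')  # assumed [text](url) form
PLAIN_URL = re.compile(r'https?://[^\s<>"]+[^\s<>".,;!?]')


def extract_links(text: str, base_link: str = "") -> list[dict]:
    """Collect HTML, markdown, and plain links from a block of text."""
    links: list[dict] = []

    for url, label in HTML_LINK.findall(text):
        url, label = url.strip(), label.strip()
        if not url or url in ("#", "javascript:void(0)"):
            continue
        if base_link and url.startswith("/"):
            # Resolve root-relative hrefs against the entry's scheme and host.
            base = urlparse(base_link)
            url = urljoin(f"{base.scheme}://{base.netloc}", url)
        links.append({"url": url, "text": label or url, "type": "html"})

    for label, url in MARKDOWN_LINK.findall(text):
        url, label = url.strip(), label.strip()
        if url and url != "#":
            links.append({"url": url, "text": label or url, "type": "markdown"})

    # Bare URLs are added only if no earlier pass produced the same string.
    seen = {link["url"] for link in links}
    for url in PLAIN_URL.findall(text):
        if url not in seen:
            seen.add(url)
            links.append({"url": url, "text": url, "type": "plain"})

    return links


sample = (
    '<a href="/about">About</a> and [docs](https://example.org/docs) '
    "plus https://example.net/post."
)
found = extract_links(sample, base_link="https://example.com/2024/entry")
```

One behaviour worth knowing: with these patterns the plain-URL pass matches the markdown target a second time with its closing parenthesis attached, because `)` is not in the excluded character set. The sketch keeps that behaviour rather than changing it.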

+158

src/thicket/subsystems/repository.py

··· 1 + """Repository management subsystem.""" 2 + 3 + import shutil 4 + from datetime import datetime 5 + from pathlib import Path 6 + from typing import Optional 7 + 8 + from ..core.git_store import GitStore 9 + from ..models import ThicketConfig 10 + 11 + 12 + class RepositoryManager: 13 + """Manages repository operations and metadata.""" 14 + 15 + def __init__(self, git_store: GitStore, config: ThicketConfig): 16 + """Initialize repository manager.""" 17 + self.git_store = git_store 18 + self.config = config 19 + 20 + def init_repository(self) -> bool: 21 + """Initialize the git repository if not already done.""" 22 + try: 23 + # GitStore.__init__ already handles repository initialization 24 + return True 25 + except Exception: 26 + return False 27 + 28 + def commit_changes(self, message: str) -> bool: 29 + """Commit all pending changes.""" 30 + try: 31 + self.git_store.commit_changes(message) 32 + return True 33 + except Exception: 34 + return False 35 + 36 + def get_status(self) -> dict: 37 + """Get repository status and statistics.""" 38 + try: 39 + stats = self.git_store.get_stats() 40 + 41 + # Add repository-specific information 42 + repo_status = { 43 + **stats, 44 + 'repository_path': str(self.config.git_store), 45 + 'cache_path': str(self.config.cache_dir), 46 + 'has_uncommitted_changes': self._has_uncommitted_changes(), 47 + 'last_commit': self._get_last_commit_info(), 48 + } 49 + 50 + return repo_status 51 + except Exception as e: 52 + return {'error': str(e)} 53 + 54 + def backup_repository(self, backup_path: Path) -> bool: 55 + """Create a backup of the repository.""" 56 + try: 57 + if backup_path.exists(): 58 + shutil.rmtree(backup_path) 59 + 60 + shutil.copytree(self.config.git_store, backup_path) 61 + return True 62 + except Exception: 63 + return False 64 + 65 + def cleanup_cache(self) -> bool: 66 + """Clean up cache directory.""" 67 + try: 68 + if self.config.cache_dir.exists(): 69 + shutil.rmtree(self.config.cache_dir) 70 + 
self.config.cache_dir.mkdir(parents=True, exist_ok=True) 71 + return True 72 + except Exception: 73 + return False 74 + 75 + def get_repository_size(self) -> dict: 76 + """Get detailed repository size information.""" 77 + try: 78 + total_size = 0 79 + file_count = 0 80 + dir_count = 0 81 + 82 + for path in self.config.git_store.rglob("*"): 83 + if path.is_file(): 84 + total_size += path.stat().st_size 85 + file_count += 1 86 + elif path.is_dir(): 87 + dir_count += 1 88 + 89 + return { 90 + 'total_size_bytes': total_size, 91 + 'total_size_mb': round(total_size / (1024 * 1024), 2), 92 + 'file_count': file_count, 93 + 'directory_count': dir_count, 94 + } 95 + except Exception as e: 96 + return {'error': str(e)} 97 + 98 + def _has_uncommitted_changes(self) -> bool: 99 + """Check if there are uncommitted changes.""" 100 + try: 101 + if not self.git_store.repo: 102 + return False 103 + return bool(self.git_store.repo.index.diff("HEAD") or self.git_store.repo.untracked_files) 104 + except Exception: 105 + return False 106 + 107 + def _get_last_commit_info(self) -> Optional[dict]: 108 + """Get information about the last commit.""" 109 + try: 110 + if not self.git_store.repo: 111 + return None 112 + 113 + last_commit = self.git_store.repo.head.commit 114 + return { 115 + 'hash': last_commit.hexsha[:8], 116 + 'message': last_commit.message.strip(), 117 + 'author': str(last_commit.author), 118 + 'date': datetime.fromtimestamp(last_commit.committed_date).isoformat(), 119 + } 120 + except Exception: 121 + return None 122 + 123 + def verify_integrity(self) -> dict: 124 + """Verify repository integrity.""" 125 + issues = [] 126 + 127 + # Check if git repository is valid 128 + try: 129 + if not self.git_store.repo: 130 + issues.append("Git repository not initialized") 131 + except Exception as e: 132 + issues.append(f"Git repository error: {e}") 133 + 134 + # Check if index.json exists and is valid 135 + index_path = self.config.git_store / "index.json" 136 + if not 
index_path.exists(): 137 + issues.append("index.json missing") 138 + else: 139 + try: 140 + self.git_store._load_index() 141 + except Exception as e: 142 + issues.append(f"index.json corrupted: {e}") 143 + 144 + # Check if duplicates.json exists 145 + duplicates_path = self.config.git_store / "duplicates.json" 146 + if not duplicates_path.exists(): 147 + issues.append("duplicates.json missing") 148 + else: 149 + try: 150 + self.git_store._load_duplicates() 151 + except Exception as e: 152 + issues.append(f"duplicates.json corrupted: {e}") 153 + 154 + return { 155 + 'is_valid': len(issues) == 0, 156 + 'issues': issues, 157 + 'checked_at': datetime.now().isoformat(), 158 + }
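The size accounting in `get_repository_size` can be exercised on its own. The following is a minimal sketch, assuming a standalone helper named `directory_size` and a `demo` function (both hypothetical, not part of this diff) that walk a temporary directory the same way the method walks `config.git_store`:

```python
import tempfile
from pathlib import Path


def directory_size(root: Path) -> dict:
    """Walk root recursively, counting files, directories, and bytes."""
    total_size = 0
    file_count = 0
    dir_count = 0
    for path in root.rglob("*"):
        if path.is_file():
            total_size += path.stat().st_size
            file_count += 1
        elif path.is_dir():
            dir_count += 1
    return {
        "total_size_bytes": total_size,
        "total_size_mb": round(total_size / (1024 * 1024), 2),
        "file_count": file_count,
        "directory_count": dir_count,
    }


def demo() -> dict:
    """Build a tiny store-like layout (hypothetical names) and measure it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "alice").mkdir()
        (root / "index.json").write_text("{}", encoding="utf-8")
        (root / "alice" / "entry.json").write_text("abcd", encoding="utf-8")
        return directory_size(root)
```

Since `rglob("*")` does not skip dot-directories, a walk rooted at a Git working tree would presumably count the contents of `.git` as well, so the reported totals likely include Git's own object store alongside the entry JSON.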

+319

src/thicket/subsystems/site.py

··· 1 + """Site generation subsystem.""" 2 + 3 + import json 4 + import shutil 5 + from datetime import datetime 6 + from pathlib import Path 7 + from typing import Optional 8 + 9 + from jinja2 import Environment, FileSystemLoader, select_autoescape 10 + 11 + from ..core.git_store import GitStore 12 + from ..models import ThicketConfig 13 + 14 + 15 + class SiteGenerator: 16 + """Generates static sites from stored entries.""" 17 + 18 + def __init__(self, git_store: GitStore, config: ThicketConfig): 19 + """Initialize site generator.""" 20 + self.git_store = git_store 21 + self.config = config 22 + self.default_template_dir = Path(__file__).parent.parent / "templates" 23 + 24 + def generate_site(self, output_dir: Path, template_dir: Optional[Path] = None) -> bool: 25 + """Generate complete static site.""" 26 + try: 27 + # Setup template environment 28 + template_dir = template_dir or self.default_template_dir 29 + if not template_dir.exists(): 30 + return False 31 + 32 + env = Environment( 33 + loader=FileSystemLoader(str(template_dir)), 34 + autoescape=select_autoescape(['html', 'xml']) 35 + ) 36 + 37 + # Prepare output directory 38 + output_dir.mkdir(parents=True, exist_ok=True) 39 + 40 + # Copy static assets 41 + self._copy_static_assets(template_dir, output_dir) 42 + 43 + # Generate pages 44 + self._generate_index_page(env, output_dir) 45 + self._generate_timeline_page(env, output_dir) 46 + self._generate_users_page(env, output_dir) 47 + self._generate_links_page(env, output_dir) 48 + self._generate_user_detail_pages(env, output_dir) 49 + 50 + return True 51 + except Exception: 52 + return False 53 + 54 + def generate_timeline(self, output_path: Path, limit: Optional[int] = None) -> bool: 55 + """Generate timeline HTML file.""" 56 + try: 57 + env = Environment( 58 + loader=FileSystemLoader(str(self.default_template_dir)), 59 + autoescape=select_autoescape(['html', 'xml']) 60 + ) 61 + 62 + timeline_data = self._get_timeline_data(limit) 63 + template = 
env.get_template('timeline.html') 64 + 65 + content = template.render(**timeline_data) 66 + 67 + output_path.parent.mkdir(parents=True, exist_ok=True) 68 + with open(output_path, 'w', encoding='utf-8') as f: 69 + f.write(content) 70 + 71 + return True 72 + except Exception: 73 + return False 74 + 75 + def generate_user_pages(self, output_dir: Path) -> bool: 76 + """Generate individual user pages.""" 77 + try: 78 + env = Environment( 79 + loader=FileSystemLoader(str(self.default_template_dir)), 80 + autoescape=select_autoescape(['html', 'xml']) 81 + ) 82 + 83 + return self._generate_user_detail_pages(env, output_dir) 84 + except Exception: 85 + return False 86 + 87 + def _copy_static_assets(self, template_dir: Path, output_dir: Path): 88 + """Copy CSS, JS, and other static assets.""" 89 + static_files = ['style.css', 'script.js'] 90 + 91 + for filename in static_files: 92 + src_file = template_dir / filename 93 + if src_file.exists(): 94 + dst_file = output_dir / filename 95 + shutil.copy2(src_file, dst_file) 96 + 97 + def _generate_index_page(self, env: Environment, output_dir: Path): 98 + """Generate main index page.""" 99 + template = env.get_template('index.html') 100 + 101 + # Get summary statistics 102 + stats = self.git_store.get_stats() 103 + index = self.git_store._load_index() 104 + 105 + # Recent entries 106 + recent_entries = [] 107 + for username in index.users.keys(): 108 + user_entries = self.git_store.list_entries(username, limit=5) 109 + for entry in user_entries: 110 + recent_entries.append({ 111 + 'username': username, 112 + 'entry': entry 113 + }) 114 + 115 + # Sort by date 116 + recent_entries.sort(key=lambda x: x['entry'].updated or x['entry'].published, reverse=True) 117 + recent_entries = recent_entries[:10] 118 + 119 + context = { 120 + 'title': 'Thicket Feed Archive', 121 + 'stats': stats, 122 + 'recent_entries': recent_entries, 123 + 'users': list(index.users.values()), 124 + 'generated_at': datetime.now().isoformat(), 125 + } 126 + 127 + 
content = template.render(**context) 128 + 129 + with open(output_dir / 'index.html', 'w', encoding='utf-8') as f: 130 + f.write(content) 131 + 132 + def _generate_timeline_page(self, env: Environment, output_dir: Path): 133 + """Generate timeline page.""" 134 + template = env.get_template('timeline.html') 135 + timeline_data = self._get_timeline_data() 136 + 137 + content = template.render(**timeline_data) 138 + 139 + with open(output_dir / 'timeline.html', 'w', encoding='utf-8') as f: 140 + f.write(content) 141 + 142 + def _generate_users_page(self, env: Environment, output_dir: Path): 143 + """Generate users overview page.""" 144 + template = env.get_template('users.html') 145 + 146 + index = self.git_store._load_index() 147 + users_data = [] 148 + 149 + for user_metadata in index.users.values(): 150 + # Get user config for additional details 151 + user_config = next( 152 + (u for u in self.config.users if u.username == user_metadata.username), 153 + None 154 + ) 155 + 156 + # Get recent entries 157 + recent_entries = self.git_store.list_entries(user_metadata.username, limit=3) 158 + 159 + users_data.append({ 160 + 'metadata': user_metadata, 161 + 'config': user_config, 162 + 'recent_entries': recent_entries, 163 + }) 164 + 165 + # Sort by entry count 166 + users_data.sort(key=lambda x: x['metadata'].entry_count, reverse=True) 167 + 168 + context = { 169 + 'title': 'Users', 170 + 'users': users_data, 171 + 'generated_at': datetime.now().isoformat(), 172 + } 173 + 174 + content = template.render(**context) 175 + 176 + with open(output_dir / 'users.html', 'w', encoding='utf-8') as f: 177 + f.write(content) 178 + 179 + def _generate_links_page(self, env: Environment, output_dir: Path): 180 + """Generate links overview page.""" 181 + template = env.get_template('links.html') 182 + 183 + # Load links data 184 + links_file = self.git_store.repo_path / "links.json" 185 + url_mapping_file = self.git_store.repo_path / "url_mapping.json" 186 + 187 + links_data = {} 188 + 
url_mapping = {} 189 + 190 + if links_file.exists(): 191 + try: 192 + with open(links_file) as f: 193 + links_data = json.load(f) 194 + except Exception: 195 + pass 196 + 197 + if url_mapping_file.exists(): 198 + try: 199 + with open(url_mapping_file) as f: 200 + url_mapping = json.load(f) 201 + except Exception: 202 + pass 203 + 204 + # Process most referenced URLs 205 + url_counts = [(url, len(refs)) for url, refs in url_mapping.items()] 206 + url_counts.sort(key=lambda x: x[1], reverse=True) 207 + most_referenced = url_counts[:20] 208 + 209 + # Count links by type 210 + link_stats = { 211 + 'total_entries_with_links': len(links_data), 212 + 'total_links': sum(len(entry_data.get('links', [])) for entry_data in links_data.values()), 213 + 'unique_urls': len(url_mapping), 214 + } 215 + 216 + context = { 217 + 'title': 'Links', 218 + 'most_referenced': most_referenced, 219 + 'url_mapping': url_mapping, 220 + 'link_stats': link_stats, 221 + 'generated_at': datetime.now().isoformat(), 222 + } 223 + 224 + content = template.render(**context) 225 + 226 + with open(output_dir / 'links.html', 'w', encoding='utf-8') as f: 227 + f.write(content) 228 + 229 + def _generate_user_detail_pages(self, env: Environment, output_dir: Path) -> bool: 230 + """Generate individual user detail pages.""" 231 + try: 232 + template = env.get_template('user_detail.html') 233 + index = self.git_store._load_index() 234 + 235 + # Create users subdirectory 236 + users_dir = output_dir / 'users' 237 + users_dir.mkdir(exist_ok=True) 238 + 239 + for user_metadata in index.users.values(): 240 + user_config = next( 241 + (u for u in self.config.users if u.username == user_metadata.username), 242 + None 243 + ) 244 + 245 + entries = self.git_store.list_entries(user_metadata.username) 246 + 247 + # Get user's links 248 + links_file = self.git_store.repo_path / "links.json" 249 + user_links = [] 250 + if links_file.exists(): 251 + try: 252 + with open(links_file) as f: 253 + all_links = json.load(f) 254 
+ user_links = [ 255 + data for key, data in all_links.items() 256 + if data.get('username') == user_metadata.username 257 + ] 258 + except Exception: 259 + pass 260 + 261 + context = { 262 + 'title': f"{user_metadata.display_name or user_metadata.username}", 263 + 'user_metadata': user_metadata, 264 + 'user_config': user_config, 265 + 'entries': entries, 266 + 'user_links': user_links, 267 + 'generated_at': datetime.now().isoformat(), 268 + } 269 + 270 + content = template.render(**context) 271 + 272 + user_file = users_dir / f"{user_metadata.username}.html" 273 + with open(user_file, 'w', encoding='utf-8') as f: 274 + f.write(content) 275 + 276 + return True 277 + except Exception: 278 + return False 279 + 280 + def _get_timeline_data(self, limit: Optional[int] = None) -> dict: 281 + """Get data for timeline page.""" 282 + index = self.git_store._load_index() 283 + 284 + # Collect all entries with metadata 285 + all_entries = [] 286 + for user_metadata in index.users.values(): 287 + user_entries = self.git_store.list_entries(user_metadata.username) 288 + for entry in user_entries: 289 + all_entries.append({ 290 + 'username': user_metadata.username, 291 + 'display_name': user_metadata.display_name, 292 + 'entry': entry, 293 + }) 294 + 295 + # Sort by date (newest first) 296 + all_entries.sort( 297 + key=lambda x: x['entry'].updated or x['entry'].published or datetime.min, 298 + reverse=True 299 + ) 300 + 301 + if limit: 302 + all_entries = all_entries[:limit] 303 + 304 + # Group by date for timeline display 305 + timeline_groups = {} 306 + for item in all_entries: 307 + entry_date = item['entry'].updated or item['entry'].published 308 + if entry_date: 309 + date_key = entry_date.strftime('%Y-%m-%d') 310 + if date_key not in timeline_groups: 311 + timeline_groups[date_key] = [] 312 + timeline_groups[date_key].append(item) 313 + 314 + return { 315 + 'title': 'Timeline', 316 + 'timeline_groups': timeline_groups, 317 + 'total_entries': len(all_entries), 318 + 
'generated_at': datetime.now().isoformat(), 319 + }
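The sort-then-group step in `_get_timeline_data` can be sketched without a Git store. This assumes a hypothetical `Entry` dataclass standing in for the stored entry model, with `updated` and `published` both optional; the real model may differ:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for the stored entry model."""
    title: str
    updated: Optional[datetime] = None
    published: Optional[datetime] = None


def entry_date(entry: Entry) -> Optional[datetime]:
    """Prefer the updated timestamp, fall back to published."""
    return entry.updated or entry.published


def group_by_day(entries: list[Entry], limit: Optional[int] = None) -> dict[str, list[Entry]]:
    """Sort newest first, optionally truncate, then bucket by YYYY-MM-DD."""
    # datetime.min keeps undated entries sortable and pushes them to the end.
    ordered = sorted(entries, key=lambda e: entry_date(e) or datetime.min, reverse=True)
    if limit:
        ordered = ordered[:limit]
    groups: dict[str, list[Entry]] = {}
    for entry in ordered:
        when = entry_date(entry)
        if when is None:
            continue  # undated entries are sorted but never displayed
        groups.setdefault(when.strftime("%Y-%m-%d"), []).append(entry)
    return groups
```

Two caveats apply if entries can lack both timestamps. The sort in `_generate_index_page` has no `datetime.min` fallback, so it would raise a `TypeError` when comparing `None` with a `datetime`. And `datetime.min` is naive, so the fallback itself only compares cleanly against naive timestamps.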

+254

src/thicket/subsystems/users.py

··· 1 + """User management subsystem.""" 2 + 3 + import shutil 4 + from typing import Optional 5 + 6 + from pydantic import EmailStr, HttpUrl, ValidationError 7 + 8 + from ..core.git_store import GitStore 9 + from ..models import ThicketConfig, UserConfig, UserMetadata 10 + 11 + 12 + class UserManager: 13 + """Manages user operations and metadata.""" 14 + 15 + def __init__(self, git_store: GitStore, config: ThicketConfig): 16 + """Initialize user manager.""" 17 + self.git_store = git_store 18 + self.config = config 19 + 20 + def add_user(self, username: str, feeds: list[str], **kwargs) -> UserConfig: 21 + """Add a new user with feeds.""" 22 + # Validate feeds 23 + validated_feeds = [] 24 + for feed in feeds: 25 + try: 26 + validated_feeds.append(HttpUrl(feed)) 27 + except ValidationError as e: 28 + raise ValueError(f"Invalid feed URL '{feed}': {e}") 29 + 30 + # Validate optional fields 31 + email = None 32 + if 'email' in kwargs and kwargs['email']: 33 + try: 34 + email = EmailStr(kwargs['email']) 35 + except ValidationError as e: 36 + raise ValueError(f"Invalid email '{kwargs['email']}': {e}") 37 + 38 + homepage = None 39 + if 'homepage' in kwargs and kwargs['homepage']: 40 + try: 41 + homepage = HttpUrl(kwargs['homepage']) 42 + except ValidationError as e: 43 + raise ValueError(f"Invalid homepage URL '{kwargs['homepage']}': {e}") 44 + 45 + icon = None 46 + if 'icon' in kwargs and kwargs['icon']: 47 + try: 48 + icon = HttpUrl(kwargs['icon']) 49 + except ValidationError as e: 50 + raise ValueError(f"Invalid icon URL '{kwargs['icon']}': {e}") 51 + 52 + # Create user config 53 + user_config = UserConfig( 54 + username=username, 55 + feeds=validated_feeds, 56 + email=email, 57 + homepage=homepage, 58 + icon=icon, 59 + display_name=kwargs.get('display_name') 60 + ) 61 + 62 + # Add to git store 63 + self.git_store.add_user( 64 + username=username, 65 + display_name=user_config.display_name, 66 + email=str(user_config.email) if user_config.email else None, 67 + 
homepage=str(user_config.homepage) if user_config.homepage else None, 68 + icon=str(user_config.icon) if user_config.icon else None, 69 + feeds=[str(feed) for feed in user_config.feeds] 70 + ) 71 + 72 + # Add to config if not already present 73 + existing_user = next((u for u in self.config.users if u.username == username), None) 74 + if not existing_user: 75 + self.config.users.append(user_config) 76 + else: 77 + # Update existing config 78 + existing_user.feeds = user_config.feeds 79 + existing_user.email = user_config.email 80 + existing_user.homepage = user_config.homepage 81 + existing_user.icon = user_config.icon 82 + existing_user.display_name = user_config.display_name 83 + 84 + return user_config 85 + 86 + def get_user(self, username: str) -> Optional[UserConfig]: 87 + """Get user configuration.""" 88 + return next((u for u in self.config.users if u.username == username), None) 89 + 90 + def get_user_metadata(self, username: str) -> Optional[UserMetadata]: 91 + """Get user metadata from git store.""" 92 + return self.git_store.get_user(username) 93 + 94 + def list_users(self) -> list[UserConfig]: 95 + """List all configured users.""" 96 + return self.config.users.copy() 97 + 98 + def list_users_with_metadata(self) -> list[tuple[UserConfig, Optional[UserMetadata]]]: 99 + """List users with their git store metadata.""" 100 + result = [] 101 + for user_config in self.config.users: 102 + metadata = self.git_store.get_user(user_config.username) 103 + result.append((user_config, metadata)) 104 + return result 105 + 106 + def update_user(self, username: str, **kwargs) -> bool: 107 + """Update user configuration.""" 108 + # Update in config 109 + user_config = self.get_user(username) 110 + if not user_config: 111 + return False 112 + 113 + # Validate and update feeds if provided 114 + if 'feeds' in kwargs: 115 + validated_feeds = [] 116 + for feed in kwargs['feeds']: 117 + try: 118 + validated_feeds.append(HttpUrl(feed)) 119 + except ValidationError: 120 + return 
False 121 + user_config.feeds = validated_feeds 122 + 123 + # Validate and update other fields 124 + if 'email' in kwargs and kwargs['email']: 125 + try: 126 + user_config.email = EmailStr(kwargs['email']) 127 + except ValidationError: 128 + return False 129 + elif 'email' in kwargs and not kwargs['email']: 130 + user_config.email = None 131 + 132 + if 'homepage' in kwargs and kwargs['homepage']: 133 + try: 134 + user_config.homepage = HttpUrl(kwargs['homepage']) 135 + except ValidationError: 136 + return False 137 + elif 'homepage' in kwargs and not kwargs['homepage']: 138 + user_config.homepage = None 139 + 140 + if 'icon' in kwargs and kwargs['icon']: 141 + try: 142 + user_config.icon = HttpUrl(kwargs['icon']) 143 + except ValidationError: 144 + return False 145 + elif 'icon' in kwargs and not kwargs['icon']: 146 + user_config.icon = None 147 + 148 + if 'display_name' in kwargs: 149 + user_config.display_name = kwargs['display_name'] or None 150 + 151 + # Update in git store 152 + git_kwargs = {} 153 + if 'feeds' in kwargs: 154 + git_kwargs['feeds'] = [str(feed) for feed in user_config.feeds] 155 + if user_config.email: 156 + git_kwargs['email'] = str(user_config.email) 157 + if user_config.homepage: 158 + git_kwargs['homepage'] = str(user_config.homepage) 159 + if user_config.icon: 160 + git_kwargs['icon'] = str(user_config.icon) 161 + if user_config.display_name: 162 + git_kwargs['display_name'] = user_config.display_name 163 + 164 + return self.git_store.update_user(username, **git_kwargs) 165 + 166 + def remove_user(self, username: str) -> bool: 167 + """Remove a user and their data.""" 168 + # Remove from config 169 + self.config.users = [u for u in self.config.users if u.username != username] 170 + 171 + # Remove user directory from git store 172 + user_metadata = self.git_store.get_user(username) 173 + if user_metadata: 174 + user_dir = self.git_store.repo_path / user_metadata.directory 175 + if user_dir.exists(): 176 + try: 177 + shutil.rmtree(user_dir) 
178 + except Exception: 179 + return False 180 + 181 + # Remove user from index 182 + index = self.git_store._load_index() 183 + if username in index.users: 184 + del index.users[username] 185 + self.git_store._save_index(index) 186 + 187 + return True 188 + 189 + def get_user_stats(self, username: str) -> Optional[dict]: 190 + """Get statistics for a specific user.""" 191 + user_metadata = self.git_store.get_user(username) 192 + if not user_metadata: 193 + return None 194 + 195 + user_config = self.get_user(username) 196 + entries = self.git_store.list_entries(username) 197 + 198 + return { 199 + 'username': username, 200 + 'display_name': user_metadata.display_name, 201 + 'entry_count': user_metadata.entry_count, 202 + 'feeds_configured': len(user_config.feeds) if user_config else 0, 203 + 'directory': user_metadata.directory, 204 + 'created': user_metadata.created.isoformat() if user_metadata.created else None, 205 + 'last_updated': user_metadata.last_updated.isoformat() if user_metadata.last_updated else None, 206 + 'latest_entry': entries[0].updated.isoformat() if entries else None, 207 + } 208 + 209 + def validate_user_feeds(self, username: str) -> dict: 210 + """Validate all feeds for a user.""" 211 + user_config = self.get_user(username) 212 + if not user_config: 213 + return {'error': 'User not found'} 214 + 215 + results = { 216 + 'username': username, 217 + 'total_feeds': len(user_config.feeds), 218 + 'valid_feeds': [], 219 + 'invalid_feeds': [], 220 + } 221 + 222 + for feed_url in user_config.feeds: 223 + try: 224 + # Basic URL validation - more comprehensive validation would require fetching 225 + HttpUrl(str(feed_url)) 226 + results['valid_feeds'].append(str(feed_url)) 227 + except ValidationError as e: 228 + results['invalid_feeds'].append({ 229 + 'url': str(feed_url), 230 + 'error': str(e) 231 + }) 232 + 233 + results['is_valid'] = len(results['invalid_feeds']) == 0 234 + 235 + return results 236 + 237 + def sync_config_with_git_store(self) -> bool: 
238 + """Sync configuration users with git store.""" 239 + try: 240 + for user_config in self.config.users: 241 + git_user = self.git_store.get_user(user_config.username) 242 + if not git_user: 243 + # Add missing user to git store 244 + self.git_store.add_user( 245 + username=user_config.username, 246 + display_name=user_config.display_name, 247 + email=str(user_config.email) if user_config.email else None, 248 + homepage=str(user_config.homepage) if user_config.homepage else None, 249 + icon=str(user_config.icon) if user_config.icon else None, 250 + feeds=[str(feed) for feed in user_config.feeds] 251 + ) 252 + return True 253 + except Exception: 254 + return False
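In `update_user`, a cleared field (for example `email=""`) is set to `None` on the config object, but the `git_kwargs` dict only includes fields that are truthy, so the cleared value is never forwarded to the store. One way to forward it is sketched below. The helper name `build_store_kwargs` is hypothetical, and whether `GitStore.update_user` treats `None` as "clear this field" is an assumption this diff does not confirm:

```python
from typing import Any

# Scalar fields that update_user accepts besides feeds.
FIELDS = ("email", "homepage", "icon", "display_name")


def build_store_kwargs(requested: dict[str, Any]) -> dict[str, Any]:
    """Map only the caller-supplied fields to store updates.

    Empty or falsy values become None so a cleared field is forwarded
    instead of being silently dropped.
    """
    updates: dict[str, Any] = {}
    for name in FIELDS:
        if name in requested:
            value = requested[name]
            updates[name] = str(value) if value else None
    if "feeds" in requested:
        updates["feeds"] = [str(feed) for feed in requested["feeds"]]
    return updates
```

Keying on what the caller passed, rather than on the current state of `user_config`, also avoids re-sending unchanged fields on every update.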

+31

src/thicket/templates/base.html

···
1 + <!DOCTYPE html>
2 + <html lang="en">
3 + <head>
4 +     <meta charset="UTF-8">
5 +     <meta name="viewport" content="width=device-width, initial-scale=1.0">
6 +     <title>{% block page_title %}{{ title }}{% endblock %}</title>
7 +     <link rel="stylesheet" href="css/style.css">
8 + </head>
9 + <body>
10 +     <header class="site-header">
11 +         <div class="header-content">
12 +             <h1 class="site-title">{{ title }}</h1>
13 +             <nav class="site-nav">
14 +                 <a href="timeline.html" class="nav-link {% if page == 'timeline' %}active{% endif %}">Timeline</a>
15 +                 <a href="links.html" class="nav-link {% if page == 'links' %}active{% endif %}">Links</a>
16 +                 <a href="users.html" class="nav-link {% if page == 'users' %}active{% endif %}">Users</a>
17 +             </nav>
18 +         </div>
19 +     </header>
20 +
21 +     <main class="main-content">
22 +         {% block content %}{% endblock %}
23 +     </main>
24 +
25 +     <footer class="site-footer">
26 +         <p>Generated on {{ generated_at }} by <a href="https://github.com/avsm/thicket">Thicket</a></p>
27 +     </footer>
28 +
29 +     <script src="js/script.js"></script>
30 + </body>
31 + </html>

+13

src/thicket/templates/index.html

···
1 + <!DOCTYPE html>
2 + <html lang="en">
3 + <head>
4 +     <meta charset="UTF-8">
5 +     <meta name="viewport" content="width=device-width, initial-scale=1.0">
6 +     <title>{{ title }}</title>
7 +     <meta http-equiv="refresh" content="0; url=timeline.html">
8 +     <link rel="canonical" href="timeline.html">
9 + </head>
10 + <body>
11 +     <p>Redirecting to <a href="timeline.html">Timeline</a>...</p>
12 + </body>
13 + </html>

+38

src/thicket/templates/links.html

···
1 + {% extends "base.html" %}
2 +
3 + {% block page_title %}Outgoing Links - {{ title }}{% endblock %}
4 +
5 + {% block content %}
6 + <div class="page-content">
7 +     <h2>Outgoing Links</h2>
8 +     <p class="page-description">External links referenced in blog posts, ordered by most recent reference.</p>
9 +
10 +     {% for link in outgoing_links %}
11 +     <article class="link-group">
12 +         <h3 class="link-url">
13 +             <a href="{{ link.url }}" target="_blank">{{ link.url|truncate(80) }}</a>
14 +             {% if link.target_username %}
15 +             <span class="target-user">({{ link.target_username }})</span>
16 +             {% endif %}
17 +         </h3>
18 +         <div class="referencing-entries">
19 +             <span class="ref-count">Referenced in {{ link.entries|length }} post(s):</span>
20 +             <ul>
21 +                 {% for display_name, entry in link.entries[:5] %}
22 +                 <li>
23 +                     <span class="author">{{ display_name }}</span> -
24 +                     <a href="{{ entry.link }}" target="_blank">{{ entry.title }}</a>
25 +                     <time datetime="{{ entry.updated or entry.published }}">
26 +                         ({{ (entry.updated or entry.published).strftime('%Y-%m-%d') }})
27 +                     </time>
28 +                 </li>
29 +                 {% endfor %}
30 +                 {% if link.entries|length > 5 %}
31 +                 <li class="more">... and {{ link.entries|length - 5 }} more</li>
32 +                 {% endif %}
33 +             </ul>
34 +         </div>
35 +     </article>
36 +     {% endfor %}
37 + </div>
38 + {% endblock %}

+88

src/thicket/templates/script.js

··· 1 + // Enhanced functionality for thicket website 2 + document.addEventListener('DOMContentLoaded', function() { 3 + 4 + // Enhance thread collapsing (optional feature) 5 + const threadHeaders = document.querySelectorAll('.thread-header'); 6 + threadHeaders.forEach(header => { 7 + header.style.cursor = 'pointer'; 8 + header.addEventListener('click', function() { 9 + const thread = this.parentElement; 10 + const entries = thread.querySelectorAll('.thread-entry'); 11 + 12 + // Toggle visibility of all but the first entry 13 + for (let i = 1; i < entries.length; i++) { 14 + entries[i].style.display = entries[i].style.display === 'none' ? 'block' : 'none'; 15 + } 16 + 17 + // Update thread count text 18 + const count = this.querySelector('.thread-count'); 19 + if (entries[1] && entries[1].style.display === 'none') { 20 + count.textContent = count.textContent.replace('posts', 'posts (collapsed)'); 21 + } else { 22 + count.textContent = count.textContent.replace(' (collapsed)', ''); 23 + } 24 + }); 25 + }); 26 + 27 + // Add relative time display 28 + const timeElements = document.querySelectorAll('time'); 29 + timeElements.forEach(timeEl => { 30 + const datetime = new Date(timeEl.getAttribute('datetime')); 31 + const now = new Date(); 32 + const diffMs = now - datetime; 33 + const diffDays = Math.floor(diffMs / (1000 * 60 * 60 * 24)); 34 + 35 + let relativeTime; 36 + if (diffDays === 0) { 37 + const diffHours = Math.floor(diffMs / (1000 * 60 * 60)); 38 + if (diffHours === 0) { 39 + const diffMinutes = Math.floor(diffMs / (1000 * 60)); 40 + relativeTime = diffMinutes === 0 ? 'just now' : `${diffMinutes}m ago`; 41 + } else { 42 + relativeTime = `${diffHours}h ago`; 43 + } 44 + } else if (diffDays === 1) { 45 + relativeTime = 'yesterday'; 46 + } else if (diffDays < 7) { 47 + relativeTime = `${diffDays}d ago`; 48 + } else if (diffDays < 30) { 49 + const weeks = Math.floor(diffDays / 7); 50 + relativeTime = weeks === 1 ? 
'1w ago' : `${weeks}w ago`; 51 + } else if (diffDays < 365) { 52 + const months = Math.floor(diffDays / 30); 53 + relativeTime = months === 1 ? '1mo ago' : `${months}mo ago`; 54 + } else { 55 + const years = Math.floor(diffDays / 365); 56 + relativeTime = years === 1 ? '1y ago' : `${years}y ago`; 57 + } 58 + 59 + // Add relative time as title attribute 60 + timeEl.setAttribute('title', timeEl.textContent); 61 + timeEl.textContent = relativeTime; 62 + }); 63 + 64 + // Enhanced anchor link scrolling for shared references 65 + document.querySelectorAll('a[href^="#"]').forEach(anchor => { 66 + anchor.addEventListener('click', function (e) { 67 + e.preventDefault(); 68 + const target = document.querySelector(this.getAttribute('href')); 69 + if (target) { 70 + target.scrollIntoView({ 71 + behavior: 'smooth', 72 + block: 'center' 73 + }); 74 + 75 + // Highlight the target briefly 76 + const timelineEntry = target.closest('.timeline-entry'); 77 + if (timelineEntry) { 78 + timelineEntry.style.outline = '2px solid var(--primary-color)'; 79 + timelineEntry.style.borderRadius = '8px'; 80 + setTimeout(() => { 81 + timelineEntry.style.outline = ''; 82 + timelineEntry.style.borderRadius = ''; 83 + }, 2000); 84 + } 85 + } 86 + }); 87 + }); 88 + });
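The relative-time bucketing in `script.js` is easier to check outside a browser. The sketch below ports the same thresholds to Python; the function name `relative_time` is hypothetical, and the JavaScript above remains the actual implementation:

```python
from datetime import datetime, timedelta


def relative_time(then: datetime, now: datetime) -> str:
    """Mirror the script.js buckets using floor division on elapsed time."""
    diff = now - then
    days = diff // timedelta(days=1)  # floors, like Math.floor in the script
    if days == 0:
        hours = diff // timedelta(hours=1)
        if hours == 0:
            minutes = diff // timedelta(minutes=1)
            return "just now" if minutes == 0 else f"{minutes}m ago"
        return f"{hours}h ago"
    if days == 1:
        return "yesterday"
    if days < 7:
        return f"{days}d ago"
    if days < 30:
        return f"{days // 7}w ago"
    if days < 365:
        return f"{days // 30}mo ago"
    return f"{days // 365}y ago"
```

A few properties fall out of the port. The `=== 1` ternaries in the script produce the same string as their general branch, so they can be dropped. "yesterday" means 24 to 48 hours elapsed rather than the previous calendar day. And a timestamp slightly in the future floors to a negative day count, which the `days < 7` branch would render as `-1d ago`.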

+754

src/thicket/templates/style.css

··· 1 + /* Modern, clean design with high-density text and readable theme */ 2 + 3 + :root { 4 + --primary-color: #2c3e50; 5 + --secondary-color: #3498db; 6 + --accent-color: #e74c3c; 7 + --background: #ffffff; 8 + --surface: #f8f9fa; 9 + --text-primary: #2c3e50; 10 + --text-secondary: #7f8c8d; 11 + --border-color: #e0e0e0; 12 + --thread-indent: 20px; 13 + --max-width: 1200px; 14 + } 15 + 16 + * { 17 + margin: 0; 18 + padding: 0; 19 + box-sizing: border-box; 20 + } 21 + 22 + body { 23 + font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Helvetica Neue', Arial, sans-serif; 24 + font-size: 14px; 25 + line-height: 1.6; 26 + color: var(--text-primary); 27 + background-color: var(--background); 28 + } 29 + 30 + /* Header */ 31 + .site-header { 32 + background-color: var(--surface); 33 + border-bottom: 1px solid var(--border-color); 34 + padding: 0.75rem 0; 35 + position: sticky; 36 + top: 0; 37 + z-index: 100; 38 + } 39 + 40 + .header-content { 41 + max-width: var(--max-width); 42 + margin: 0 auto; 43 + padding: 0 2rem; 44 + display: flex; 45 + justify-content: space-between; 46 + align-items: center; 47 + } 48 + 49 + .site-title { 50 + font-size: 1.5rem; 51 + font-weight: 600; 52 + color: var(--primary-color); 53 + margin: 0; 54 + } 55 + 56 + /* Navigation */ 57 + .site-nav { 58 + display: flex; 59 + gap: 1.5rem; 60 + } 61 + 62 + .nav-link { 63 + text-decoration: none; 64 + color: var(--text-secondary); 65 + font-weight: 500; 66 + font-size: 0.95rem; 67 + padding: 0.5rem 0.75rem; 68 + border-radius: 4px; 69 + transition: all 0.2s ease; 70 + } 71 + 72 + .nav-link:hover { 73 + color: var(--primary-color); 74 + background-color: var(--background); 75 + } 76 + 77 + .nav-link.active { 78 + color: var(--secondary-color); 79 + background-color: var(--background); 80 + font-weight: 600; 81 + } 82 + 83 + /* Main Content */ 84 + .main-content { 85 + max-width: var(--max-width); 86 + margin: 2rem auto; 87 + padding: 0 2rem; 88 + } 89 + 90 + .page-content { 91 
+ margin: 0; 92 + } 93 + 94 + .page-description { 95 + color: var(--text-secondary); 96 + margin-bottom: 1.5rem; 97 + font-style: italic; 98 + } 99 + 100 + /* Sections */ 101 + section { 102 + margin-bottom: 2rem; 103 + } 104 + 105 + h2 { 106 + font-size: 1.3rem; 107 + font-weight: 600; 108 + margin-bottom: 0.75rem; 109 + color: var(--primary-color); 110 + } 111 + 112 + h3 { 113 + font-size: 1.1rem; 114 + font-weight: 600; 115 + margin-bottom: 0.75rem; 116 + color: var(--primary-color); 117 + } 118 + 119 + /* Entries and Threads */ 120 + article { 121 + margin-bottom: 1.5rem; 122 + padding: 1rem; 123 + background-color: var(--surface); 124 + border-radius: 4px; 125 + border: 1px solid var(--border-color); 126 + } 127 + 128 + /* Timeline-style entries */ 129 + .timeline-entry { 130 + margin-bottom: 0.5rem; 131 + padding: 0.5rem 0.75rem; 132 + border: none; 133 + background: transparent; 134 + transition: background-color 0.2s ease; 135 + } 136 + 137 + .timeline-entry:hover { 138 + background-color: var(--surface); 139 + } 140 + 141 + .timeline-meta { 142 + display: inline-flex; 143 + gap: 0.5rem; 144 + align-items: center; 145 + font-size: 0.75rem; 146 + color: var(--text-secondary); 147 + margin-bottom: 0.25rem; 148 + } 149 + 150 + .timeline-time { 151 + font-family: 'SF Mono', Monaco, Consolas, 'Courier New', monospace; 152 + font-size: 0.75rem; 153 + color: var(--text-secondary); 154 + } 155 + 156 + .timeline-author { 157 + font-weight: 600; 158 + color: var(--primary-color); 159 + font-size: 0.8rem; 160 + text-decoration: none; 161 + } 162 + 163 + .timeline-author:hover { 164 + color: var(--secondary-color); 165 + text-decoration: underline; 166 + } 167 + 168 + .timeline-content { 169 + line-height: 1.4; 170 + } 171 + 172 + .timeline-title { 173 + font-size: 0.95rem; 174 + font-weight: 600; 175 + } 176 + 177 + .timeline-title a { 178 + color: var(--primary-color); 179 + text-decoration: none; 180 + } 181 + 182 + .timeline-title a:hover { 183 + color: 
var(--secondary-color); 184 + text-decoration: underline; 185 + } 186 + 187 + .timeline-summary { 188 + color: var(--text-secondary); 189 + font-size: 0.9rem; 190 + line-height: 1.4; 191 + } 192 + 193 + /* Legacy styles for other sections */ 194 + .entry-meta, .thread-header { 195 + display: flex; 196 + gap: 1rem; 197 + align-items: center; 198 + margin-bottom: 0.5rem; 199 + font-size: 0.85rem; 200 + color: var(--text-secondary); 201 + } 202 + 203 + .author { 204 + font-weight: 600; 205 + color: var(--primary-color); 206 + } 207 + 208 + time { 209 + font-size: 0.85rem; 210 + } 211 + 212 + h4 { 213 + font-size: 1.1rem; 214 + font-weight: 600; 215 + margin-bottom: 0.5rem; 216 + } 217 + 218 + h4 a { 219 + color: var(--primary-color); 220 + text-decoration: none; 221 + } 222 + 223 + h4 a:hover { 224 + color: var(--secondary-color); 225 + text-decoration: underline; 226 + } 227 + 228 + .entry-summary { 229 + color: var(--text-primary); 230 + line-height: 1.5; 231 + margin-top: 0.5rem; 232 + } 233 + 234 + /* Enhanced Threading Styles */ 235 + 236 + /* Conversation Clusters */ 237 + .conversation-cluster { 238 + background-color: var(--background); 239 + border: 2px solid var(--border-color); 240 + border-radius: 8px; 241 + margin-bottom: 2rem; 242 + overflow: hidden; 243 + box-shadow: 0 2px 4px rgba(0, 0, 0, 0.05); 244 + } 245 + 246 + .conversation-header { 247 + background: linear-gradient(135deg, var(--surface) 0%, #f1f3f4 100%); 248 + padding: 0.75rem 1rem; 249 + border-bottom: 1px solid var(--border-color); 250 + } 251 + 252 + .conversation-meta { 253 + display: flex; 254 + justify-content: space-between; 255 + align-items: center; 256 + flex-wrap: wrap; 257 + gap: 0.5rem; 258 + } 259 + 260 + .conversation-count { 261 + font-weight: 600; 262 + color: var(--secondary-color); 263 + font-size: 0.9rem; 264 + } 265 + 266 + .conversation-participants { 267 + font-size: 0.8rem; 268 + color: var(--text-secondary); 269 + flex: 1; 270 + text-align: right; 271 + } 272 + 273 + 
.conversation-flow { 274 + padding: 0.5rem; 275 + } 276 + 277 + /* Threaded Conversation Entries */ 278 + .conversation-entry { 279 + position: relative; 280 + margin-bottom: 0.75rem; 281 + display: flex; 282 + align-items: flex-start; 283 + } 284 + 285 + .conversation-entry.level-0 { 286 + margin-left: 0; 287 + } 288 + 289 + .conversation-entry.level-1 { 290 + margin-left: 1.5rem; 291 + } 292 + 293 + .conversation-entry.level-2 { 294 + margin-left: 3rem; 295 + } 296 + 297 + .conversation-entry.level-3 { 298 + margin-left: 4.5rem; 299 + } 300 + 301 + .conversation-entry.level-4 { 302 + margin-left: 6rem; 303 + } 304 + 305 + .entry-connector { 306 + width: 3px; 307 + background-color: var(--secondary-color); 308 + margin-right: 0.75rem; 309 + margin-top: 0.25rem; 310 + min-height: 2rem; 311 + border-radius: 2px; 312 + opacity: 0.6; 313 + } 314 + 315 + .conversation-entry.level-0 .entry-connector { 316 + background-color: var(--accent-color); 317 + opacity: 0.8; 318 + } 319 + 320 + .entry-content { 321 + flex: 1; 322 + background-color: var(--surface); 323 + padding: 0.75rem; 324 + border-radius: 6px; 325 + border: 1px solid var(--border-color); 326 + transition: all 0.2s ease; 327 + } 328 + 329 + .entry-content:hover { 330 + border-color: var(--secondary-color); 331 + box-shadow: 0 2px 8px rgba(52, 152, 219, 0.1); 332 + } 333 + 334 + /* Reference Indicators */ 335 + .reference-indicators { 336 + display: inline-flex; 337 + gap: 0.25rem; 338 + margin-left: 0.5rem; 339 + } 340 + 341 + .ref-out, .ref-in { 342 + display: inline-block; 343 + width: 1rem; 344 + height: 1rem; 345 + border-radius: 50%; 346 + text-align: center; 347 + line-height: 1rem; 348 + font-size: 0.7rem; 349 + font-weight: bold; 350 + } 351 + 352 + .ref-out { 353 + background-color: #e8f5e8; 354 + color: #2d8f2d; 355 + } 356 + 357 + .ref-in { 358 + background-color: #e8f0ff; 359 + color: #1f5fbf; 360 + } 361 + 362 + /* Reference Badges for Individual Posts */ 363 + .timeline-entry.with-references { 
364 + background-color: var(--surface); 365 + } 366 + 367 + /* Conversation posts in unified timeline */ 368 + .timeline-entry.conversation-post { 369 + background: transparent; 370 + border: none; 371 + margin-bottom: 0.5rem; 372 + padding: 0.5rem 0.75rem; 373 + } 374 + 375 + .timeline-entry.conversation-post.level-0 { 376 + margin-left: 0; 377 + border-left: 2px solid var(--accent-color); 378 + padding-left: 0.75rem; 379 + } 380 + 381 + .timeline-entry.conversation-post.level-1 { 382 + margin-left: 1.5rem; 383 + border-left: 2px solid var(--secondary-color); 384 + padding-left: 0.75rem; 385 + } 386 + 387 + .timeline-entry.conversation-post.level-2 { 388 + margin-left: 3rem; 389 + border-left: 2px solid var(--text-secondary); 390 + padding-left: 0.75rem; 391 + } 392 + 393 + .timeline-entry.conversation-post.level-3 { 394 + margin-left: 4.5rem; 395 + border-left: 2px solid var(--text-secondary); 396 + padding-left: 0.75rem; 397 + } 398 + 399 + .timeline-entry.conversation-post.level-4 { 400 + margin-left: 6rem; 401 + border-left: 2px solid var(--text-secondary); 402 + padding-left: 0.75rem; 403 + } 404 + 405 + /* Cross-thread linking */ 406 + .cross-thread-links { 407 + margin-top: 0.5rem; 408 + padding-top: 0.5rem; 409 + border-top: 1px solid var(--border-color); 410 + } 411 + 412 + .cross-thread-indicator { 413 + font-size: 0.75rem; 414 + color: var(--text-secondary); 415 + background-color: var(--surface); 416 + padding: 0.25rem 0.5rem; 417 + border-radius: 12px; 418 + border: 1px solid var(--border-color); 419 + display: inline-block; 420 + } 421 + 422 + /* Inline shared references styling */ 423 + .inline-shared-refs { 424 + margin-left: 0.5rem; 425 + font-size: 0.85rem; 426 + color: var(--text-secondary); 427 + } 428 + 429 + .shared-ref-link { 430 + color: var(--primary-color); 431 + text-decoration: none; 432 + font-weight: 500; 433 + transition: color 0.2s ease; 434 + } 435 + 436 + .shared-ref-link:hover { 437 + color: var(--secondary-color); 438 + 
text-decoration: underline; 439 + } 440 + 441 + .shared-ref-more { 442 + font-style: italic; 443 + color: var(--text-secondary); 444 + font-size: 0.8rem; 445 + margin-left: 0.25rem; 446 + } 447 + 448 + .user-anchor, .post-anchor { 449 + position: absolute; 450 + margin-top: -60px; /* Offset for fixed header */ 451 + pointer-events: none; 452 + } 453 + 454 + .cross-thread-link { 455 + color: var(--primary-color); 456 + text-decoration: none; 457 + font-weight: 500; 458 + transition: color 0.2s ease; 459 + } 460 + 461 + .cross-thread-link:hover { 462 + color: var(--secondary-color); 463 + text-decoration: underline; 464 + } 465 + 466 + .reference-badges { 467 + display: flex; 468 + gap: 0.25rem; 469 + margin-left: 0.5rem; 470 + flex-wrap: wrap; 471 + } 472 + 473 + .ref-badge { 474 + display: inline-block; 475 + padding: 0.1rem 0.4rem; 476 + border-radius: 12px; 477 + font-size: 0.7rem; 478 + font-weight: 600; 479 + text-transform: uppercase; 480 + letter-spacing: 0.05em; 481 + } 482 + 483 + .ref-badge.ref-outbound { 484 + background-color: #e8f5e8; 485 + color: #2d8f2d; 486 + border: 1px solid #c3e6c3; 487 + } 488 + 489 + .ref-badge.ref-inbound { 490 + background-color: #e8f0ff; 491 + color: #1f5fbf; 492 + border: 1px solid #b3d9ff; 493 + } 494 + 495 + /* Author Color Coding */ 496 + .timeline-author { 497 + position: relative; 498 + } 499 + 500 + .timeline-author::before { 501 + content: ''; 502 + display: inline-block; 503 + width: 8px; 504 + height: 8px; 505 + border-radius: 50%; 506 + margin-right: 0.5rem; 507 + background-color: var(--secondary-color); 508 + } 509 + 510 + /* Generate consistent colors for authors */ 511 + .author-avsm::before { background-color: #e74c3c; } 512 + .author-mort::before { background-color: #3498db; } 513 + .author-mte::before { background-color: #2ecc71; } 514 + .author-ryan::before { background-color: #f39c12; } 515 + .author-mwd::before { background-color: #9b59b6; } 516 + .author-dra::before { background-color: #1abc9c; } 517 + 
.author-pf341::before { background-color: #34495e; } 518 + .author-sadiqj::before { background-color: #e67e22; } 519 + .author-martinkl::before { background-color: #8e44ad; } 520 + .author-jonsterling::before { background-color: #27ae60; } 521 + .author-jon::before { background-color: #f1c40f; } 522 + .author-onkar::before { background-color: #e91e63; } 523 + .author-gabriel::before { background-color: #00bcd4; } 524 + .author-jess::before { background-color: #ff5722; } 525 + .author-ibrahim::before { background-color: #607d8b; } 526 + .author-andres::before { background-color: #795548; } 527 + .author-eeg::before { background-color: #ff9800; } 528 + 529 + /* Section Headers */ 530 + .conversations-section h3, 531 + .referenced-posts-section h3, 532 + .individual-posts-section h3 { 533 + border-bottom: 2px solid var(--border-color); 534 + padding-bottom: 0.5rem; 535 + margin-bottom: 1.5rem; 536 + position: relative; 537 + } 538 + 539 + .conversations-section h3::before { 540 + content: "💬"; 541 + margin-right: 0.5rem; 542 + } 543 + 544 + .referenced-posts-section h3::before { 545 + content: "🔗"; 546 + margin-right: 0.5rem; 547 + } 548 + 549 + .individual-posts-section h3::before { 550 + content: "📝"; 551 + margin-right: 0.5rem; 552 + } 553 + 554 + /* Legacy thread styles (for backward compatibility) */ 555 + .thread { 556 + background-color: var(--background); 557 + border: 1px solid var(--border-color); 558 + padding: 0; 559 + overflow: hidden; 560 + margin-bottom: 1rem; 561 + } 562 + 563 + .thread-header { 564 + background-color: var(--surface); 565 + padding: 0.5rem 0.75rem; 566 + border-bottom: 1px solid var(--border-color); 567 + } 568 + 569 + .thread-count { 570 + font-weight: 600; 571 + color: var(--secondary-color); 572 + } 573 + 574 + .thread-entry { 575 + padding: 0.5rem 0.75rem; 576 + border-bottom: 1px solid var(--border-color); 577 + } 578 + 579 + .thread-entry:last-child { 580 + border-bottom: none; 581 + } 582 + 583 + .thread-entry.reply { 584 + 
margin-left: var(--thread-indent); 585 + border-left: 3px solid var(--secondary-color); 586 + background-color: var(--surface); 587 + } 588 + 589 + /* Links Section */ 590 + .link-group { 591 + background-color: var(--background); 592 + } 593 + 594 + .link-url { 595 + font-size: 1rem; 596 + word-break: break-word; 597 + } 598 + 599 + .link-url a { 600 + color: var(--secondary-color); 601 + text-decoration: none; 602 + } 603 + 604 + .link-url a:hover { 605 + text-decoration: underline; 606 + } 607 + 608 + .target-user { 609 + font-size: 0.9rem; 610 + color: var(--text-secondary); 611 + font-weight: normal; 612 + } 613 + 614 + .referencing-entries { 615 + margin-top: 0.75rem; 616 + } 617 + 618 + .ref-count { 619 + font-weight: 600; 620 + color: var(--text-secondary); 621 + font-size: 0.9rem; 622 + } 623 + 624 + .referencing-entries ul { 625 + list-style: none; 626 + margin-top: 0.5rem; 627 + padding-left: 1rem; 628 + } 629 + 630 + .referencing-entries li { 631 + margin-bottom: 0.25rem; 632 + font-size: 0.9rem; 633 + } 634 + 635 + .referencing-entries .more { 636 + font-style: italic; 637 + color: var(--text-secondary); 638 + } 639 + 640 + /* Users Section */ 641 + .user-card { 642 + background-color: var(--background); 643 + } 644 + 645 + .user-header { 646 + display: flex; 647 + gap: 1rem; 648 + align-items: start; 649 + margin-bottom: 1rem; 650 + } 651 + 652 + .user-icon { 653 + width: 48px; 654 + height: 48px; 655 + border-radius: 50%; 656 + object-fit: cover; 657 + } 658 + 659 + .user-info h3 { 660 + margin-bottom: 0.25rem; 661 + } 662 + 663 + .username { 664 + font-size: 0.9rem; 665 + color: var(--text-secondary); 666 + font-weight: normal; 667 + } 668 + 669 + .user-meta { 670 + font-size: 0.9rem; 671 + color: var(--text-secondary); 672 + } 673 + 674 + .user-meta a { 675 + color: var(--secondary-color); 676 + text-decoration: none; 677 + } 678 + 679 + .user-meta a:hover { 680 + text-decoration: underline; 681 + } 682 + 683 + .separator { 684 + margin: 0 0.5rem; 
685 + } 686 + 687 + .post-count { 688 + font-weight: 600; 689 + } 690 + 691 + .user-recent h4 { 692 + font-size: 0.95rem; 693 + margin-bottom: 0.5rem; 694 + color: var(--text-secondary); 695 + } 696 + 697 + .user-recent ul { 698 + list-style: none; 699 + padding-left: 0; 700 + } 701 + 702 + .user-recent li { 703 + margin-bottom: 0.25rem; 704 + font-size: 0.9rem; 705 + } 706 + 707 + /* Footer */ 708 + .site-footer { 709 + max-width: var(--max-width); 710 + margin: 3rem auto 2rem; 711 + padding: 1rem 2rem; 712 + text-align: center; 713 + color: var(--text-secondary); 714 + font-size: 0.85rem; 715 + border-top: 1px solid var(--border-color); 716 + } 717 + 718 + .site-footer a { 719 + color: var(--secondary-color); 720 + text-decoration: none; 721 + } 722 + 723 + .site-footer a:hover { 724 + text-decoration: underline; 725 + } 726 + 727 + /* Responsive */ 728 + @media (max-width: 768px) { 729 + .site-title { 730 + font-size: 1.3rem; 731 + } 732 + 733 + .header-content { 734 + flex-direction: column; 735 + gap: 0.75rem; 736 + align-items: flex-start; 737 + } 738 + 739 + .site-nav { 740 + gap: 1rem; 741 + } 742 + 743 + .main-content { 744 + padding: 0 1rem; 745 + } 746 + 747 + .thread-entry.reply { 748 + margin-left: calc(var(--thread-indent) / 2); 749 + } 750 + 751 + .user-header { 752 + flex-direction: column; 753 + } 754 + }
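The "Generate consistent colors for authors" rules in the stylesheet above are written out by hand, one per username. A minimal sketch of how such rules could instead be derived from the username, assuming a hash-based palette lookup; `PALETTE`, `author_color` and `author_rule` are hypothetical names that do not appear in the diff, and the colors this produces will not match the hand-assigned ones:

```python
import hashlib

# Illustrative palette: the first six colors used by the hand-written rules.
PALETTE = ["#e74c3c", "#3498db", "#2ecc71", "#f39c12", "#9b59b6", "#1abc9c"]


def author_color(username: str) -> str:
    """Pick a palette color from a stable hash of the username."""
    # hashlib is used instead of hash() because hash() is salted per process,
    # so it would not give the same color across runs.
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    return PALETTE[digest[0] % len(PALETTE)]


def author_rule(username: str) -> str:
    """Render one CSS rule in the same shape as the rules in the stylesheet."""
    return f".author-{username}::before {{ background-color: {author_color(username)}; }}"


for name in ("avsm", "mort", "mte"):
    print(author_rule(name))
```

With a palette this small, two authors can land on the same color, which is presumably one reason to keep explicit per-author rules.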

+141

src/thicket/templates/timeline.html

··· 1 + {% extends "base.html" %} 2 + 3 + {% block page_title %}Timeline - {{ title }}{% endblock %} 4 + 5 + {% block content %} 6 + {% set seen_users = [] %} 7 + <div class="page-content"> 8 + <h2>Recent Posts & Conversations</h2> 9 + 10 + <section class="unified-timeline"> 11 + {% for item in timeline_items %} 12 + {% if item.type == "post" %} 13 +  14 + <article class="timeline-entry {% if item.content.references %}with-references{% endif %}"> 15 + <div class="timeline-meta"> 16 + <time datetime="{{ item.content.entry.updated or item.content.entry.published }}" class="timeline-time"> 17 + {{ (item.content.entry.updated or item.content.entry.published).strftime('%Y-%m-%d %H:%M') }} 18 + </time> 19 + {% set homepage = get_user_homepage(item.content.username) %} 20 + {% if item.content.username not in seen_users %} 21 + <a id="{{ item.content.username }}" class="user-anchor"></a> 22 + {% set _ = seen_users.append(item.content.username) %} 23 + {% endif %} 24 + <a id="post-{{ loop.index0 }}-{{ safe_anchor_id(item.content.entry.id) }}" class="post-anchor"></a> 25 + {% if homepage %} 26 + <a href="{{ homepage }}" target="_blank" class="timeline-author">{{ item.content.display_name }}</a> 27 + {% else %} 28 + <span class="timeline-author">{{ item.content.display_name }}</span> 29 + {% endif %} 30 + {% if item.content.references %} 31 + <div class="reference-badges"> 32 + {% for ref in item.content.references %} 33 + {% if ref.type == 'outbound' %} 34 + <span class="ref-badge ref-outbound" title="References {{ ref.target_username or 'external post' }}"> 35 + → {{ ref.target_username or 'ext' }} 36 + </span> 37 + {% elif ref.type == 'inbound' %} 38 + <span class="ref-badge ref-inbound" title="Referenced by {{ ref.source_username or 'external post' }}"> 39 + ← {{ ref.source_username or 'ext' }} 40 + </span> 41 + {% endif %} 42 + {% endfor %} 43 + </div> 44 + {% endif %} 45 + </div> 46 + <div class="timeline-content"> 47 + <strong class="timeline-title"> 48 + <a href="{{ 
item.content.entry.link }}" target="_blank">{{ item.content.entry.title }}</a> 49 + </strong> 50 + {% if item.content.entry.summary %} 51 + <span class="timeline-summary">— {{ clean_html_summary(item.content.entry.summary, 250) }}</span> 52 + {% endif %} 53 + {% if item.content.shared_references %} 54 + <span class="inline-shared-refs"> 55 + {% for ref in item.content.shared_references[:3] %} 56 + {% if ref.target_username %} 57 + <a href="#{{ ref.target_username }}" class="shared-ref-link" title="Referenced by {{ ref.count }} entries">@{{ ref.target_username }}</a>{% if not loop.last %}, {% endif %} 58 + {% endif %} 59 + {% endfor %} 60 + {% if item.content.shared_references|length > 3 %} 61 + <span class="shared-ref-more">+{{ item.content.shared_references|length - 3 }} more</span> 62 + {% endif %} 63 + </span> 64 + {% endif %} 65 + {% if item.content.cross_thread_links %} 66 + <div class="cross-thread-links"> 67 + <span class="cross-thread-indicator">🔗 Also appears: </span> 68 + {% for link in item.content.cross_thread_links %} 69 + <a href="#{{ link.anchor_id }}" class="cross-thread-link" title="{{ link.title }}">{{ link.context }}</a>{% if not loop.last %}, {% endif %} 70 + {% endfor %} 71 + </div> 72 + {% endif %} 73 + </div> 74 + </article> 75 + 76 + {% elif item.type == "thread" %} 77 +  78 + {% set outer_loop_index = loop.index0 %} 79 + {% for thread_item in item.content %} 80 + <article class="timeline-entry conversation-post level-{{ thread_item.thread_level }}"> 81 + <div class="timeline-meta"> 82 + <time datetime="{{ thread_item.entry.updated or thread_item.entry.published }}" class="timeline-time"> 83 + {{ (thread_item.entry.updated or thread_item.entry.published).strftime('%Y-%m-%d %H:%M') }} 84 + </time> 85 + {% set homepage = get_user_homepage(thread_item.username) %} 86 + {% if thread_item.username not in seen_users %} 87 + <a id="{{ thread_item.username }}" class="user-anchor"></a> 88 + {% set _ = seen_users.append(thread_item.username) %} 89 + 
{% endif %} 90 + <a id="post-{{ outer_loop_index }}-{{ loop.index0 }}-{{ safe_anchor_id(thread_item.entry.id) }}" class="post-anchor"></a> 91 + {% if homepage %} 92 + <a href="{{ homepage }}" target="_blank" class="timeline-author author-{{ thread_item.username }}">{{ thread_item.display_name }}</a> 93 + {% else %} 94 + <span class="timeline-author author-{{ thread_item.username }}">{{ thread_item.display_name }}</span> 95 + {% endif %} 96 + {% if thread_item.references_to or thread_item.referenced_by %} 97 + <span class="reference-indicators"> 98 + {% if thread_item.references_to %} 99 + <span class="ref-out" title="References other posts">→</span> 100 + {% endif %} 101 + {% if thread_item.referenced_by %} 102 + <span class="ref-in" title="Referenced by other posts">←</span> 103 + {% endif %} 104 + </span> 105 + {% endif %} 106 + </div> 107 + <div class="timeline-content"> 108 + <strong class="timeline-title"> 109 + <a href="{{ thread_item.entry.link }}" target="_blank">{{ thread_item.entry.title }}</a> 110 + </strong> 111 + {% if thread_item.entry.summary %} 112 + <span class="timeline-summary">— {{ clean_html_summary(thread_item.entry.summary, 300) }}</span> 113 + {% endif %} 114 + {% if thread_item.shared_references %} 115 + <span class="inline-shared-refs"> 116 + {% for ref in thread_item.shared_references[:3] %} 117 + {% if ref.target_username %} 118 + <a href="#{{ ref.target_username }}" class="shared-ref-link" title="Referenced by {{ ref.count }} entries">@{{ ref.target_username }}</a>{% if not loop.last %}, {% endif %} 119 + {% endif %} 120 + {% endfor %} 121 + {% if thread_item.shared_references|length > 3 %} 122 + <span class="shared-ref-more">+{{ thread_item.shared_references|length - 3 }} more</span> 123 + {% endif %} 124 + </span> 125 + {% endif %} 126 + {% if thread_item.cross_thread_links %} 127 + <div class="cross-thread-links"> 128 + <span class="cross-thread-indicator">🔗 Also appears: </span> 129 + {% for link in thread_item.cross_thread_links %} 
130 + <a href="#{{ link.anchor_id }}" class="cross-thread-link" title="{{ link.title }}">{{ link.context }}</a>{% if not loop.last %}, {% endif %} 131 + {% endfor %} 132 + </div> 133 + {% endif %} 134 + </div> 135 + </article> 136 + {% endfor %} 137 + {% endif %} 138 + {% endfor %} 139 + </section> 140 + </div> 141 + {% endblock %}
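The timeline template calls two helpers, `safe_anchor_id` and `clean_html_summary`, whose implementations are not part of this diff. The following is a sketch of what helpers with those names and call shapes might do, using only the standard library; the behavior (character set, truncation marker) is an assumption, not the project's actual code:

```python
import html
import re


def safe_anchor_id(entry_id: str) -> str:
    """Turn an entry ID (often a URL) into a string usable in an HTML id."""
    # Assumption: every run of characters outside [A-Za-z0-9_-] becomes one hyphen.
    return re.sub(r"[^A-Za-z0-9_-]+", "-", entry_id).strip("-")


def clean_html_summary(summary: str, max_length: int) -> str:
    """Strip tags, decode entities, collapse whitespace, then truncate."""
    text = re.sub(r"<[^>]+>", " ", summary)   # crude tag removal, not a full HTML parser
    text = html.unescape(text)                # &amp; -> &
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    if len(text) <= max_length:
        return text
    return text[:max_length].rstrip() + "..."


print(safe_anchor_id("https://example.com/entry/1"))
print(clean_html_summary("<p>Hello &amp; <b>world</b></p>", 250))
```

Because the second helper returns decoded text, a template using it would still rely on Jinja autoescaping (or an explicit `|e` filter) to re-escape characters such as `&` and `<` on output.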

+169

src/thicket/templates/user_detail.html

··· 1 + {% extends "base.html" %} 2 + 3 + {% block title %}{{ title }} - Thicket{% endblock %} 4 + 5 + {% block content %} 6 + <div class="container mx-auto px-4 py-8"> 7 + <div class="max-w-4xl mx-auto"> 8 +  9 + <div class="bg-white rounded-lg shadow-md p-6 mb-6"> 10 + <div class="flex items-center space-x-4"> 11 + {% if user_config and user_config.icon %} 12 + <img src="{{ user_config.icon }}" alt="{{ title }}" class="w-16 h-16 rounded-full"> 13 + {% else %} 14 + <div class="w-16 h-16 rounded-full bg-blue-500 flex items-center justify-center text-white text-xl font-bold"> 15 + {{ user_metadata.username[0].upper() }} 16 + </div> 17 + {% endif %} 18 + 19 + <div> 20 + <h1 class="text-2xl font-bold text-gray-900">{{ title }}</h1> 21 + <p class="text-gray-600">@{{ user_metadata.username }}</p> 22 + {% if user_config and user_config.email %} 23 + <p class="text-sm text-gray-500">{{ user_config.email }}</p> 24 + {% endif %} 25 + </div> 26 + </div> 27 + 28 + {% if user_config and user_config.homepage %} 29 + <div class="mt-4"> 30 + <a href="{{ user_config.homepage }}" class="text-blue-600 hover:text-blue-800" target="_blank"> 31 + 🏠 Homepage 32 + </a> 33 + </div> 34 + {% endif %} 35 + 36 + <div class="mt-4 grid grid-cols-2 md:grid-cols-4 gap-4"> 37 + <div class="text-center"> 38 + <div class="text-2xl font-bold text-blue-600">{{ user_metadata.entry_count }}</div> 39 + <div class="text-sm text-gray-500">Entries</div> 40 + </div> 41 + 42 + {% if user_config %} 43 + <div class="text-center"> 44 + <div class="text-2xl font-bold text-green-600">{{ user_config.feeds|length }}</div> 45 + <div class="text-sm text-gray-500">Feeds</div> 46 + </div> 47 + {% endif %} 48 + 49 + <div class="text-center"> 50 + <div class="text-2xl font-bold text-purple-600">{{ user_links|length }}</div> 51 + <div class="text-sm text-gray-500">Link Groups</div> 52 + </div> 53 + 54 + <div class="text-center"> 55 + <div class="text-sm text-gray-500">Member since</div> 56 + <div class="text-sm 
font-medium">{{ user_metadata.created.strftime('%Y-%m-%d') if user_metadata.created else 'Unknown' }}</div> 57 + </div> 58 + </div> 59 + </div> 60 + 61 +  62 + {% if user_config and user_config.feeds %} 63 + <div class="bg-white rounded-lg shadow-md p-6 mb-6"> 64 + <h2 class="text-xl font-semibold mb-4">Feeds</h2> 65 + <div class="space-y-2"> 66 + {% for feed in user_config.feeds %} 67 + <div class="flex items-center space-x-2"> 68 + <span class="text-green-500">📡</span> 69 + <a href="{{ feed }}" class="text-blue-600 hover:text-blue-800" target="_blank">{{ feed }}</a> 70 + </div> 71 + {% endfor %} 72 + </div> 73 + </div> 74 + {% endif %} 75 + 76 +  77 + <div class="bg-white rounded-lg shadow-md p-6 mb-6"> 78 + <h2 class="text-xl font-semibold mb-4">Recent Entries</h2> 79 + 80 + {% if entries %} 81 + <div class="space-y-4"> 82 + {% for entry in entries[:10] %} 83 + <div class="border-l-4 border-blue-500 pl-4 py-2"> 84 + <h3 class="font-semibold text-lg"> 85 + <a href="{{ entry.link }}" class="text-blue-600 hover:text-blue-800" target="_blank"> 86 + {{ entry.title }} 87 + </a> 88 + </h3> 89 + 90 + <div class="text-sm text-gray-500 mb-2"> 91 + {% if entry.published %} 92 + Published: {{ entry.published.strftime('%Y-%m-%d %H:%M') }} 93 + {% endif %} 94 + {% if entry.updated and entry.updated != entry.published %} 95 + • Updated: {{ entry.updated.strftime('%Y-%m-%d %H:%M') }} 96 + {% endif %} 97 + </div> 98 + 99 + {% if entry.summary %} 100 + <div class="text-gray-700 mb-2"> 101 + {{ entry.summary|truncate(200) }} 102 + </div> 103 + {% endif %} 104 + 105 + {% if entry.categories %} 106 + <div class="flex flex-wrap gap-1"> 107 + {% for category in entry.categories %} 108 + <span class="px-2 py-1 bg-blue-100 text-blue-800 text-xs rounded">{{ category }}</span> 109 + {% endfor %} 110 + </div> 111 + {% endif %} 112 + </div> 113 + {% endfor %} 114 + </div> 115 + 116 + {% if entries|length > 10 %} 117 + <div class="mt-4 text-center"> 118 + <p class="text-gray-500">Showing 10 
of {{ entries|length }} entries</p> 119 + </div> 120 + {% endif %} 121 + 122 + {% else %} 123 + <p class="text-gray-500">No entries found.</p> 124 + {% endif %} 125 + </div> 126 + 127 +  128 + {% if user_links %} 129 + <div class="bg-white rounded-lg shadow-md p-6"> 130 + <h2 class="text-xl font-semibold mb-4">Link Activity</h2> 131 + 132 + <div class="space-y-3"> 133 + {% for link_group in user_links[:5] %} 134 + <div class="border-l-4 border-green-500 pl-4"> 135 + <h3 class="font-medium">{{ link_group.title }}</h3> 136 + <div class="text-sm text-gray-500 mb-2"> 137 + {{ link_group.links|length }} link(s) found 138 + </div> 139 + 140 + <div class="space-y-1"> 141 + {% for link in link_group.links[:3] %} 142 + <div class="text-sm"> 143 + <a href="{{ link.url }}" class="text-blue-600 hover:text-blue-800" target="_blank"> 144 + {{ link.text or link.url }} 145 + </a> 146 + <span class="text-gray-400 ml-2">({{ link.type }})</span> 147 + </div> 148 + {% endfor %} 149 + 150 + {% if link_group.links|length > 3 %} 151 + <div class="text-sm text-gray-500"> 152 + ... and {{ link_group.links|length - 3 }} more 153 + </div> 154 + {% endif %} 155 + </div> 156 + </div> 157 + {% endfor %} 158 + </div> 159 + 160 + {% if user_links|length > 5 %} 161 + <div class="mt-4 text-center"> 162 + <p class="text-gray-500">Showing 5 of {{ user_links|length }} entries with links</p> 163 + </div> 164 + {% endif %} 165 + </div> 166 + {% endif %} 167 + </div> 168 + </div> 169 + {% endblock %}

+57

src/thicket/templates/users.html

··· 1 + {% extends "base.html" %} 2 + 3 + {% block page_title %}Users - {{ title }}{% endblock %} 4 + 5 + {% block content %} 6 + <div class="page-content"> 7 + <h2>Users</h2> 8 + <p class="page-description">All users contributing to this thicket, ordered by post count.</p> 9 + 10 + {% for user_info in users %} 11 + <article class="user-card"> 12 + <div class="user-header"> 13 + {% if user_info.metadata.icon and user_info.metadata.icon != "None" %} 14 + <img src="{{ user_info.metadata.icon }}" alt="{{ user_info.metadata.username }}" class="user-icon"> 15 + {% endif %} 16 + <div class="user-info"> 17 + <h3> 18 + {% if user_info.metadata.display_name %} 19 + {{ user_info.metadata.display_name }} 20 + <span class="username">({{ user_info.metadata.username }})</span> 21 + {% else %} 22 + {{ user_info.metadata.username }} 23 + {% endif %} 24 + </h3> 25 + <div class="user-meta"> 26 + {% if user_info.metadata.homepage %} 27 + <a href="{{ user_info.metadata.homepage }}" target="_blank">{{ user_info.metadata.homepage }}</a> 28 + {% endif %} 29 + {% if user_info.metadata.email %} 30 + <span class="separator">•</span> 31 + <a href="mailto:{{ user_info.metadata.email }}">{{ user_info.metadata.email }}</a> 32 + {% endif %} 33 + <span class="separator">•</span> 34 + <span class="post-count">{{ user_info.metadata.entry_count }} posts</span> 35 + </div> 36 + </div> 37 + </div> 38 + 39 + {% if user_info.recent_entries %} 40 + <div class="user-recent"> 41 + <h4>Recent posts:</h4> 42 + <ul> 43 + {% for display_name, entry in user_info.recent_entries %} 44 + <li> 45 + <a href="{{ entry.link }}" target="_blank">{{ entry.title }}</a> 46 + <time datetime="{{ entry.updated or entry.published }}"> 47 + ({{ (entry.updated or entry.published).strftime('%Y-%m-%d') }}) 48 + </time> 49 + </li> 50 + {% endfor %} 51 + </ul> 52 + </div> 53 + {% endif %} 54 + </article> 55 + {% endfor %} 56 + </div> 57 + {% endblock %}

+230

src/thicket/thicket.py

···
+ """Main Thicket library class providing unified API."""
+
+ import asyncio
+ from datetime import datetime
+ from pathlib import Path
+ from typing import Optional, Union
+
+ from pydantic import HttpUrl
+
+ from .core.feed_parser import FeedParser
+ from .core.git_store import GitStore
+ from .models import AtomEntry, ThicketConfig, UserConfig
+ from .subsystems.feeds import FeedManager
+ from .subsystems.links import LinkProcessor
+ from .subsystems.repository import RepositoryManager
+ from .subsystems.site import SiteGenerator
+ from .subsystems.users import UserManager
+
+
+ class Thicket:
+     """
+     Main Thicket class providing unified API for feed management.
+
+     This class serves as the primary interface for all Thicket operations,
+     consolidating configuration, repository management, feed processing,
+     user management, link processing, and site generation.
+     """
+
+     def __init__(self, config: Union[ThicketConfig, Path, str]):
+         """
+         Initialize Thicket with configuration.
+
+         Args:
+             config: Either a ThicketConfig object, or path to config file
+         """
+         if isinstance(config, (Path, str)):
+             self.config = ThicketConfig.from_file(Path(config))
+         else:
+             self.config = config
+
+         # Initialize subsystems
+         self._init_subsystems()
+
+     def _init_subsystems(self):
+         """Initialize all subsystems."""
+         # Core components
+         self.git_store = GitStore(self.config.git_store)
+         self.feed_parser = FeedParser()
+
+         # Subsystem managers
+         self.repository = RepositoryManager(self.git_store, self.config)
+         self.users = UserManager(self.git_store, self.config)
+         self.feeds = FeedManager(self.git_store, self.feed_parser, self.config)
+         self.links = LinkProcessor(self.git_store, self.config)
+         self.site = SiteGenerator(self.git_store, self.config)
+
+     @classmethod
+     def create(cls, git_store: Path, cache_dir: Path, users: Optional[list[UserConfig]] = None) -> 'Thicket':
+         """
+         Create a new Thicket instance with minimal configuration.
+
+         Args:
+             git_store: Path to git repository
+             cache_dir: Path to cache directory
+             users: Optional list of user configurations
+
+         Returns:
+             Configured Thicket instance
+         """
+         config = ThicketConfig(
+             git_store=git_store,
+             cache_dir=cache_dir,
+             users=users or []
+         )
+         return cls(config)
+
+     @classmethod
+     def from_config_file(cls, config_path: Path) -> 'Thicket':
+         """Load Thicket from configuration file."""
+         return cls(config_path)
+
+     # User Management API
+     def add_user(self, username: str, feeds: list[str], **kwargs) -> UserConfig:
+         """Add a new user with feeds."""
+         return self.users.add_user(username, feeds, **kwargs)
+
+     def get_user(self, username: str) -> Optional[UserConfig]:
+         """Get user configuration."""
+         return self.users.get_user(username)
+
+     def list_users(self) -> list[UserConfig]:
+         """List all configured users."""
+         return self.users.list_users()
+
+     def update_user(self, username: str, **kwargs) -> bool:
+         """Update user configuration."""
+         return self.users.update_user(username, **kwargs)
+
+     def remove_user(self, username: str) -> bool:
+         """Remove a user and their data."""
+         return self.users.remove_user(username)
+
+     # Feed Management API
+     async def sync_feeds(self, username: Optional[str] = None, progress_callback=None) -> dict:
+         """Sync feeds for user(s)."""
+         return await self.feeds.sync_feeds(username, progress_callback)
+
+     async def sync_user_feeds(self, username: str, progress_callback=None) -> dict:
+         """Sync feeds for a specific user."""
+         return await self.feeds.sync_user_feeds(username, progress_callback)
+
+     def get_entries(self, username: str, limit: Optional[int] = None) -> list[AtomEntry]:
+         """Get entries for a user."""
+         return self.feeds.get_entries(username, limit)
+
+     def get_entry(self, username: str, entry_id: str) -> Optional[AtomEntry]:
+         """Get a specific entry."""
+         return self.feeds.get_entry(username, entry_id)
+
+     def search_entries(self, query: str, username: Optional[str] = None, limit: Optional[int] = None) -> list[tuple[str, AtomEntry]]:
+         """Search entries across users."""
+         return self.feeds.search_entries(query, username, limit)
+
+     # Repository Management API
+     def init_repository(self) -> bool:
+         """Initialize the git repository."""
+         return self.repository.init_repository()
+
+     def commit_changes(self, message: str) -> bool:
+         """Commit all pending changes."""
+         return self.repository.commit_changes(message)
+
+     def get_status(self) -> dict:
+         """Get repository status and statistics."""
+         return self.repository.get_status()
+
+     def backup_repository(self, backup_path: Path) -> bool:
+         """Create a backup of the repository."""
+         return self.repository.backup_repository(backup_path)
+
+     # Link Processing API
+     def process_links(self, username: Optional[str] = None) -> dict:
+         """Process and extract links from entries."""
+         return self.links.process_links(username)
+
+     def get_links(self, username: Optional[str] = None) -> dict:
+         """Get processed links."""
+         return self.links.get_links(username)
+
+     def find_references(self, url: str) -> list[tuple[str, AtomEntry]]:
+         """Find entries that reference a URL."""
+         return self.links.find_references(url)
+
+     # Site Generation API
+     def generate_site(self, output_dir: Path, template_dir: Optional[Path] = None) -> bool:
+         """Generate static site."""
+         return self.site.generate_site(output_dir, template_dir)
+
+     def generate_timeline(self, output_path: Path, limit: Optional[int] = None) -> bool:
+         """Generate timeline HTML."""
+         return self.site.generate_timeline(output_path, limit)
+
+     def generate_user_pages(self, output_dir: Path) -> bool:
+         """Generate individual user pages."""
+         return self.site.generate_user_pages(output_dir)
+
+     # Utility Methods
+     def get_stats(self) -> dict:
+         """Get comprehensive statistics."""
+         base_stats = self.repository.get_status()
+         feed_stats = self.feeds.get_stats()
+         link_stats = self.links.get_stats()
+
+         return {
+             **base_stats,
+             **feed_stats,
+             **link_stats,
+             'config': {
+                 'git_store': str(self.config.git_store),
+                 'cache_dir': str(self.config.cache_dir),
+                 'total_users_configured': len(self.config.users),
+             }
+         }
+
+     async def full_sync(self, progress_callback=None) -> dict:
+         """Perform a complete sync: feeds -> links -> commit."""
+         results = {}
+
+         # Sync feeds
+         results['feeds'] = await self.sync_feeds(progress_callback=progress_callback)
+
+         # Process links
+         results['links'] = self.process_links()
+
+         # Commit changes
+         message = f"Sync completed at {datetime.now().isoformat()}"
+         results['committed'] = self.commit_changes(message)
+
+         return results
+
+     def validate_config(self) -> list[str]:
+         """Validate configuration and return any errors."""
+         errors = []
+
+         # Check paths exist
+         if not self.config.git_store.parent.exists():
+             errors.append(f"Git store parent directory does not exist: {self.config.git_store.parent}")
+
+         if not self.config.cache_dir.parent.exists():
+             errors.append(f"Cache directory parent does not exist: {self.config.cache_dir.parent}")
+
+         # Validate user configs
+         for user in self.config.users:
+             if not user.feeds:
+                 errors.append(f"User {user.username} has no feeds configured")
+
+             for feed_url in user.feeds:
+                 # Basic URL validation is handled by pydantic
+                 pass
+
+         return errors
+
+     def __enter__(self):
+         """Context manager entry."""
+         return self
+
+     def __exit__(self, exc_type, exc_val, exc_tb):
+         """Context manager exit."""
+         # Could add cleanup logic here if needed
+         pass
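`full_sync` chains three steps in a fixed order: an awaited feed sync, a synchronous link pass, and a commit. The sketch below reproduces only that ordering with a hypothetical `StubThicket` class so it can run without the `thicket` package; the stub's return values (`new_entries`, `links_found`) are invented for illustration and are not the real subsystems' result shapes:

```python
import asyncio
from datetime import datetime


class StubThicket:
    """Hypothetical stand-in for Thicket that records the order of sync steps."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def sync_feeds(self, progress_callback=None) -> dict:
        self.calls.append("feeds")
        if progress_callback is not None:
            progress_callback("feeds synced")
        return {"new_entries": 0}  # invented result shape

    def process_links(self) -> dict:
        self.calls.append("links")
        return {"links_found": 0}  # invented result shape

    def commit_changes(self, message: str) -> bool:
        self.calls.append("commit")
        return True

    async def full_sync(self, progress_callback=None) -> dict:
        # Same three steps, in the same order, as the method in the diff.
        results = {}
        results["feeds"] = await self.sync_feeds(progress_callback=progress_callback)
        results["links"] = self.process_links()
        message = f"Sync completed at {datetime.now().isoformat()}"
        results["committed"] = self.commit_changes(message)
        return results


stub = StubThicket()
progress: list[str] = []
# full_sync is a coroutine, so synchronous callers need an event loop to drive it.
results = asyncio.run(stub.full_sync(progress_callback=progress.append))
print(stub.calls, results["committed"])
```

Since only `sync_feeds` is awaited, the link pass and the commit run as blocking calls on the event loop; that is fine for a cron-style run but worth keeping in mind if the method is called from a long-lived async service.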

tests/__init__.py

This is a binary file and will not be displayed.

-84

tests/conftest.py

···
- """Test configuration and fixtures for thicket."""
-
- import tempfile
- from pathlib import Path
-
- import pytest
-
- from thicket.models import ThicketConfig, UserConfig
-
-
- @pytest.fixture
- def temp_dir():
-     """Create a temporary directory for tests."""
-     with tempfile.TemporaryDirectory() as tmp_dir:
-         yield Path(tmp_dir)
-
-
- @pytest.fixture
- def sample_config(temp_dir):
-     """Create a sample configuration for testing."""
-     git_store = temp_dir / "git_store"
-     cache_dir = temp_dir / "cache"
-
-     return ThicketConfig(
-         git_store=git_store,
-         cache_dir=cache_dir,
-         users=[
-             UserConfig(
-                 username="testuser",
-                 feeds=["https://example.com/feed.xml"],
-                 email="test@example.com",
-                 display_name="Test User",
-             )
-         ],
-     )
-
-
- @pytest.fixture
- def sample_atom_feed():
-     """Sample Atom feed XML for testing."""
-     return """<?xml version="1.0" encoding="utf-8"?>
- <feed xmlns="http://www.w3.org/2005/Atom">
-     <title>Test Feed</title>
-     <link href="https://example.com/"/>
-     <updated>2025-01-01T00:00:00Z</updated>
-     <author>
-         <name>Test Author</name>
-         <email>author@example.com</email>
-     </author>
-     <id>https://example.com/</id>
-
-     <entry>
-         <title>Test Entry</title>
-         <link href="https://example.com/entry/1"/>
-         <id>https://example.com/entry/1</id>
-         <updated>2025-01-01T00:00:00Z</updated>
-         <summary>This is a test entry.</summary>
-         <content type="html">
-             <![CDATA[<p>This is the content of the test entry.</p>]]>
-         </content>
-     </entry>
- </feed>"""
-
-
- @pytest.fixture
- def sample_rss_feed():
-     """Sample RSS feed XML for testing."""
-     return """<?xml version="1.0" encoding="UTF-8"?>
- <rss version="2.0">
-     <channel>
-         <title>Test RSS Feed</title>
-         <link>https://example.com/</link>
-         <description>Test RSS feed for testing</description>
-         <managingEditor>editor@example.com</managingEditor>
-
-         <item>
-             <title>Test RSS Entry</title>
-             <link>https://example.com/rss/entry/1</link>
-             <description>This is a test RSS entry.</description>
-             <pubDate>Mon, 01 Jan 2025 00:00:00 GMT</pubDate>
-             <guid>https://example.com/rss/entry/1</guid>
-         </item>
-     </channel>
- </rss>"""

-297

tests/test_bot.py

··· 1 - """Tests for the Thicket Zulip bot.""" 2 - 3 - import pytest 4 - 5 - from thicket.bots.test_bot import ( 6 - BotTester, 7 - MockBotHandler, 8 - create_test_entry, 9 - create_test_message, 10 - ) 11 - from thicket.bots.thicket_bot import ThicketBotHandler 12 - 13 - 14 - class TestThicketBot: 15 - """Test suite for ThicketBotHandler.""" 16 - 17 - def setup_method(self) -> None: 18 - """Set up test environment.""" 19 - self.bot = ThicketBotHandler() 20 - self.handler = MockBotHandler() 21 - 22 - def test_usage(self) -> None: 23 - """Test bot usage message.""" 24 - usage = self.bot.usage() 25 - assert "Thicket Feed Bot" in usage 26 - assert "@thicket status" in usage 27 - assert "@thicket config" in usage 28 - 29 - def test_help_command(self) -> None: 30 - """Test help command response.""" 31 - message = create_test_message("@thicket help") 32 - self.bot.handle_message(message, self.handler) 33 - 34 - assert len(self.handler.sent_messages) == 1 35 - response = self.handler.sent_messages[0]["content"] 36 - assert "Thicket Feed Bot" in response 37 - 38 - def test_status_command_unconfigured(self) -> None: 39 - """Test status command when bot is not configured.""" 40 - message = create_test_message("@thicket status") 41 - self.bot.handle_message(message, self.handler) 42 - 43 - assert len(self.handler.sent_messages) == 1 44 - response = self.handler.sent_messages[0]["content"] 45 - assert "Not configured" in response 46 - assert "Stream:" in response 47 - assert "Topic:" in response 48 - 49 - def test_config_stream_command(self) -> None: 50 - """Test setting stream configuration.""" 51 - message = create_test_message("@thicket config stream general") 52 - self.bot.handle_message(message, self.handler) 53 - 54 - assert len(self.handler.sent_messages) == 1 55 - response = self.handler.sent_messages[0]["content"] 56 - assert "Stream set to: **general**" in response 57 - assert self.bot.stream_name == "general" 58 - 59 - def test_config_topic_command(self) -> None: 60 
- """Test setting topic configuration.""" 61 - message = create_test_message("@thicket config topic 'Feed Updates'") 62 - self.bot.handle_message(message, self.handler) 63 - 64 - assert len(self.handler.sent_messages) == 1 65 - response = self.handler.sent_messages[0]["content"] 66 - assert "Topic set to:" in response and "Feed Updates" in response 67 - assert self.bot.topic_name == "'Feed Updates'" 68 - 69 - def test_config_interval_command(self) -> None: 70 - """Test setting sync interval.""" 71 - message = create_test_message("@thicket config interval 600") 72 - self.bot.handle_message(message, self.handler) 73 - 74 - assert len(self.handler.sent_messages) == 1 75 - response = self.handler.sent_messages[0]["content"] 76 - assert "Sync interval set to: **600s**" in response 77 - assert self.bot.sync_interval == 600 78 - 79 - def test_config_interval_too_small(self) -> None: 80 - """Test setting sync interval that's too small.""" 81 - message = create_test_message("@thicket config interval 30") 82 - self.bot.handle_message(message, self.handler) 83 - 84 - assert len(self.handler.sent_messages) == 1 85 - response = self.handler.sent_messages[0]["content"] 86 - assert "must be at least 60 seconds" in response 87 - assert self.bot.sync_interval != 30 88 - 89 - def test_config_path_nonexistent(self) -> None: 90 - """Test setting config path that doesn't exist.""" 91 - message = create_test_message("@thicket config path /nonexistent/config.yaml") 92 - self.bot.handle_message(message, self.handler) 93 - 94 - assert len(self.handler.sent_messages) == 1 95 - response = self.handler.sent_messages[0]["content"] 96 - assert "Config file not found" in response 97 - 98 - def test_unknown_command(self) -> None: 99 - """Test unknown command handling.""" 100 - message = create_test_message("@thicket unknown") 101 - self.bot.handle_message(message, self.handler) 102 - 103 - assert len(self.handler.sent_messages) == 1 104 - response = self.handler.sent_messages[0]["content"] 105 - 
assert "Unknown command: unknown" in response 106 - 107 - def test_config_persistence(self) -> None: 108 - """Test that configuration is persisted.""" 109 - # Set some config 110 - self.bot.stream_name = "test-stream" 111 - self.bot.topic_name = "test-topic" 112 - self.bot.sync_interval = 600 113 - 114 - # Save config 115 - self.bot._save_bot_config(self.handler) 116 - 117 - # Create new bot instance 118 - new_bot = ThicketBotHandler() 119 - new_bot._load_bot_config(self.handler) 120 - 121 - # Check config was loaded 122 - assert new_bot.stream_name == "test-stream" 123 - assert new_bot.topic_name == "test-topic" 124 - assert new_bot.sync_interval == 600 125 - 126 - def test_posted_entries_persistence(self) -> None: 127 - """Test that posted entries are persisted.""" 128 - # Add some entries 129 - self.bot.posted_entries = {"user1:entry1", "user2:entry2"} 130 - 131 - # Save entries 132 - self.bot._save_posted_entries(self.handler) 133 - 134 - # Create new bot instance 135 - new_bot = ThicketBotHandler() 136 - new_bot._load_posted_entries(self.handler) 137 - 138 - # Check entries were loaded 139 - assert new_bot.posted_entries == {"user1:entry1", "user2:entry2"} 140 - 141 - def test_mention_detection(self) -> None: 142 - """Test bot mention detection.""" 143 - assert self.bot._is_mentioned("@Thicket Bot help", self.handler) 144 - assert self.bot._is_mentioned("@thicket status", self.handler) 145 - assert not self.bot._is_mentioned("regular message", self.handler) 146 - 147 - def test_mention_cleaning(self) -> None: 148 - """Test cleaning mentions from messages.""" 149 - cleaned = self.bot._clean_mention("@Thicket Bot status", self.handler) 150 - assert cleaned == "status" 151 - 152 - cleaned = self.bot._clean_mention("@thicket help", self.handler) 153 - assert cleaned == "help" 154 - 155 - def test_sync_now_uninitialized(self) -> None: 156 - """Test sync now command when not initialized.""" 157 - message = create_test_message("@thicket sync now") 158 - 
self.bot.handle_message(message, self.handler) 159 - 160 - assert len(self.handler.sent_messages) == 1 161 - response = self.handler.sent_messages[0]["content"] 162 - assert "not initialized" in response.lower() 163 - 164 - def test_debug_mode_initialization(self) -> None: 165 - """Test debug mode initialization.""" 166 - import os 167 - 168 - # Mock environment variable 169 - os.environ["THICKET_DEBUG_USER"] = "testuser" 170 - 171 - try: 172 - bot = ThicketBotHandler() 173 - # Simulate initialize call 174 - bot.debug_user = os.getenv("THICKET_DEBUG_USER") 175 - 176 - assert bot.debug_user == "testuser" 177 - assert bot.debug_zulip_user_id is None # Not validated yet 178 - finally: 179 - # Clean up 180 - if "THICKET_DEBUG_USER" in os.environ: 181 - del os.environ["THICKET_DEBUG_USER"] 182 - 183 - def test_debug_mode_status(self) -> None: 184 - """Test status command in debug mode.""" 185 - self.bot.debug_user = "testuser" 186 - self.bot.debug_zulip_user_id = "test.user" 187 - 188 - message = create_test_message("@thicket status") 189 - self.bot.handle_message(message, self.handler) 190 - 191 - assert len(self.handler.sent_messages) == 1 192 - response = self.handler.sent_messages[0]["content"] 193 - assert "**Debug Mode:** ENABLED" in response 194 - assert "**Debug User:** testuser" in response 195 - assert "**Debug Zulip ID:** test.user" in response 196 - 197 - def test_debug_mode_check_initialization(self) -> None: 198 - """Test initialization check in debug mode.""" 199 - from unittest.mock import Mock 200 - 201 - # Setup mock git store and config 202 - self.bot.git_store = Mock() 203 - self.bot.config = Mock() 204 - self.bot.debug_user = "testuser" 205 - self.bot.debug_zulip_user_id = "test.user" 206 - 207 - message = create_test_message("@thicket sync now") 208 - 209 - # Should pass with debug mode properly set up 210 - result = self.bot._check_initialization(message, self.handler) 211 - assert result is True 212 - 213 - # Should fail if debug_zulip_user_id is 
missing 214 - self.bot.debug_zulip_user_id = None 215 - result = self.bot._check_initialization(message, self.handler) 216 - assert result is False 217 - assert len(self.handler.sent_messages) == 1 218 - assert ( 219 - "Debug mode validation failed" in self.handler.sent_messages[0]["content"] 220 - ) 221 - 222 - def test_debug_mode_dm_posting(self) -> None: 223 - """Test that debug mode posts DMs instead of stream messages.""" 224 - from unittest.mock import Mock 225 - 226 - # Setup bot in debug mode 227 - self.bot.debug_user = "testuser" 228 - self.bot.debug_zulip_user_id = "test.user@example.com" 229 - self.bot.git_store = Mock() 230 - 231 - # Create a test entry 232 - entry = create_test_entry() 233 - 234 - # Mock the handler config 235 - self.handler.config_info = { 236 - "full_name": "Thicket Bot", 237 - "email": "thicket-bot@example.com", 238 - "site": "https://example.zulipchat.com", 239 - } 240 - 241 - # Mock git store user 242 - mock_user = Mock() 243 - mock_user.get_zulip_mention.return_value = "author.user" 244 - self.bot.git_store.get_user.return_value = mock_user 245 - 246 - # Post entry 247 - self.bot._post_entry_to_zulip(entry, self.handler, "testauthor") 248 - 249 - # Check that a DM was sent 250 - assert len(self.handler.sent_messages) == 1 251 - message = self.handler.sent_messages[0] 252 - 253 - # Verify it's a DM 254 - assert message["type"] == "private" 255 - assert message["to"] == ["test.user@example.com"] 256 - assert "DEBUG:" in message["content"] 257 - assert entry.title in message["content"] 258 - assert "@**author.user** posted:" in message["content"] 259 - 260 - 261 - class TestBotTester: 262 - """Test the bot testing utilities.""" 263 - 264 - def test_bot_tester_basic(self) -> None: 265 - """Test basic bot tester functionality.""" 266 - tester = BotTester() 267 - 268 - # Test help command 269 - responses = tester.send_command("help") 270 - assert len(responses) == 1 271 - assert "Thicket Feed Bot" in tester.get_last_response_content() 
272 - 273 - def test_bot_tester_config(self) -> None: 274 - """Test bot tester configuration.""" 275 - tester = BotTester() 276 - 277 - # Configure stream 278 - tester.send_command("config stream general") 279 - tester.assert_response_contains("Stream set to") 280 - 281 - # Configure topic 282 - tester.send_command("config topic test") 283 - tester.assert_response_contains("Topic set to") 284 - 285 - def test_assert_response_contains(self) -> None: 286 - """Test response assertion helper.""" 287 - tester = BotTester() 288 - 289 - # Send command 290 - tester.send_command("help") 291 - 292 - # This should pass 293 - tester.assert_response_contains("Thicket Feed Bot") 294 - 295 - # This should fail 296 - with pytest.raises(AssertionError): 297 - tester.assert_response_contains("nonexistent text")
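
Review note on the removed `test_debug_mode_initialization`: it sets `THICKET_DEBUG_USER` on `os.environ` by hand and deletes it in a `finally` block. A minimal sketch of the same setup using the standard library's `unittest.mock.patch.dict`, which restores the previous environment when the block exits. `read_debug_user` is a hypothetical stand-in for the lookup, not a function from thicket.

```python
import os
from typing import Optional
from unittest.mock import patch


def read_debug_user() -> Optional[str]:
    """Hypothetical helper: mirrors the os.getenv call made in the test."""
    return os.getenv("THICKET_DEBUG_USER")


# Remember whatever the surrounding environment already had (usually nothing).
before = read_debug_user()

# patch.dict applies the override only inside the with-block and restores
# os.environ afterwards, even if an assertion inside the block fails.
with patch.dict(os.environ, {"THICKET_DEBUG_USER": "testuser"}):
    inside = read_debug_user()

# Outside the block the variable is back to its earlier state.
after = read_debug_user()
```

With this shape the manual cleanup branch is unnecessary, and a value already present in the caller's environment is put back rather than deleted.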

-132

tests/test_feed_parser.py

··· 1 - """Tests for feed parser functionality.""" 2 - 3 - from pydantic import HttpUrl 4 - 5 - from thicket.core.feed_parser import FeedParser 6 - from thicket.models import AtomEntry, FeedMetadata 7 - 8 - 9 - class TestFeedParser: 10 - """Test the FeedParser class.""" 11 - 12 - def test_init(self): 13 - """Test parser initialization.""" 14 - parser = FeedParser() 15 - assert parser.user_agent == "thicket/0.1.0" 16 - assert "a" in parser.allowed_tags 17 - assert "href" in parser.allowed_attributes["a"] 18 - 19 - def test_parse_atom_feed(self, sample_atom_feed): 20 - """Test parsing an Atom feed.""" 21 - parser = FeedParser() 22 - metadata, entries = parser.parse_feed(sample_atom_feed) 23 - 24 - # Check metadata 25 - assert isinstance(metadata, FeedMetadata) 26 - assert metadata.title == "Test Feed" 27 - assert metadata.author_name == "Test Author" 28 - assert metadata.author_email == "author@example.com" 29 - assert metadata.link == HttpUrl("https://example.com/") 30 - 31 - # Check entries 32 - assert len(entries) == 1 33 - entry = entries[0] 34 - assert isinstance(entry, AtomEntry) 35 - assert entry.title == "Test Entry" 36 - assert entry.id == "https://example.com/entry/1" 37 - assert entry.link == HttpUrl("https://example.com/entry/1") 38 - assert entry.summary == "This is a test entry." 
39 - assert "<p>This is the content of the test entry.</p>" in entry.content 40 - 41 - def test_parse_rss_feed(self, sample_rss_feed): 42 - """Test parsing an RSS feed.""" 43 - parser = FeedParser() 44 - metadata, entries = parser.parse_feed(sample_rss_feed) 45 - 46 - # Check metadata 47 - assert isinstance(metadata, FeedMetadata) 48 - assert metadata.title == "Test RSS Feed" 49 - assert metadata.link == HttpUrl("https://example.com/") 50 - assert metadata.author_email == "editor@example.com" 51 - 52 - # Check entries 53 - assert len(entries) == 1 54 - entry = entries[0] 55 - assert isinstance(entry, AtomEntry) 56 - assert entry.title == "Test RSS Entry" 57 - assert entry.id == "https://example.com/rss/entry/1" 58 - assert entry.summary == "This is a test RSS entry." 59 - 60 - def test_sanitize_entry_id(self): 61 - """Test entry ID sanitization.""" 62 - parser = FeedParser() 63 - 64 - # Test URL ID 65 - url_id = "https://example.com/posts/2025/01/test-post" 66 - sanitized = parser.sanitize_entry_id(url_id) 67 - assert sanitized == "posts_2025_01_test-post" 68 - 69 - # Test problematic characters 70 - bad_id = "test/with\\bad:chars|and<more>" 71 - sanitized = parser.sanitize_entry_id(bad_id) 72 - assert sanitized == "test_with_bad_chars_and_more_" 73 - 74 - # Test empty ID 75 - empty_id = "" 76 - sanitized = parser.sanitize_entry_id(empty_id) 77 - assert sanitized == "entry" 78 - 79 - # Test very long ID 80 - long_id = "a" * 300 81 - sanitized = parser.sanitize_entry_id(long_id) 82 - assert len(sanitized) == 200 83 - 84 - def test_sanitize_html(self): 85 - """Test HTML sanitization.""" 86 - parser = FeedParser() 87 - 88 - # Test allowed tags 89 - safe_html = "<p>This is <strong>safe</strong> HTML</p>" 90 - sanitized = parser._sanitize_html(safe_html) 91 - assert sanitized == safe_html 92 - 93 - # Test dangerous tags 94 - dangerous_html = "<script>alert('xss')</script><p>Safe content</p>" 95 - sanitized = parser._sanitize_html(dangerous_html) 96 - assert "<script>" 
not in sanitized 97 - assert "<p>Safe content</p>" in sanitized 98 - 99 - # Test attributes 100 - html_with_attrs = '<a href="https://example.com" onclick="alert()">Link</a>' 101 - sanitized = parser._sanitize_html(html_with_attrs) 102 - assert 'href="https://example.com"' in sanitized 103 - assert "onclick" not in sanitized 104 - 105 - def test_extract_feed_metadata(self): 106 - """Test feed metadata extraction.""" 107 - parser = FeedParser() 108 - 109 - # Test with feedparser parsed data 110 - import feedparser 111 - 112 - parsed = feedparser.parse("""<?xml version="1.0" encoding="utf-8"?> 113 - <feed xmlns="http://www.w3.org/2005/Atom"> 114 - <title>Test Feed</title> 115 - <link href="https://example.com/"/> 116 - <author> 117 - <name>Test Author</name> 118 - <email>author@example.com</email> 119 - <uri>https://example.com/about</uri> 120 - </author> 121 - <logo>https://example.com/logo.png</logo> 122 - <icon>https://example.com/icon.png</icon> 123 - </feed>""") 124 - 125 - metadata = parser._extract_feed_metadata(parsed.feed) 126 - assert metadata.title == "Test Feed" 127 - assert metadata.author_name == "Test Author" 128 - assert metadata.author_email == "author@example.com" 129 - assert metadata.author_uri == HttpUrl("https://example.com/about") 130 - assert metadata.link == HttpUrl("https://example.com/") 131 - assert metadata.logo == HttpUrl("https://example.com/logo.png") 132 - assert metadata.icon == HttpUrl("https://example.com/icon.png")
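
The assertions in `test_sanitize_entry_id` pin down the observable behaviour of ID sanitization fairly tightly. A minimal sketch of a function that would satisfy those four cases is shown here; it is written as a standalone function under assumptions (the real `FeedParser.sanitize_entry_id` is not shown in this diff, and the exact set of replaced characters and the `max_length` parameter are guesses).

```python
import re
from urllib.parse import urlparse

# Assumption: characters that are unsafe in file names on common platforms.
_UNSAFE_CHARS = re.compile(r'[<>:"/\\|?*]')


def sanitize_entry_id(entry_id: str, max_length: int = 200) -> str:
    """Turn a feed entry ID into a string usable as a file name (sketch)."""
    parsed = urlparse(entry_id)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # For URL IDs keep only the path, without leading/trailing slashes.
        candidate = parsed.path.strip("/")
    else:
        candidate = entry_id

    # Replace each unsafe character with an underscore.
    safe = _UNSAFE_CHARS.sub("_", candidate)

    # Fall back to a fixed name when nothing usable remains.
    if not safe:
        return "entry"

    # Truncate so the result stays within the length limit.
    return safe[:max_length]
```

One consequence of this particular sketch is that a URL ID with an empty path, such as `https://example.com/`, also falls back to `"entry"`; whether thicket behaves that way is not established by the tests.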

-280

tests/test_git_store.py

··· 1 - """Tests for Git store functionality.""" 2 - 3 - import json 4 - from datetime import datetime 5 - 6 - from pydantic import HttpUrl 7 - 8 - from thicket.core.git_store import GitStore 9 - from thicket.models import AtomEntry, DuplicateMap, UserMetadata 10 - 11 - 12 - class TestGitStore: 13 - """Test the GitStore class.""" 14 - 15 - def test_init_new_repo(self, temp_dir): 16 - """Test initializing a new Git repository.""" 17 - repo_path = temp_dir / "test_repo" 18 - store = GitStore(repo_path) 19 - 20 - assert store.repo_path == repo_path 21 - assert store.repo is not None 22 - assert repo_path.exists() 23 - assert (repo_path / ".git").exists() 24 - assert (repo_path / "index.json").exists() 25 - assert (repo_path / "duplicates.json").exists() 26 - 27 - def test_init_existing_repo(self, temp_dir): 28 - """Test initializing with existing repository.""" 29 - repo_path = temp_dir / "test_repo" 30 - 31 - # Create first store 32 - store1 = GitStore(repo_path) 33 - store1.add_user("testuser", display_name="Test User") 34 - 35 - # Create second store pointing to same repo 36 - store2 = GitStore(repo_path) 37 - user = store2.get_user("testuser") 38 - 39 - assert user is not None 40 - assert user.username == "testuser" 41 - assert user.display_name == "Test User" 42 - 43 - def test_add_user(self, temp_dir): 44 - """Test adding a user to the Git store.""" 45 - store = GitStore(temp_dir / "test_repo") 46 - 47 - user = store.add_user( 48 - username="testuser", 49 - display_name="Test User", 50 - email="test@example.com", 51 - homepage="https://example.com", 52 - icon="https://example.com/icon.png", 53 - feeds=["https://example.com/feed.xml"], 54 - ) 55 - 56 - assert isinstance(user, UserMetadata) 57 - assert user.username == "testuser" 58 - assert user.display_name == "Test User" 59 - assert user.email == "test@example.com" 60 - assert user.homepage == "https://example.com" 61 - assert user.icon == "https://example.com/icon.png" 62 - assert user.feeds == 
["https://example.com/feed.xml"] 63 - assert user.directory == "testuser" 64 - 65 - # Check that user directory was created 66 - user_dir = store.repo_path / "testuser" 67 - assert user_dir.exists() 68 - 69 - # Check user exists in index 70 - stored_user = store.get_user("testuser") 71 - assert stored_user is not None 72 - assert stored_user.username == "testuser" 73 - assert stored_user.display_name == "Test User" 74 - 75 - def test_get_user(self, temp_dir): 76 - """Test getting user metadata.""" 77 - store = GitStore(temp_dir / "test_repo") 78 - 79 - # Add user 80 - store.add_user("testuser", display_name="Test User") 81 - 82 - # Get user 83 - user = store.get_user("testuser") 84 - assert user is not None 85 - assert user.username == "testuser" 86 - assert user.display_name == "Test User" 87 - 88 - # Try to get non-existent user 89 - non_user = store.get_user("nonexistent") 90 - assert non_user is None 91 - 92 - def test_store_entry(self, temp_dir): 93 - """Test storing an entry.""" 94 - store = GitStore(temp_dir / "test_repo") 95 - 96 - # Add user first 97 - store.add_user("testuser") 98 - 99 - # Create test entry 100 - entry = AtomEntry( 101 - id="https://example.com/entry/1", 102 - title="Test Entry", 103 - link=HttpUrl("https://example.com/entry/1"), 104 - updated=datetime.now(), 105 - summary="Test entry summary", 106 - content="<p>Test content</p>", 107 - ) 108 - 109 - # Store entry 110 - result = store.store_entry("testuser", entry) 111 - assert result is True 112 - 113 - # Check that entry file was created 114 - user_dir = store.repo_path / "testuser" 115 - entry_files = list(user_dir.glob("*.json")) 116 - entry_files = [f for f in entry_files if f.name != "metadata.json"] 117 - assert len(entry_files) == 1 118 - 119 - # Check entry content 120 - with open(entry_files[0]) as f: 121 - stored_entry = json.load(f) 122 - assert stored_entry["title"] == "Test Entry" 123 - assert stored_entry["id"] == "https://example.com/entry/1" 124 - 125 - def 
test_get_entry(self, temp_dir): 126 - """Test retrieving an entry.""" 127 - store = GitStore(temp_dir / "test_repo") 128 - 129 - # Add user and entry 130 - store.add_user("testuser") 131 - entry = AtomEntry( 132 - id="https://example.com/entry/1", 133 - title="Test Entry", 134 - link=HttpUrl("https://example.com/entry/1"), 135 - updated=datetime.now(), 136 - ) 137 - store.store_entry("testuser", entry) 138 - 139 - # Get entry 140 - retrieved = store.get_entry("testuser", "https://example.com/entry/1") 141 - assert retrieved is not None 142 - assert retrieved.title == "Test Entry" 143 - assert retrieved.id == "https://example.com/entry/1" 144 - 145 - # Try to get non-existent entry 146 - non_entry = store.get_entry("testuser", "https://example.com/nonexistent") 147 - assert non_entry is None 148 - 149 - def test_list_entries(self, temp_dir): 150 - """Test listing entries for a user.""" 151 - store = GitStore(temp_dir / "test_repo") 152 - 153 - # Add user 154 - store.add_user("testuser") 155 - 156 - # Add multiple entries 157 - for i in range(3): 158 - entry = AtomEntry( 159 - id=f"https://example.com/entry/{i}", 160 - title=f"Test Entry {i}", 161 - link=HttpUrl(f"https://example.com/entry/{i}"), 162 - updated=datetime.now(), 163 - ) 164 - store.store_entry("testuser", entry) 165 - 166 - # List all entries 167 - entries = store.list_entries("testuser") 168 - assert len(entries) == 3 169 - 170 - # List with limit 171 - limited = store.list_entries("testuser", limit=2) 172 - assert len(limited) == 2 173 - 174 - # List for non-existent user 175 - none_entries = store.list_entries("nonexistent") 176 - assert len(none_entries) == 0 177 - 178 - def test_duplicates(self, temp_dir): 179 - """Test duplicate management.""" 180 - store = GitStore(temp_dir / "test_repo") 181 - 182 - # Get initial duplicates (should be empty) 183 - duplicates = store.get_duplicates() 184 - assert isinstance(duplicates, DuplicateMap) 185 - assert len(duplicates.duplicates) == 0 186 - 187 - # Add 
duplicate 188 - store.add_duplicate("https://example.com/dup", "https://example.com/canonical") 189 - 190 - # Check duplicate was added 191 - duplicates = store.get_duplicates() 192 - assert len(duplicates.duplicates) == 1 193 - assert duplicates.is_duplicate("https://example.com/dup") 194 - assert ( 195 - duplicates.get_canonical("https://example.com/dup") 196 - == "https://example.com/canonical" 197 - ) 198 - 199 - # Remove duplicate 200 - result = store.remove_duplicate("https://example.com/dup") 201 - assert result is True 202 - 203 - # Check duplicate was removed 204 - duplicates = store.get_duplicates() 205 - assert len(duplicates.duplicates) == 0 206 - assert not duplicates.is_duplicate("https://example.com/dup") 207 - 208 - def test_search_entries(self, temp_dir): 209 - """Test searching entries.""" 210 - store = GitStore(temp_dir / "test_repo") 211 - 212 - # Add user 213 - store.add_user("testuser") 214 - 215 - # Add entries with different content 216 - entries_data = [ 217 - ("Test Python Programming", "Learning Python basics"), 218 - ("JavaScript Tutorial", "Advanced JavaScript concepts"), 219 - ("Python Web Development", "Building web apps with Python"), 220 - ] 221 - 222 - for title, summary in entries_data: 223 - entry = AtomEntry( 224 - id=f"https://example.com/entry/{title.lower().replace(' ', '-')}", 225 - title=title, 226 - link=HttpUrl( 227 - f"https://example.com/entry/{title.lower().replace(' ', '-')}" 228 - ), 229 - updated=datetime.now(), 230 - summary=summary, 231 - ) 232 - store.store_entry("testuser", entry) 233 - 234 - # Search for Python entries 235 - results = store.search_entries("Python") 236 - assert len(results) == 2 237 - 238 - # Search for specific user 239 - results = store.search_entries("Python", username="testuser") 240 - assert len(results) == 2 241 - 242 - # Search with limit 243 - results = store.search_entries("Python", limit=1) 244 - assert len(results) == 1 245 - 246 - # Search for non-existent term 247 - results = 
store.search_entries("NonExistent") 248 - assert len(results) == 0 249 - 250 - def test_get_stats(self, temp_dir): 251 - """Test getting repository statistics.""" 252 - store = GitStore(temp_dir / "test_repo") 253 - 254 - # Get initial stats 255 - stats = store.get_stats() 256 - assert stats["total_users"] == 0 257 - assert stats["total_entries"] == 0 258 - assert stats["total_duplicates"] == 0 259 - 260 - # Add user and entries 261 - store.add_user("testuser") 262 - for i in range(3): 263 - entry = AtomEntry( 264 - id=f"https://example.com/entry/{i}", 265 - title=f"Test Entry {i}", 266 - link=HttpUrl(f"https://example.com/entry/{i}"), 267 - updated=datetime.now(), 268 - ) 269 - store.store_entry("testuser", entry) 270 - 271 - # Add duplicate 272 - store.add_duplicate("https://example.com/dup", "https://example.com/canonical") 273 - 274 - # Get updated stats 275 - stats = store.get_stats() 276 - assert stats["total_users"] == 1 277 - assert stats["total_entries"] == 3 278 - assert stats["total_duplicates"] == 1 279 - assert "last_updated" in stats 280 - assert "repository_size" in stats
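
`test_duplicates` exercises a small mapping from duplicate entry IDs to canonical IDs. A minimal sketch of that data structure follows, assuming a plain dictionary underneath. It uses a standard-library dataclass rather than the pydantic model thicket uses, and the class name `DuplicateMapSketch` is hypothetical; the fallback in `get_canonical` follows what the removed model tests in `tests/test_models.py` expect.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DuplicateMapSketch:
    """Maps duplicate entry IDs to their canonical entry ID (sketch)."""

    duplicates: Dict[str, str] = field(default_factory=dict)

    def add_duplicate(self, duplicate_id: str, canonical_id: str) -> None:
        # Record (or overwrite) the canonical ID for a duplicate.
        self.duplicates[duplicate_id] = canonical_id

    def remove_duplicate(self, duplicate_id: str) -> bool:
        # True if a mapping existed and was removed, False otherwise.
        return self.duplicates.pop(duplicate_id, None) is not None

    def is_duplicate(self, entry_id: str) -> bool:
        return entry_id in self.duplicates

    def get_canonical(self, entry_id: str) -> str:
        # IDs without a mapping are treated as their own canonical ID.
        return self.duplicates.get(entry_id, entry_id)


# Usage mirroring the add / look up / remove sequence in the test.
dups = DuplicateMapSketch()
dups.add_duplicate("https://example.com/dup", "https://example.com/canonical")
canonical = dups.get_canonical("https://example.com/dup")
removed = dups.remove_duplicate("https://example.com/dup")
```

Keeping the mapping one-directional (duplicate to canonical) makes lookups a single dictionary access; finding all duplicates of a canonical ID would need a scan over the values.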

-436

tests/test_models.py

··· 1 - """Tests for pydantic models.""" 2 - 3 - from datetime import datetime 4 - 5 - import pytest 6 - from pydantic import HttpUrl, ValidationError 7 - 8 - from thicket.models import ( 9 - AtomEntry, 10 - DuplicateMap, 11 - FeedMetadata, 12 - ThicketConfig, 13 - UserConfig, 14 - UserMetadata, 15 - ZulipAssociation, 16 - ) 17 - 18 - 19 - class TestUserConfig: 20 - """Test UserConfig model.""" 21 - 22 - def test_valid_user_config(self): 23 - """Test creating valid user config.""" 24 - config = UserConfig( 25 - username="testuser", 26 - feeds=["https://example.com/feed.xml"], 27 - email="test@example.com", 28 - homepage="https://example.com", 29 - display_name="Test User", 30 - ) 31 - 32 - assert config.username == "testuser" 33 - assert len(config.feeds) == 1 34 - assert config.feeds[0] == HttpUrl("https://example.com/feed.xml") 35 - assert config.email == "test@example.com" 36 - assert config.display_name == "Test User" 37 - 38 - def test_invalid_email(self): 39 - """Test validation of invalid email.""" 40 - with pytest.raises(ValidationError): 41 - UserConfig( 42 - username="testuser", 43 - feeds=["https://example.com/feed.xml"], 44 - email="invalid-email", 45 - ) 46 - 47 - def test_invalid_feed_url(self): 48 - """Test validation of invalid feed URL.""" 49 - with pytest.raises(ValidationError): 50 - UserConfig( 51 - username="testuser", 52 - feeds=["not-a-url"], 53 - ) 54 - 55 - def test_optional_fields(self): 56 - """Test optional fields with None values.""" 57 - config = UserConfig( 58 - username="testuser", 59 - feeds=["https://example.com/feed.xml"], 60 - ) 61 - 62 - assert config.email is None 63 - assert config.homepage is None 64 - assert config.icon is None 65 - assert config.display_name is None 66 - 67 - 68 - class TestThicketConfig: 69 - """Test ThicketConfig model.""" 70 - 71 - def test_valid_config(self, temp_dir): 72 - """Test creating valid configuration.""" 73 - config = ThicketConfig( 74 - git_store=temp_dir / "git_store", 75 - 
cache_dir=temp_dir / "cache", 76 - users=[ 77 - UserConfig( 78 - username="testuser", 79 - feeds=["https://example.com/feed.xml"], 80 - ) 81 - ], 82 - ) 83 - 84 - assert config.git_store == temp_dir / "git_store" 85 - assert config.cache_dir == temp_dir / "cache" 86 - assert len(config.users) == 1 87 - assert config.users[0].username == "testuser" 88 - 89 - def test_find_user(self, temp_dir): 90 - """Test finding user by username.""" 91 - config = ThicketConfig( 92 - git_store=temp_dir / "git_store", 93 - cache_dir=temp_dir / "cache", 94 - users=[ 95 - UserConfig(username="user1", feeds=["https://example.com/feed1.xml"]), 96 - UserConfig(username="user2", feeds=["https://example.com/feed2.xml"]), 97 - ], 98 - ) 99 - 100 - user = config.find_user("user1") 101 - assert user is not None 102 - assert user.username == "user1" 103 - 104 - non_user = config.find_user("nonexistent") 105 - assert non_user is None 106 - 107 - def test_add_user(self, temp_dir): 108 - """Test adding a new user.""" 109 - config = ThicketConfig( 110 - git_store=temp_dir / "git_store", 111 - cache_dir=temp_dir / "cache", 112 - users=[], 113 - ) 114 - 115 - new_user = UserConfig( 116 - username="newuser", 117 - feeds=["https://example.com/feed.xml"], 118 - ) 119 - 120 - config.add_user(new_user) 121 - assert len(config.users) == 1 122 - assert config.users[0].username == "newuser" 123 - 124 - def test_add_feed_to_user(self, temp_dir): 125 - """Test adding feed to existing user.""" 126 - config = ThicketConfig( 127 - git_store=temp_dir / "git_store", 128 - cache_dir=temp_dir / "cache", 129 - users=[ 130 - UserConfig( 131 - username="testuser", feeds=["https://example.com/feed1.xml"] 132 - ), 133 - ], 134 - ) 135 - 136 - result = config.add_feed_to_user( 137 - "testuser", HttpUrl("https://example.com/feed2.xml") 138 - ) 139 - assert result is True 140 - 141 - user = config.find_user("testuser") 142 - assert len(user.feeds) == 2 143 - assert HttpUrl("https://example.com/feed2.xml") in user.feeds 144 
- 145 - # Test adding to non-existent user 146 - result = config.add_feed_to_user( 147 - "nonexistent", HttpUrl("https://example.com/feed.xml") 148 - ) 149 - assert result is False 150 - 151 - 152 - class TestAtomEntry: 153 - """Test AtomEntry model.""" 154 - 155 - def test_valid_entry(self): 156 - """Test creating valid Atom entry.""" 157 - entry = AtomEntry( 158 - id="https://example.com/entry/1", 159 - title="Test Entry", 160 - link=HttpUrl("https://example.com/entry/1"), 161 - updated=datetime.now(), 162 - published=datetime.now(), 163 - summary="Test summary", 164 - content="<p>Test content</p>", 165 - content_type="html", 166 - author={"name": "Test Author"}, 167 - categories=["test", "example"], 168 - ) 169 - 170 - assert entry.id == "https://example.com/entry/1" 171 - assert entry.title == "Test Entry" 172 - assert entry.summary == "Test summary" 173 - assert entry.content == "<p>Test content</p>" 174 - assert entry.content_type == "html" 175 - assert entry.author["name"] == "Test Author" 176 - assert "test" in entry.categories 177 - 178 - def test_minimal_entry(self): 179 - """Test creating minimal Atom entry.""" 180 - entry = AtomEntry( 181 - id="https://example.com/entry/1", 182 - title="Test Entry", 183 - link=HttpUrl("https://example.com/entry/1"), 184 - updated=datetime.now(), 185 - ) 186 - 187 - assert entry.id == "https://example.com/entry/1" 188 - assert entry.title == "Test Entry" 189 - assert entry.published is None 190 - assert entry.summary is None 191 - assert entry.content is None 192 - assert entry.content_type == "html" # default 193 - assert entry.author is None 194 - assert entry.categories == [] 195 - 196 - 197 - class TestDuplicateMap: 198 - """Test DuplicateMap model.""" 199 - 200 - def test_empty_duplicates(self): 201 - """Test empty duplicate map.""" 202 - dup_map = DuplicateMap() 203 - assert len(dup_map.duplicates) == 0 204 - assert not dup_map.is_duplicate("test") 205 - assert dup_map.get_canonical("test") == "test" 206 - 207 - 
def test_add_duplicate(self):
-        """Test adding duplicate mapping."""
-        dup_map = DuplicateMap()
-        dup_map.add_duplicate("dup1", "canonical1")
-
-        assert len(dup_map.duplicates) == 1
-        assert dup_map.is_duplicate("dup1")
-        assert dup_map.get_canonical("dup1") == "canonical1"
-        assert dup_map.get_canonical("canonical1") == "canonical1"
-
-    def test_remove_duplicate(self):
-        """Test removing duplicate mapping."""
-        dup_map = DuplicateMap()
-        dup_map.add_duplicate("dup1", "canonical1")
-
-        result = dup_map.remove_duplicate("dup1")
-        assert result is True
-        assert len(dup_map.duplicates) == 0
-        assert not dup_map.is_duplicate("dup1")
-
-        # Test removing non-existent duplicate
-        result = dup_map.remove_duplicate("nonexistent")
-        assert result is False
-
-    def test_get_duplicates_for_canonical(self):
-        """Test getting all duplicates for a canonical ID."""
-        dup_map = DuplicateMap()
-        dup_map.add_duplicate("dup1", "canonical1")
-        dup_map.add_duplicate("dup2", "canonical1")
-        dup_map.add_duplicate("dup3", "canonical2")
-
-        dups = dup_map.get_duplicates_for_canonical("canonical1")
-        assert len(dups) == 2
-        assert "dup1" in dups
-        assert "dup2" in dups
-
-        dups = dup_map.get_duplicates_for_canonical("canonical2")
-        assert len(dups) == 1
-        assert "dup3" in dups
-
-        dups = dup_map.get_duplicates_for_canonical("nonexistent")
-        assert len(dups) == 0
-
-
-class TestFeedMetadata:
-    """Test FeedMetadata model."""
-
-    def test_valid_metadata(self):
-        """Test creating valid feed metadata."""
-        metadata = FeedMetadata(
-            title="Test Feed",
-            author_name="Test Author",
-            author_email="author@example.com",
-            author_uri=HttpUrl("https://example.com/author"),
-            link=HttpUrl("https://example.com"),
-            description="Test description",
-        )
-
-        assert metadata.title == "Test Feed"
-        assert metadata.author_name == "Test Author"
-        assert metadata.author_email == "author@example.com"
-        assert metadata.link == HttpUrl("https://example.com")
-
-    def test_to_user_config(self):
-        """Test converting metadata to user config."""
-        metadata = FeedMetadata(
-            title="Test Feed",
-            author_name="Test Author",
-            author_email="author@example.com",
-            author_uri=HttpUrl("https://example.com/author"),
-            link=HttpUrl("https://example.com"),
-            logo=HttpUrl("https://example.com/logo.png"),
-        )
-
-        feed_url = HttpUrl("https://example.com/feed.xml")
-        user_config = metadata.to_user_config("testuser", feed_url)
-
-        assert user_config.username == "testuser"
-        assert user_config.feeds == [feed_url]
-        assert user_config.display_name == "Test Author"
-        assert user_config.email == "author@example.com"
-        assert user_config.homepage == HttpUrl("https://example.com/author")
-        assert user_config.icon == HttpUrl("https://example.com/logo.png")
-
-    def test_to_user_config_fallbacks(self):
-        """Test fallback logic in to_user_config."""
-        metadata = FeedMetadata(
-            title="Test Feed",
-            link=HttpUrl("https://example.com"),
-            icon=HttpUrl("https://example.com/icon.png"),
-        )
-
-        feed_url = HttpUrl("https://example.com/feed.xml")
-        user_config = metadata.to_user_config("testuser", feed_url)
-
-        assert user_config.display_name == "Test Feed"  # Falls back to title
-        assert user_config.homepage == HttpUrl(
-            "https://example.com"
-        )  # Falls back to link
-        assert user_config.icon == HttpUrl("https://example.com/icon.png")
-        assert user_config.email is None
-
-
-class TestUserMetadata:
-    """Test UserMetadata model."""
-
-    def test_valid_metadata(self):
-        """Test creating valid user metadata."""
-        now = datetime.now()
-        metadata = UserMetadata(
-            username="testuser",
-            directory="testuser",
-            created=now,
-            last_updated=now,
-            feeds=["https://example.com/feed.xml"],
-            entry_count=5,
-        )
-
-        assert metadata.username == "testuser"
-        assert metadata.directory == "testuser"
-        assert metadata.entry_count == 5
-        assert len(metadata.feeds) == 1
-
-    def test_update_timestamp(self):
-        """Test updating timestamp."""
-        now = datetime.now()
-        metadata = UserMetadata(
-            username="testuser",
-            directory="testuser",
-            created=now,
-            last_updated=now,
-        )
-
-        original_time = metadata.last_updated
-        metadata.update_timestamp()
-
-        assert metadata.last_updated > original_time
-
-    def test_increment_entry_count(self):
-        """Test incrementing entry count."""
-        metadata = UserMetadata(
-            username="testuser",
-            directory="testuser",
-            created=datetime.now(),
-            last_updated=datetime.now(),
-            entry_count=5,
-        )
-
-        original_count = metadata.entry_count
-        original_time = metadata.last_updated
-
-        metadata.increment_entry_count(3)
-
-        assert metadata.entry_count == original_count + 3
-        assert metadata.last_updated > original_time
-
-    def test_zulip_associations(self):
-        """Test Zulip association methods."""
-        metadata = UserMetadata(
-            username="testuser",
-            directory="testuser",
-            created=datetime.now(),
-            last_updated=datetime.now(),
-        )
-
-        # Test adding association
-        result = metadata.add_zulip_association("example.zulipchat.com", "alice")
-        assert result is True
-        assert len(metadata.zulip_associations) == 1
-        assert metadata.zulip_associations[0].server == "example.zulipchat.com"
-        assert metadata.zulip_associations[0].user_id == "alice"
-
-        # Test adding duplicate association
-        result = metadata.add_zulip_association("example.zulipchat.com", "alice")
-        assert result is False
-        assert len(metadata.zulip_associations) == 1
-
-        # Test adding different association
-        result = metadata.add_zulip_association("other.zulipchat.com", "alice")
-        assert result is True
-        assert len(metadata.zulip_associations) == 2
-
-        # Test get_zulip_mention
-        mention = metadata.get_zulip_mention("example.zulipchat.com")
-        assert mention == "alice"
-
-        mention = metadata.get_zulip_mention("other.zulipchat.com")
-        assert mention == "alice"
-
-        mention = metadata.get_zulip_mention("nonexistent.zulipchat.com")
-        assert mention is None
-
-        # Test removing association
-        result = metadata.remove_zulip_association("example.zulipchat.com", "alice")
-        assert result is True
-        assert len(metadata.zulip_associations) == 1
-
-        # Test removing non-existent association
-        result = metadata.remove_zulip_association("example.zulipchat.com", "alice")
-        assert result is False
-        assert len(metadata.zulip_associations) == 1
-
-
-class TestZulipAssociation:
-    """Test ZulipAssociation model."""
-
-    def test_valid_association(self):
-        """Test creating valid Zulip association."""
-        assoc = ZulipAssociation(
-            server="example.zulipchat.com", user_id="alice@example.com"
-        )
-
-        assert assoc.server == "example.zulipchat.com"
-        assert assoc.user_id == "alice@example.com"
-
-    def test_association_hash(self):
-        """Test that associations are hashable."""
-        assoc1 = ZulipAssociation(server="example.zulipchat.com", user_id="alice")
-        assoc2 = ZulipAssociation(server="example.zulipchat.com", user_id="alice")
-        assoc3 = ZulipAssociation(server="other.zulipchat.com", user_id="alice")
-
-        # Same associations should have same hash
-        assert hash(assoc1) == hash(assoc2)
-
-        # Different associations should have different hash
-        assert hash(assoc1) != hash(assoc3)
-
-        # Can be used in sets
-        assoc_set = {assoc1, assoc2, assoc3}
-        assert len(assoc_set) == 2  # assoc1 and assoc2 are considered the same
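
Review note: the removed `DuplicateMap` tests above pin down a small contract: `get_canonical` returns the canonical ID for a known duplicate and the ID itself otherwise, `remove_duplicate` reports whether anything was removed, and `get_duplicates_for_canonical` returns an empty collection for an unknown ID. A minimal sketch that satisfies those assertions, assuming a plain dict from duplicate ID to canonical ID. `DuplicateMapSketch` is a hypothetical name used only for illustration; this is not Thicket's actual implementation, which may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DuplicateMapSketch:
    """Hypothetical stand-in for DuplicateMap (assumption, not Thicket's class)."""

    # Maps duplicate entry ID -> canonical entry ID.
    duplicates: Dict[str, str] = field(default_factory=dict)

    def add_duplicate(self, duplicate_id: str, canonical_id: str) -> None:
        """Record that duplicate_id refers to canonical_id."""
        self.duplicates[duplicate_id] = canonical_id

    def remove_duplicate(self, duplicate_id: str) -> bool:
        """Remove a mapping; return True only if it existed."""
        return self.duplicates.pop(duplicate_id, None) is not None

    def is_duplicate(self, entry_id: str) -> bool:
        """Return True if entry_id is recorded as a duplicate."""
        return entry_id in self.duplicates

    def get_canonical(self, entry_id: str) -> str:
        """Return the canonical ID, or entry_id itself if it is not a duplicate."""
        return self.duplicates.get(entry_id, entry_id)

    def get_duplicates_for_canonical(self, canonical_id: str) -> List[str]:
        """Return every duplicate ID that maps to canonical_id."""
        return [dup for dup, canon in self.duplicates.items() if canon == canonical_id]


dup_map = DuplicateMapSketch()
dup_map.add_duplicate("dup1", "canonical1")
dup_map.add_duplicate("dup2", "canonical1")
print(dup_map.get_canonical("dup1"))
print(dup_map.get_duplicates_for_canonical("canonical1"))
```

Falling back to the input ID in `get_canonical` lets callers resolve any entry ID without first checking `is_duplicate`, which is the behaviour the removed `test_add_duplicate` asserts for `"canonical1"`.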

+73 -335

uv.lock

··· 1 1 version = 1 2 - revision = 3 2 + revision = 2 3 3 requires-python = ">=3.9" 4 4 resolution-markers = [ 5 5 "python_full_version >= '3.10'", ··· 28 28 sdist = { url = "https://files.pythonhosted.org/packages/95/7d/4c1bd541d4dffa1b52bd83fb8527089e097a106fc90b467a7313b105f840/anyio-4.9.0.tar.gz", hash = "sha256:673c0c244e15788651a4ff38710fea9675823028a6f08a5eda409e0c9840a028", size = 190949, upload-time = "2025-03-17T00:02:54.77Z" } 29 29 wheels = [ 30 30 { url = "https://files.pythonhosted.org/packages/a1/ee/48ca1a7c89ffec8b6a0c5d02b89c305671d5ffd8d3c94acf8b8c408575bb/anyio-4.9.0-py3-none-any.whl", hash = "sha256:9f76d541cad6e36af7beb62e978876f3b41e3e04f2c1fbf0884604c0a9c4d93c", size = 100916, upload-time = "2025-03-17T00:02:52.713Z" }, 31 - ] 32 - 33 - [[package]] 34 - name = "beautifulsoup4" 35 - version = "4.13.4" 36 - source = { registry = "https://pypi.org/simple" } 37 - dependencies = [ 38 - { name = "soupsieve" }, 39 - { name = "typing-extensions" }, 40 - ] 41 - sdist = { url = "https://files.pythonhosted.org/packages/d8/e4/0c4c39e18fd76d6a628d4dd8da40543d136ce2d1752bd6eeeab0791f4d6b/beautifulsoup4-4.13.4.tar.gz", hash = "sha256:dbb3c4e1ceae6aefebdaf2423247260cd062430a410e38c66f2baa50a8437195", size = 621067, upload-time = "2025-04-15T17:05:13.836Z" } 42 - wheels = [ 43 - { url = "https://files.pythonhosted.org/packages/50/cd/30110dc0ffcf3b131156077b90e9f60ed75711223f306da4db08eff8403b/beautifulsoup4-4.13.4-py3-none-any.whl", hash = "sha256:9bbbb14bfde9d79f38b8cd5f8c7c85f4b8f2523190ebed90e950a8dea4cb1c4b", size = 187285, upload-time = "2025-04-15T17:05:12.221Z" }, 44 31 ] 45 32 46 33 [[package]] ··· 104 91 ] 105 92 106 93 [[package]] 107 - name = "charset-normalizer" 108 - version = "3.4.3" 109 - source = { registry = "https://pypi.org/simple" } 110 - sdist = { url = "https://files.pythonhosted.org/packages/83/2d/5fd176ceb9b2fc619e63405525573493ca23441330fcdaee6bef9460e924/charset_normalizer-3.4.3.tar.gz", hash = 
"sha256:6fce4b8500244f6fcb71465d4a4930d132ba9ab8e71a7859e6a5d59851068d14", size = 122371, upload-time = "2025-08-09T07:57:28.46Z" } 111 - wheels = [ 112 - { url = "https://files.pythonhosted.org/packages/d6/98/f3b8013223728a99b908c9344da3aa04ee6e3fa235f19409033eda92fb78/charset_normalizer-3.4.3-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:fb7f67a1bfa6e40b438170ebdc8158b78dc465a5a67b6dde178a46987b244a72", size = 207695, upload-time = "2025-08-09T07:55:36.452Z" }, 113 - { url = "https://files.pythonhosted.org/packages/21/40/5188be1e3118c82dcb7c2a5ba101b783822cfb413a0268ed3be0468532de/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:cc9370a2da1ac13f0153780040f465839e6cccb4a1e44810124b4e22483c93fe", size = 147153, upload-time = "2025-08-09T07:55:38.467Z" }, 114 - { url = "https://files.pythonhosted.org/packages/37/60/5d0d74bc1e1380f0b72c327948d9c2aca14b46a9efd87604e724260f384c/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:07a0eae9e2787b586e129fdcbe1af6997f8d0e5abaa0bc98c0e20e124d67e601", size = 160428, upload-time = "2025-08-09T07:55:40.072Z" }, 115 - { url = "https://files.pythonhosted.org/packages/85/9a/d891f63722d9158688de58d050c59dc3da560ea7f04f4c53e769de5140f5/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:74d77e25adda8581ffc1c720f1c81ca082921329452eba58b16233ab1842141c", size = 157627, upload-time = "2025-08-09T07:55:41.706Z" }, 116 - { url = "https://files.pythonhosted.org/packages/65/1a/7425c952944a6521a9cfa7e675343f83fd82085b8af2b1373a2409c683dc/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d0e909868420b7049dafd3a31d45125b31143eec59235311fc4c57ea26a4acd2", size = 152388, upload-time = "2025-08-09T07:55:43.262Z" }, 117 - { url = 
"https://files.pythonhosted.org/packages/f0/c9/a2c9c2a355a8594ce2446085e2ec97fd44d323c684ff32042e2a6b718e1d/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:c6f162aabe9a91a309510d74eeb6507fab5fff92337a15acbe77753d88d9dcf0", size = 150077, upload-time = "2025-08-09T07:55:44.903Z" }, 118 - { url = "https://files.pythonhosted.org/packages/3b/38/20a1f44e4851aa1c9105d6e7110c9d020e093dfa5836d712a5f074a12bf7/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:4ca4c094de7771a98d7fbd67d9e5dbf1eb73efa4f744a730437d8a3a5cf994f0", size = 161631, upload-time = "2025-08-09T07:55:46.346Z" }, 119 - { url = "https://files.pythonhosted.org/packages/a4/fa/384d2c0f57edad03d7bec3ebefb462090d8905b4ff5a2d2525f3bb711fac/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:02425242e96bcf29a49711b0ca9f37e451da7c70562bc10e8ed992a5a7a25cc0", size = 159210, upload-time = "2025-08-09T07:55:47.539Z" }, 120 - { url = "https://files.pythonhosted.org/packages/33/9e/eca49d35867ca2db336b6ca27617deed4653b97ebf45dfc21311ce473c37/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:78deba4d8f9590fe4dae384aeff04082510a709957e968753ff3c48399f6f92a", size = 153739, upload-time = "2025-08-09T07:55:48.744Z" }, 121 - { url = "https://files.pythonhosted.org/packages/2a/91/26c3036e62dfe8de8061182d33be5025e2424002125c9500faff74a6735e/charset_normalizer-3.4.3-cp310-cp310-win32.whl", hash = "sha256:d79c198e27580c8e958906f803e63cddb77653731be08851c7df0b1a14a8fc0f", size = 99825, upload-time = "2025-08-09T07:55:50.305Z" }, 122 - { url = "https://files.pythonhosted.org/packages/e2/c6/f05db471f81af1fa01839d44ae2a8bfeec8d2a8b4590f16c4e7393afd323/charset_normalizer-3.4.3-cp310-cp310-win_amd64.whl", hash = "sha256:c6e490913a46fa054e03699c70019ab869e990270597018cef1d8562132c2669", size = 107452, upload-time = "2025-08-09T07:55:51.461Z" }, 123 - { url = 
"https://files.pythonhosted.org/packages/7f/b5/991245018615474a60965a7c9cd2b4efbaabd16d582a5547c47ee1c7730b/charset_normalizer-3.4.3-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:b256ee2e749283ef3ddcff51a675ff43798d92d746d1a6e4631bf8c707d22d0b", size = 204483, upload-time = "2025-08-09T07:55:53.12Z" }, 124 - { url = "https://files.pythonhosted.org/packages/c7/2a/ae245c41c06299ec18262825c1569c5d3298fc920e4ddf56ab011b417efd/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:13faeacfe61784e2559e690fc53fa4c5ae97c6fcedb8eb6fb8d0a15b475d2c64", size = 145520, upload-time = "2025-08-09T07:55:54.712Z" }, 125 - { url = "https://files.pythonhosted.org/packages/3a/a4/b3b6c76e7a635748c4421d2b92c7b8f90a432f98bda5082049af37ffc8e3/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:00237675befef519d9af72169d8604a067d92755e84fe76492fef5441db05b91", size = 158876, upload-time = "2025-08-09T07:55:56.024Z" }, 126 - { url = "https://files.pythonhosted.org/packages/e2/e6/63bb0e10f90a8243c5def74b5b105b3bbbfb3e7bb753915fe333fb0c11ea/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:585f3b2a80fbd26b048a0be90c5aae8f06605d3c92615911c3a2b03a8a3b796f", size = 156083, upload-time = "2025-08-09T07:55:57.582Z" }, 127 - { url = "https://files.pythonhosted.org/packages/87/df/b7737ff046c974b183ea9aa111b74185ac8c3a326c6262d413bd5a1b8c69/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0e78314bdc32fa80696f72fa16dc61168fda4d6a0c014e0380f9d02f0e5d8a07", size = 150295, upload-time = "2025-08-09T07:55:59.147Z" }, 128 - { url = "https://files.pythonhosted.org/packages/61/f1/190d9977e0084d3f1dc169acd060d479bbbc71b90bf3e7bf7b9927dec3eb/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = 
"sha256:96b2b3d1a83ad55310de8c7b4a2d04d9277d5591f40761274856635acc5fcb30", size = 148379, upload-time = "2025-08-09T07:56:00.364Z" }, 129 - { url = "https://files.pythonhosted.org/packages/4c/92/27dbe365d34c68cfe0ca76f1edd70e8705d82b378cb54ebbaeabc2e3029d/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:939578d9d8fd4299220161fdd76e86c6a251987476f5243e8864a7844476ba14", size = 160018, upload-time = "2025-08-09T07:56:01.678Z" }, 130 - { url = "https://files.pythonhosted.org/packages/99/04/baae2a1ea1893a01635d475b9261c889a18fd48393634b6270827869fa34/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:fd10de089bcdcd1be95a2f73dbe6254798ec1bda9f450d5828c96f93e2536b9c", size = 157430, upload-time = "2025-08-09T07:56:02.87Z" }, 131 - { url = "https://files.pythonhosted.org/packages/2f/36/77da9c6a328c54d17b960c89eccacfab8271fdaaa228305330915b88afa9/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:1e8ac75d72fa3775e0b7cb7e4629cec13b7514d928d15ef8ea06bca03ef01cae", size = 151600, upload-time = "2025-08-09T07:56:04.089Z" }, 132 - { url = "https://files.pythonhosted.org/packages/64/d4/9eb4ff2c167edbbf08cdd28e19078bf195762e9bd63371689cab5ecd3d0d/charset_normalizer-3.4.3-cp311-cp311-win32.whl", hash = "sha256:6cf8fd4c04756b6b60146d98cd8a77d0cdae0e1ca20329da2ac85eed779b6849", size = 99616, upload-time = "2025-08-09T07:56:05.658Z" }, 133 - { url = "https://files.pythonhosted.org/packages/f4/9c/996a4a028222e7761a96634d1820de8a744ff4327a00ada9c8942033089b/charset_normalizer-3.4.3-cp311-cp311-win_amd64.whl", hash = "sha256:31a9a6f775f9bcd865d88ee350f0ffb0e25936a7f930ca98995c05abf1faf21c", size = 107108, upload-time = "2025-08-09T07:56:07.176Z" }, 134 - { url = "https://files.pythonhosted.org/packages/e9/5e/14c94999e418d9b87682734589404a25854d5f5d0408df68bc15b6ff54bb/charset_normalizer-3.4.3-cp312-cp312-macosx_10_13_universal2.whl", hash = 
"sha256:e28e334d3ff134e88989d90ba04b47d84382a828c061d0d1027b1b12a62b39b1", size = 205655, upload-time = "2025-08-09T07:56:08.475Z" }, 135 - { url = "https://files.pythonhosted.org/packages/7d/a8/c6ec5d389672521f644505a257f50544c074cf5fc292d5390331cd6fc9c3/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0cacf8f7297b0c4fcb74227692ca46b4a5852f8f4f24b3c766dd94a1075c4884", size = 146223, upload-time = "2025-08-09T07:56:09.708Z" }, 136 - { url = "https://files.pythonhosted.org/packages/fc/eb/a2ffb08547f4e1e5415fb69eb7db25932c52a52bed371429648db4d84fb1/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:c6fd51128a41297f5409deab284fecbe5305ebd7e5a1f959bee1c054622b7018", size = 159366, upload-time = "2025-08-09T07:56:11.326Z" }, 137 - { url = "https://files.pythonhosted.org/packages/82/10/0fd19f20c624b278dddaf83b8464dcddc2456cb4b02bb902a6da126b87a1/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:3cfb2aad70f2c6debfbcb717f23b7eb55febc0bb23dcffc0f076009da10c6392", size = 157104, upload-time = "2025-08-09T07:56:13.014Z" }, 138 - { url = "https://files.pythonhosted.org/packages/16/ab/0233c3231af734f5dfcf0844aa9582d5a1466c985bbed6cedab85af9bfe3/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1606f4a55c0fd363d754049cdf400175ee96c992b1f8018b993941f221221c5f", size = 151830, upload-time = "2025-08-09T07:56:14.428Z" }, 139 - { url = "https://files.pythonhosted.org/packages/ae/02/e29e22b4e02839a0e4a06557b1999d0a47db3567e82989b5bb21f3fbbd9f/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:027b776c26d38b7f15b26a5da1044f376455fb3766df8fc38563b4efbc515154", size = 148854, upload-time = "2025-08-09T07:56:16.051Z" }, 140 - { url = 
"https://files.pythonhosted.org/packages/05/6b/e2539a0a4be302b481e8cafb5af8792da8093b486885a1ae4d15d452bcec/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:42e5088973e56e31e4fa58eb6bd709e42fc03799c11c42929592889a2e54c491", size = 160670, upload-time = "2025-08-09T07:56:17.314Z" }, 141 - { url = "https://files.pythonhosted.org/packages/31/e7/883ee5676a2ef217a40ce0bffcc3d0dfbf9e64cbcfbdf822c52981c3304b/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:cc34f233c9e71701040d772aa7490318673aa7164a0efe3172b2981218c26d93", size = 158501, upload-time = "2025-08-09T07:56:18.641Z" }, 142 - { url = "https://files.pythonhosted.org/packages/c1/35/6525b21aa0db614cf8b5792d232021dca3df7f90a1944db934efa5d20bb1/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:320e8e66157cc4e247d9ddca8e21f427efc7a04bbd0ac8a9faf56583fa543f9f", size = 153173, upload-time = "2025-08-09T07:56:20.289Z" }, 143 - { url = "https://files.pythonhosted.org/packages/50/ee/f4704bad8201de513fdc8aac1cabc87e38c5818c93857140e06e772b5892/charset_normalizer-3.4.3-cp312-cp312-win32.whl", hash = "sha256:fb6fecfd65564f208cbf0fba07f107fb661bcd1a7c389edbced3f7a493f70e37", size = 99822, upload-time = "2025-08-09T07:56:21.551Z" }, 144 - { url = "https://files.pythonhosted.org/packages/39/f5/3b3836ca6064d0992c58c7561c6b6eee1b3892e9665d650c803bd5614522/charset_normalizer-3.4.3-cp312-cp312-win_amd64.whl", hash = "sha256:86df271bf921c2ee3818f0522e9a5b8092ca2ad8b065ece5d7d9d0e9f4849bcc", size = 107543, upload-time = "2025-08-09T07:56:23.115Z" }, 145 - { url = "https://files.pythonhosted.org/packages/65/ca/2135ac97709b400c7654b4b764daf5c5567c2da45a30cdd20f9eefe2d658/charset_normalizer-3.4.3-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:14c2a87c65b351109f6abfc424cab3927b3bdece6f706e4d12faaf3d52ee5efe", size = 205326, upload-time = "2025-08-09T07:56:24.721Z" }, 146 - { url = 
"https://files.pythonhosted.org/packages/71/11/98a04c3c97dd34e49c7d247083af03645ca3730809a5509443f3c37f7c99/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:41d1fc408ff5fdfb910200ec0e74abc40387bccb3252f3f27c0676731df2b2c8", size = 146008, upload-time = "2025-08-09T07:56:26.004Z" }, 147 - { url = "https://files.pythonhosted.org/packages/60/f5/4659a4cb3c4ec146bec80c32d8bb16033752574c20b1252ee842a95d1a1e/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:1bb60174149316da1c35fa5233681f7c0f9f514509b8e399ab70fea5f17e45c9", size = 159196, upload-time = "2025-08-09T07:56:27.25Z" }, 148 - { url = "https://files.pythonhosted.org/packages/86/9e/f552f7a00611f168b9a5865a1414179b2c6de8235a4fa40189f6f79a1753/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:30d006f98569de3459c2fc1f2acde170b7b2bd265dc1943e87e1a4efe1b67c31", size = 156819, upload-time = "2025-08-09T07:56:28.515Z" }, 149 - { url = "https://files.pythonhosted.org/packages/7e/95/42aa2156235cbc8fa61208aded06ef46111c4d3f0de233107b3f38631803/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:416175faf02e4b0810f1f38bcb54682878a4af94059a1cd63b8747244420801f", size = 151350, upload-time = "2025-08-09T07:56:29.716Z" }, 150 - { url = "https://files.pythonhosted.org/packages/c2/a9/3865b02c56f300a6f94fc631ef54f0a8a29da74fb45a773dfd3dcd380af7/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:6aab0f181c486f973bc7262a97f5aca3ee7e1437011ef0c2ec04b5a11d16c927", size = 148644, upload-time = "2025-08-09T07:56:30.984Z" }, 151 - { url = "https://files.pythonhosted.org/packages/77/d9/cbcf1a2a5c7d7856f11e7ac2d782aec12bdfea60d104e60e0aa1c97849dc/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = 
"sha256:fdabf8315679312cfa71302f9bd509ded4f2f263fb5b765cf1433b39106c3cc9", size = 160468, upload-time = "2025-08-09T07:56:32.252Z" }, 152 - { url = "https://files.pythonhosted.org/packages/f6/42/6f45efee8697b89fda4d50580f292b8f7f9306cb2971d4b53f8914e4d890/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:bd28b817ea8c70215401f657edef3a8aa83c29d447fb0b622c35403780ba11d5", size = 158187, upload-time = "2025-08-09T07:56:33.481Z" }, 153 - { url = "https://files.pythonhosted.org/packages/70/99/f1c3bdcfaa9c45b3ce96f70b14f070411366fa19549c1d4832c935d8e2c3/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:18343b2d246dc6761a249ba1fb13f9ee9a2bcd95decc767319506056ea4ad4dc", size = 152699, upload-time = "2025-08-09T07:56:34.739Z" }, 154 - { url = "https://files.pythonhosted.org/packages/a3/ad/b0081f2f99a4b194bcbb1934ef3b12aa4d9702ced80a37026b7607c72e58/charset_normalizer-3.4.3-cp313-cp313-win32.whl", hash = "sha256:6fb70de56f1859a3f71261cbe41005f56a7842cc348d3aeb26237560bfa5e0ce", size = 99580, upload-time = "2025-08-09T07:56:35.981Z" }, 155 - { url = "https://files.pythonhosted.org/packages/9a/8f/ae790790c7b64f925e5c953b924aaa42a243fb778fed9e41f147b2a5715a/charset_normalizer-3.4.3-cp313-cp313-win_amd64.whl", hash = "sha256:cf1ebb7d78e1ad8ec2a8c4732c7be2e736f6e5123a4146c5b89c9d1f585f8cef", size = 107366, upload-time = "2025-08-09T07:56:37.339Z" }, 156 - { url = "https://files.pythonhosted.org/packages/8e/91/b5a06ad970ddc7a0e513112d40113e834638f4ca1120eb727a249fb2715e/charset_normalizer-3.4.3-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:3cd35b7e8aedeb9e34c41385fda4f73ba609e561faedfae0a9e75e44ac558a15", size = 204342, upload-time = "2025-08-09T07:56:38.687Z" }, 157 - { url = "https://files.pythonhosted.org/packages/ce/ec/1edc30a377f0a02689342f214455c3f6c2fbedd896a1d2f856c002fc3062/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:b89bc04de1d83006373429975f8ef9e7932534b8cc9ca582e4db7d20d91816db", size = 145995, upload-time = "2025-08-09T07:56:40.048Z" }, 158 - { url = "https://files.pythonhosted.org/packages/17/e5/5e67ab85e6d22b04641acb5399c8684f4d37caf7558a53859f0283a650e9/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2001a39612b241dae17b4687898843f254f8748b796a2e16f1051a17078d991d", size = 158640, upload-time = "2025-08-09T07:56:41.311Z" }, 159 - { url = "https://files.pythonhosted.org/packages/f1/e5/38421987f6c697ee3722981289d554957c4be652f963d71c5e46a262e135/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:8dcfc373f888e4fb39a7bc57e93e3b845e7f462dacc008d9749568b1c4ece096", size = 156636, upload-time = "2025-08-09T07:56:43.195Z" }, 160 - { url = "https://files.pythonhosted.org/packages/a0/e4/5a075de8daa3ec0745a9a3b54467e0c2967daaaf2cec04c845f73493e9a1/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:18b97b8404387b96cdbd30ad660f6407799126d26a39ca65729162fd810a99aa", size = 150939, upload-time = "2025-08-09T07:56:44.819Z" }, 161 - { url = "https://files.pythonhosted.org/packages/02/f7/3611b32318b30974131db62b4043f335861d4d9b49adc6d57c1149cc49d4/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:ccf600859c183d70eb47e05a44cd80a4ce77394d1ac0f79dbd2dd90a69a3a049", size = 148580, upload-time = "2025-08-09T07:56:46.684Z" }, 162 - { url = "https://files.pythonhosted.org/packages/7e/61/19b36f4bd67f2793ab6a99b979b4e4f3d8fc754cbdffb805335df4337126/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:53cd68b185d98dde4ad8990e56a58dea83a4162161b1ea9272e5c9182ce415e0", size = 159870, upload-time = "2025-08-09T07:56:47.941Z" }, 163 - { url = 
"https://files.pythonhosted.org/packages/06/57/84722eefdd338c04cf3030ada66889298eaedf3e7a30a624201e0cbe424a/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:30a96e1e1f865f78b030d65241c1ee850cdf422d869e9028e2fc1d5e4db73b92", size = 157797, upload-time = "2025-08-09T07:56:49.756Z" }, 164 - { url = "https://files.pythonhosted.org/packages/72/2a/aff5dd112b2f14bcc3462c312dce5445806bfc8ab3a7328555da95330e4b/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:d716a916938e03231e86e43782ca7878fb602a125a91e7acb8b5112e2e96ac16", size = 152224, upload-time = "2025-08-09T07:56:51.369Z" }, 165 - { url = "https://files.pythonhosted.org/packages/b7/8c/9839225320046ed279c6e839d51f028342eb77c91c89b8ef2549f951f3ec/charset_normalizer-3.4.3-cp314-cp314-win32.whl", hash = "sha256:c6dbd0ccdda3a2ba7c2ecd9d77b37f3b5831687d8dc1b6ca5f56a4880cc7b7ce", size = 100086, upload-time = "2025-08-09T07:56:52.722Z" }, 166 - { url = "https://files.pythonhosted.org/packages/ee/7a/36fbcf646e41f710ce0a563c1c9a343c6edf9be80786edeb15b6f62e17db/charset_normalizer-3.4.3-cp314-cp314-win_amd64.whl", hash = "sha256:73dc19b562516fc9bcf6e5d6e596df0b4eb98d87e4f79f3ae71840e6ed21361c", size = 107400, upload-time = "2025-08-09T07:56:55.172Z" }, 167 - { url = "https://files.pythonhosted.org/packages/c2/ca/9a0983dd5c8e9733565cf3db4df2b0a2e9a82659fd8aa2a868ac6e4a991f/charset_normalizer-3.4.3-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:70bfc5f2c318afece2f5838ea5e4c3febada0be750fcf4775641052bbba14d05", size = 207520, upload-time = "2025-08-09T07:57:11.026Z" }, 168 - { url = "https://files.pythonhosted.org/packages/39/c6/99271dc37243a4f925b09090493fb96c9333d7992c6187f5cfe5312008d2/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:23b6b24d74478dc833444cbd927c338349d6ae852ba53a0d02a2de1fce45b96e", size = 147307, upload-time = "2025-08-09T07:57:12.4Z" }, 169 - { url = 
"https://files.pythonhosted.org/packages/e4/69/132eab043356bba06eb333cc2cc60c6340857d0a2e4ca6dc2b51312886b3/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:34a7f768e3f985abdb42841e20e17b330ad3aaf4bb7e7aeeb73db2e70f077b99", size = 160448, upload-time = "2025-08-09T07:57:13.712Z" }, 170 - { url = "https://files.pythonhosted.org/packages/04/9a/914d294daa4809c57667b77470533e65def9c0be1ef8b4c1183a99170e9d/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:fb731e5deb0c7ef82d698b0f4c5bb724633ee2a489401594c5c88b02e6cb15f7", size = 157758, upload-time = "2025-08-09T07:57:14.979Z" }, 171 - { url = "https://files.pythonhosted.org/packages/b0/a8/6f5bcf1bcf63cb45625f7c5cadca026121ff8a6c8a3256d8d8cd59302663/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:257f26fed7d7ff59921b78244f3cd93ed2af1800ff048c33f624c87475819dd7", size = 152487, upload-time = "2025-08-09T07:57:16.332Z" }, 172 - { url = "https://files.pythonhosted.org/packages/c4/72/d3d0e9592f4e504f9dea08b8db270821c909558c353dc3b457ed2509f2fb/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:1ef99f0456d3d46a50945c98de1774da86f8e992ab5c77865ea8b8195341fc19", size = 150054, upload-time = "2025-08-09T07:57:17.576Z" }, 173 - { url = "https://files.pythonhosted.org/packages/20/30/5f64fe3981677fe63fa987b80e6c01042eb5ff653ff7cec1b7bd9268e54e/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_ppc64le.whl", hash = "sha256:2c322db9c8c89009a990ef07c3bcc9f011a3269bc06782f916cd3d9eed7c9312", size = 161703, upload-time = "2025-08-09T07:57:20.012Z" }, 174 - { url = "https://files.pythonhosted.org/packages/e1/ef/dd08b2cac9284fd59e70f7d97382c33a3d0a926e45b15fc21b3308324ffd/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_s390x.whl", hash = 
"sha256:511729f456829ef86ac41ca78c63a5cb55240ed23b4b737faca0eb1abb1c41bc", size = 159096, upload-time = "2025-08-09T07:57:21.329Z" }, 175 - { url = "https://files.pythonhosted.org/packages/45/8c/dcef87cfc2b3f002a6478f38906f9040302c68aebe21468090e39cde1445/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:88ab34806dea0671532d3f82d82b85e8fc23d7b2dd12fa837978dad9bb392a34", size = 153852, upload-time = "2025-08-09T07:57:22.608Z" }, 176 - { url = "https://files.pythonhosted.org/packages/63/86/9cbd533bd37883d467fcd1bd491b3547a3532d0fbb46de2b99feeebf185e/charset_normalizer-3.4.3-cp39-cp39-win32.whl", hash = "sha256:16a8770207946ac75703458e2c743631c79c59c5890c80011d536248f8eaa432", size = 99840, upload-time = "2025-08-09T07:57:23.883Z" }, 177 - { url = "https://files.pythonhosted.org/packages/ce/d6/7e805c8e5c46ff9729c49950acc4ee0aeb55efb8b3a56687658ad10c3216/charset_normalizer-3.4.3-cp39-cp39-win_amd64.whl", hash = "sha256:d22dbedd33326a4a5190dd4fe9e9e693ef12160c77382d9e87919bce54f3d4ca", size = 107438, upload-time = "2025-08-09T07:57:25.287Z" }, 178 - { url = "https://files.pythonhosted.org/packages/8a/1f/f041989e93b001bc4e44bb1669ccdcf54d3f00e628229a85b08d330615c5/charset_normalizer-3.4.3-py3-none-any.whl", hash = "sha256:ce571ab16d890d23b5c278547ba694193a45011ff86a9162a71307ed9f86759a", size = 53175, upload-time = "2025-08-09T07:57:26.864Z" }, 179 - ] 180 - 181 - [[package]] 182 94 name = "click" 183 95 version = "8.1.8" 184 96 source = { registry = "https://pypi.org/simple" } ··· 297 209 ] 298 210 299 211 [[package]] 300 - name = "distro" 301 - version = "1.9.0" 302 - source = { registry = "https://pypi.org/simple" } 303 - sdist = { url = "https://files.pythonhosted.org/packages/fc/f8/98eea607f65de6527f8a2e8885fc8015d3e6f5775df186e443e0964a11c3/distro-1.9.0.tar.gz", hash = "sha256:2fa77c6fd8940f116ee1d6b94a2f90b13b5ea8d019b98bc8bafdcabcdd9bdbed", size = 60722, upload-time = "2023-12-24T09:54:32.31Z" } 304 - wheels = [ 305 - { url = 
"https://files.pythonhosted.org/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl", hash = "sha256:7bffd925d65168f85027d8da9af6bddab658135b840670a223589bc0c8ef02b2", size = 20277, upload-time = "2023-12-24T09:54:30.421Z" }, 306 - ] 307 - 308 - [[package]] 309 212 name = "dnspython" 310 213 version = "2.7.0" 311 214 source = { registry = "https://pypi.org/simple" } ··· 385 288 ] 386 289 387 290 [[package]] 388 - name = "html2text" 389 - version = "2025.4.15" 390 - source = { registry = "https://pypi.org/simple" } 391 - sdist = { url = "https://files.pythonhosted.org/packages/f8/27/e158d86ba1e82967cc2f790b0cb02030d4a8bef58e0c79a8590e9678107f/html2text-2025.4.15.tar.gz", hash = "sha256:948a645f8f0bc3abe7fd587019a2197a12436cd73d0d4908af95bfc8da337588", size = 64316, upload-time = "2025-04-15T04:02:30.045Z" } 392 - wheels = [ 393 - { url = "https://files.pythonhosted.org/packages/1d/84/1a0f9555fd5f2b1c924ff932d99b40a0f8a6b12f6dd625e2a47f415b00ea/html2text-2025.4.15-py3-none-any.whl", hash = "sha256:00569167ffdab3d7767a4cdf589b7f57e777a5ed28d12907d8c58769ec734acc", size = 34656, upload-time = "2025-04-15T04:02:28.44Z" }, 394 - ] 395 - 396 - [[package]] 397 291 name = "httpcore" 398 292 version = "1.0.9" 399 293 source = { registry = "https://pypi.org/simple" } ··· 431 325 ] 432 326 433 327 [[package]] 434 - name = "importlib-metadata" 435 - version = "8.7.0" 436 - source = { registry = "https://pypi.org/simple" } 437 - dependencies = [ 438 - { name = "zipp" }, 439 - ] 440 - sdist = { url = "https://files.pythonhosted.org/packages/76/66/650a33bd90f786193e4de4b3ad86ea60b53c89b669a5c7be931fac31cdb0/importlib_metadata-8.7.0.tar.gz", hash = "sha256:d13b81ad223b890aa16c5471f2ac3056cf76c5f10f82d6f9292f0b415f389000", size = 56641, upload-time = "2025-04-27T15:29:01.736Z" } 441 - wheels = [ 442 - { url = 
"https://files.pythonhosted.org/packages/20/b0/36bd937216ec521246249be3bf9855081de4c5e06a0c9b4219dbeda50373/importlib_metadata-8.7.0-py3-none-any.whl", hash = "sha256:e5dd1551894c77868a30651cef00984d50e1002d06942a7101d34870c5f02afd", size = 27656, upload-time = "2025-04-27T15:29:00.214Z" }, 443 - ] 444 - 445 - [[package]] 446 328 name = "iniconfig" 447 329 version = "2.1.0" 448 330 source = { registry = "https://pypi.org/simple" } ··· 452 334 ] 453 335 454 336 [[package]] 455 - name = "lxml" 456 - version = "6.0.0" 337 + name = "jinja2" 338 + version = "3.1.6" 457 339 source = { registry = "https://pypi.org/simple" } 458 - sdist = { url = "https://files.pythonhosted.org/packages/c5/ed/60eb6fa2923602fba988d9ca7c5cdbd7cf25faa795162ed538b527a35411/lxml-6.0.0.tar.gz", hash = "sha256:032e65120339d44cdc3efc326c9f660f5f7205f3a535c1fdbf898b29ea01fb72", size = 4096938, upload-time = "2025-06-26T16:28:19.373Z" } 340 + dependencies = [ 341 + { name = "markupsafe" }, 342 + ] 343 + sdist = { url = "https://files.pythonhosted.org/packages/df/bf/f7da0350254c0ed7c72f3e33cef02e048281fec7ecec5f032d4aac52226b/jinja2-3.1.6.tar.gz", hash = "sha256:0137fb05990d35f1275a587e9aee6d56da821fc83491a0fb838183be43f66d6d", size = 245115, upload-time = "2025-03-05T20:05:02.478Z" } 459 344 wheels = [ 460 - { url = "https://files.pythonhosted.org/packages/4b/e9/9c3ca02fbbb7585116c2e274b354a2d92b5c70561687dd733ec7b2018490/lxml-6.0.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:35bc626eec405f745199200ccb5c6b36f202675d204aa29bb52e27ba2b71dea8", size = 8399057, upload-time = "2025-06-26T16:25:02.169Z" }, 461 - { url = "https://files.pythonhosted.org/packages/86/25/10a6e9001191854bf283515020f3633b1b1f96fd1b39aa30bf8fff7aa666/lxml-6.0.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:246b40f8a4aec341cbbf52617cad8ab7c888d944bfe12a6abd2b1f6cfb6f6082", size = 4569676, upload-time = "2025-06-26T16:25:05.431Z" }, 462 - { url = 
"https://files.pythonhosted.org/packages/f5/a5/378033415ff61d9175c81de23e7ad20a3ffb614df4ffc2ffc86bc6746ffd/lxml-6.0.0-cp310-cp310-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:2793a627e95d119e9f1e19720730472f5543a6d84c50ea33313ce328d870f2dd", size = 5291361, upload-time = "2025-06-26T16:25:07.901Z" }, 463 - { url = "https://files.pythonhosted.org/packages/5a/a6/19c87c4f3b9362b08dc5452a3c3bce528130ac9105fc8fff97ce895ce62e/lxml-6.0.0-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:46b9ed911f36bfeb6338e0b482e7fe7c27d362c52fde29f221fddbc9ee2227e7", size = 5008290, upload-time = "2025-06-28T18:47:13.196Z" }, 464 - { url = "https://files.pythonhosted.org/packages/09/d1/e9b7ad4b4164d359c4d87ed8c49cb69b443225cb495777e75be0478da5d5/lxml-6.0.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2b4790b558bee331a933e08883c423f65bbcd07e278f91b2272489e31ab1e2b4", size = 5163192, upload-time = "2025-06-28T18:47:17.279Z" }, 465 - { url = "https://files.pythonhosted.org/packages/56/d6/b3eba234dc1584744b0b374a7f6c26ceee5dc2147369a7e7526e25a72332/lxml-6.0.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e2030956cf4886b10be9a0285c6802e078ec2391e1dd7ff3eb509c2c95a69b76", size = 5076973, upload-time = "2025-06-26T16:25:10.936Z" }, 466 - { url = "https://files.pythonhosted.org/packages/8e/47/897142dd9385dcc1925acec0c4afe14cc16d310ce02c41fcd9010ac5d15d/lxml-6.0.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4d23854ecf381ab1facc8f353dcd9adeddef3652268ee75297c1164c987c11dc", size = 5297795, upload-time = "2025-06-26T16:25:14.282Z" }, 467 - { url = "https://files.pythonhosted.org/packages/fb/db/551ad84515c6f415cea70193a0ff11d70210174dc0563219f4ce711655c6/lxml-6.0.0-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:43fe5af2d590bf4691531b1d9a2495d7aab2090547eaacd224a3afec95706d76", size = 4776547, upload-time = 
"2025-06-26T16:25:17.123Z" }, 468 - { url = "https://files.pythonhosted.org/packages/e0/14/c4a77ab4f89aaf35037a03c472f1ccc54147191888626079bd05babd6808/lxml-6.0.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:74e748012f8c19b47f7d6321ac929a9a94ee92ef12bc4298c47e8b7219b26541", size = 5124904, upload-time = "2025-06-26T16:25:19.485Z" }, 469 - { url = "https://files.pythonhosted.org/packages/70/b4/12ae6a51b8da106adec6a2e9c60f532350a24ce954622367f39269e509b1/lxml-6.0.0-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:43cfbb7db02b30ad3926e8fceaef260ba2fb7df787e38fa2df890c1ca7966c3b", size = 4805804, upload-time = "2025-06-26T16:25:21.949Z" }, 470 - { url = "https://files.pythonhosted.org/packages/a9/b6/2e82d34d49f6219cdcb6e3e03837ca5fb8b7f86c2f35106fb8610ac7f5b8/lxml-6.0.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:34190a1ec4f1e84af256495436b2d196529c3f2094f0af80202947567fdbf2e7", size = 5323477, upload-time = "2025-06-26T16:25:24.475Z" }, 471 - { url = "https://files.pythonhosted.org/packages/a1/e6/b83ddc903b05cd08a5723fefd528eee84b0edd07bdf87f6c53a1fda841fd/lxml-6.0.0-cp310-cp310-win32.whl", hash = "sha256:5967fe415b1920a3877a4195e9a2b779249630ee49ece22021c690320ff07452", size = 3613840, upload-time = "2025-06-26T16:25:27.345Z" }, 472 - { url = "https://files.pythonhosted.org/packages/40/af/874fb368dd0c663c030acb92612341005e52e281a102b72a4c96f42942e1/lxml-6.0.0-cp310-cp310-win_amd64.whl", hash = "sha256:f3389924581d9a770c6caa4df4e74b606180869043b9073e2cec324bad6e306e", size = 3993584, upload-time = "2025-06-26T16:25:29.391Z" }, 473 - { url = "https://files.pythonhosted.org/packages/4a/f4/d296bc22c17d5607653008f6dd7b46afdfda12efd31021705b507df652bb/lxml-6.0.0-cp310-cp310-win_arm64.whl", hash = "sha256:522fe7abb41309e9543b0d9b8b434f2b630c5fdaf6482bee642b34c8c70079c8", size = 3681400, upload-time = "2025-06-26T16:25:31.421Z" }, 474 - { url = 
"https://files.pythonhosted.org/packages/7c/23/828d4cc7da96c611ec0ce6147bbcea2fdbde023dc995a165afa512399bbf/lxml-6.0.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:4ee56288d0df919e4aac43b539dd0e34bb55d6a12a6562038e8d6f3ed07f9e36", size = 8438217, upload-time = "2025-06-26T16:25:34.349Z" }, 475 - { url = "https://files.pythonhosted.org/packages/f1/33/5ac521212c5bcb097d573145d54b2b4a3c9766cda88af5a0e91f66037c6e/lxml-6.0.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b8dd6dd0e9c1992613ccda2bcb74fc9d49159dbe0f0ca4753f37527749885c25", size = 4590317, upload-time = "2025-06-26T16:25:38.103Z" }, 476 - { url = "https://files.pythonhosted.org/packages/2b/2e/45b7ca8bee304c07f54933c37afe7dd4d39ff61ba2757f519dcc71bc5d44/lxml-6.0.0-cp311-cp311-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:d7ae472f74afcc47320238b5dbfd363aba111a525943c8a34a1b657c6be934c3", size = 5221628, upload-time = "2025-06-26T16:25:40.878Z" }, 477 - { url = "https://files.pythonhosted.org/packages/32/23/526d19f7eb2b85da1f62cffb2556f647b049ebe2a5aa8d4d41b1fb2c7d36/lxml-6.0.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5592401cdf3dc682194727c1ddaa8aa0f3ddc57ca64fd03226a430b955eab6f6", size = 4949429, upload-time = "2025-06-28T18:47:20.046Z" }, 478 - { url = "https://files.pythonhosted.org/packages/ac/cc/f6be27a5c656a43a5344e064d9ae004d4dcb1d3c9d4f323c8189ddfe4d13/lxml-6.0.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:58ffd35bd5425c3c3b9692d078bf7ab851441434531a7e517c4984d5634cd65b", size = 5087909, upload-time = "2025-06-28T18:47:22.834Z" }, 479 - { url = "https://files.pythonhosted.org/packages/3b/e6/8ec91b5bfbe6972458bc105aeb42088e50e4b23777170404aab5dfb0c62d/lxml-6.0.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f720a14aa102a38907c6d5030e3d66b3b680c3e6f6bc95473931ea3c00c59967", size = 5031713, upload-time = "2025-06-26T16:25:43.226Z" }, 
480 - { url = "https://files.pythonhosted.org/packages/33/cf/05e78e613840a40e5be3e40d892c48ad3e475804db23d4bad751b8cadb9b/lxml-6.0.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c2a5e8d207311a0170aca0eb6b160af91adc29ec121832e4ac151a57743a1e1e", size = 5232417, upload-time = "2025-06-26T16:25:46.111Z" }, 481 - { url = "https://files.pythonhosted.org/packages/ac/8c/6b306b3e35c59d5f0b32e3b9b6b3b0739b32c0dc42a295415ba111e76495/lxml-6.0.0-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:2dd1cc3ea7e60bfb31ff32cafe07e24839df573a5e7c2d33304082a5019bcd58", size = 4681443, upload-time = "2025-06-26T16:25:48.837Z" }, 482 - { url = "https://files.pythonhosted.org/packages/59/43/0bd96bece5f7eea14b7220476835a60d2b27f8e9ca99c175f37c085cb154/lxml-6.0.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:2cfcf84f1defed7e5798ef4f88aa25fcc52d279be731ce904789aa7ccfb7e8d2", size = 5074542, upload-time = "2025-06-26T16:25:51.65Z" }, 483 - { url = "https://files.pythonhosted.org/packages/e2/3d/32103036287a8ca012d8518071f8852c68f2b3bfe048cef2a0202eb05910/lxml-6.0.0-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:a52a4704811e2623b0324a18d41ad4b9fabf43ce5ff99b14e40a520e2190c851", size = 4729471, upload-time = "2025-06-26T16:25:54.571Z" }, 484 - { url = "https://files.pythonhosted.org/packages/ca/a8/7be5d17df12d637d81854bd8648cd329f29640a61e9a72a3f77add4a311b/lxml-6.0.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:c16304bba98f48a28ae10e32a8e75c349dd742c45156f297e16eeb1ba9287a1f", size = 5256285, upload-time = "2025-06-26T16:25:56.997Z" }, 485 - { url = "https://files.pythonhosted.org/packages/cd/d0/6cb96174c25e0d749932557c8d51d60c6e292c877b46fae616afa23ed31a/lxml-6.0.0-cp311-cp311-win32.whl", hash = "sha256:f8d19565ae3eb956d84da3ef367aa7def14a2735d05bd275cd54c0301f0d0d6c", size = 3612004, upload-time = "2025-06-26T16:25:59.11Z" }, 486 - { url = 
"https://files.pythonhosted.org/packages/ca/77/6ad43b165dfc6dead001410adeb45e88597b25185f4479b7ca3b16a5808f/lxml-6.0.0-cp311-cp311-win_amd64.whl", hash = "sha256:b2d71cdefda9424adff9a3607ba5bbfc60ee972d73c21c7e3c19e71037574816", size = 4003470, upload-time = "2025-06-26T16:26:01.655Z" }, 487 - { url = "https://files.pythonhosted.org/packages/a0/bc/4c50ec0eb14f932a18efc34fc86ee936a66c0eb5f2fe065744a2da8a68b2/lxml-6.0.0-cp311-cp311-win_arm64.whl", hash = "sha256:8a2e76efbf8772add72d002d67a4c3d0958638696f541734304c7f28217a9cab", size = 3682477, upload-time = "2025-06-26T16:26:03.808Z" }, 488 - { url = "https://files.pythonhosted.org/packages/89/c3/d01d735c298d7e0ddcedf6f028bf556577e5ab4f4da45175ecd909c79378/lxml-6.0.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:78718d8454a6e928470d511bf8ac93f469283a45c354995f7d19e77292f26108", size = 8429515, upload-time = "2025-06-26T16:26:06.776Z" }, 489 - { url = "https://files.pythonhosted.org/packages/06/37/0e3eae3043d366b73da55a86274a590bae76dc45aa004b7042e6f97803b1/lxml-6.0.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:84ef591495ffd3f9dcabffd6391db7bb70d7230b5c35ef5148354a134f56f2be", size = 4601387, upload-time = "2025-06-26T16:26:09.511Z" }, 490 - { url = "https://files.pythonhosted.org/packages/a3/28/e1a9a881e6d6e29dda13d633885d13acb0058f65e95da67841c8dd02b4a8/lxml-6.0.0-cp312-cp312-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:2930aa001a3776c3e2601cb8e0a15d21b8270528d89cc308be4843ade546b9ab", size = 5228928, upload-time = "2025-06-26T16:26:12.337Z" }, 491 - { url = "https://files.pythonhosted.org/packages/9a/55/2cb24ea48aa30c99f805921c1c7860c1f45c0e811e44ee4e6a155668de06/lxml-6.0.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:219e0431ea8006e15005767f0351e3f7f9143e793e58519dc97fe9e07fae5563", size = 4952289, upload-time = "2025-06-28T18:47:25.602Z" }, 492 - { url = 
"https://files.pythonhosted.org/packages/31/c0/b25d9528df296b9a3306ba21ff982fc5b698c45ab78b94d18c2d6ae71fd9/lxml-6.0.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:bd5913b4972681ffc9718bc2d4c53cde39ef81415e1671ff93e9aa30b46595e7", size = 5111310, upload-time = "2025-06-28T18:47:28.136Z" }, 493 - { url = "https://files.pythonhosted.org/packages/e9/af/681a8b3e4f668bea6e6514cbcb297beb6de2b641e70f09d3d78655f4f44c/lxml-6.0.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:390240baeb9f415a82eefc2e13285016f9c8b5ad71ec80574ae8fa9605093cd7", size = 5025457, upload-time = "2025-06-26T16:26:15.068Z" }, 494 - { url = "https://files.pythonhosted.org/packages/99/b6/3a7971aa05b7be7dfebc7ab57262ec527775c2c3c5b2f43675cac0458cad/lxml-6.0.0-cp312-cp312-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:d6e200909a119626744dd81bae409fc44134389e03fbf1d68ed2a55a2fb10991", size = 5657016, upload-time = "2025-07-03T19:19:06.008Z" }, 495 - { url = "https://files.pythonhosted.org/packages/69/f8/693b1a10a891197143c0673fcce5b75fc69132afa81a36e4568c12c8faba/lxml-6.0.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ca50bd612438258a91b5b3788c6621c1f05c8c478e7951899f492be42defc0da", size = 5257565, upload-time = "2025-06-26T16:26:17.906Z" }, 496 - { url = "https://files.pythonhosted.org/packages/a8/96/e08ff98f2c6426c98c8964513c5dab8d6eb81dadcd0af6f0c538ada78d33/lxml-6.0.0-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:c24b8efd9c0f62bad0439283c2c795ef916c5a6b75f03c17799775c7ae3c0c9e", size = 4713390, upload-time = "2025-06-26T16:26:20.292Z" }, 497 - { url = "https://files.pythonhosted.org/packages/a8/83/6184aba6cc94d7413959f6f8f54807dc318fdcd4985c347fe3ea6937f772/lxml-6.0.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:afd27d8629ae94c5d863e32ab0e1d5590371d296b87dae0a751fb22bf3685741", size = 5066103, upload-time = "2025-06-26T16:26:22.765Z" }, 498 - { url = 
"https://files.pythonhosted.org/packages/ee/01/8bf1f4035852d0ff2e36a4d9aacdbcc57e93a6cd35a54e05fa984cdf73ab/lxml-6.0.0-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:54c4855eabd9fc29707d30141be99e5cd1102e7d2258d2892314cf4c110726c3", size = 4791428, upload-time = "2025-06-26T16:26:26.461Z" }, 499 - { url = "https://files.pythonhosted.org/packages/29/31/c0267d03b16954a85ed6b065116b621d37f559553d9339c7dcc4943a76f1/lxml-6.0.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:c907516d49f77f6cd8ead1322198bdfd902003c3c330c77a1c5f3cc32a0e4d16", size = 5678523, upload-time = "2025-07-03T19:19:09.837Z" }, 500 - { url = "https://files.pythonhosted.org/packages/5c/f7/5495829a864bc5f8b0798d2b52a807c89966523140f3d6fa3a58ab6720ea/lxml-6.0.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:36531f81c8214e293097cd2b7873f178997dae33d3667caaae8bdfb9666b76c0", size = 5281290, upload-time = "2025-06-26T16:26:29.406Z" }, 501 - { url = "https://files.pythonhosted.org/packages/79/56/6b8edb79d9ed294ccc4e881f4db1023af56ba451909b9ce79f2a2cd7c532/lxml-6.0.0-cp312-cp312-win32.whl", hash = "sha256:690b20e3388a7ec98e899fd54c924e50ba6693874aa65ef9cb53de7f7de9d64a", size = 3613495, upload-time = "2025-06-26T16:26:31.588Z" }, 502 - { url = "https://files.pythonhosted.org/packages/0b/1e/cc32034b40ad6af80b6fd9b66301fc0f180f300002e5c3eb5a6110a93317/lxml-6.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:310b719b695b3dd442cdfbbe64936b2f2e231bb91d998e99e6f0daf991a3eba3", size = 4014711, upload-time = "2025-06-26T16:26:33.723Z" }, 503 - { url = "https://files.pythonhosted.org/packages/55/10/dc8e5290ae4c94bdc1a4c55865be7e1f31dfd857a88b21cbba68b5fea61b/lxml-6.0.0-cp312-cp312-win_arm64.whl", hash = "sha256:8cb26f51c82d77483cdcd2b4a53cda55bbee29b3c2f3ddeb47182a2a9064e4eb", size = 3674431, upload-time = "2025-06-26T16:26:35.959Z" }, 504 - { url = 
"https://files.pythonhosted.org/packages/79/21/6e7c060822a3c954ff085e5e1b94b4a25757c06529eac91e550f3f5cd8b8/lxml-6.0.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:6da7cd4f405fd7db56e51e96bff0865b9853ae70df0e6720624049da76bde2da", size = 8414372, upload-time = "2025-06-26T16:26:39.079Z" }, 505 - { url = "https://files.pythonhosted.org/packages/a4/f6/051b1607a459db670fc3a244fa4f06f101a8adf86cda263d1a56b3a4f9d5/lxml-6.0.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:b34339898bb556a2351a1830f88f751679f343eabf9cf05841c95b165152c9e7", size = 4593940, upload-time = "2025-06-26T16:26:41.891Z" }, 506 - { url = "https://files.pythonhosted.org/packages/8e/74/dd595d92a40bda3c687d70d4487b2c7eff93fd63b568acd64fedd2ba00fe/lxml-6.0.0-cp313-cp313-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:51a5e4c61a4541bd1cd3ba74766d0c9b6c12d6a1a4964ef60026832aac8e79b3", size = 5214329, upload-time = "2025-06-26T16:26:44.669Z" }, 507 - { url = "https://files.pythonhosted.org/packages/52/46/3572761efc1bd45fcafb44a63b3b0feeb5b3f0066886821e94b0254f9253/lxml-6.0.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:d18a25b19ca7307045581b18b3ec9ead2b1db5ccd8719c291f0cd0a5cec6cb81", size = 4947559, upload-time = "2025-06-28T18:47:31.091Z" }, 508 - { url = "https://files.pythonhosted.org/packages/94/8a/5e40de920e67c4f2eef9151097deb9b52d86c95762d8ee238134aff2125d/lxml-6.0.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d4f0c66df4386b75d2ab1e20a489f30dc7fd9a06a896d64980541506086be1f1", size = 5102143, upload-time = "2025-06-28T18:47:33.612Z" }, 509 - { url = "https://files.pythonhosted.org/packages/7c/4b/20555bdd75d57945bdabfbc45fdb1a36a1a0ff9eae4653e951b2b79c9209/lxml-6.0.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9f4b481b6cc3a897adb4279216695150bbe7a44c03daba3c894f49d2037e0a24", size = 5021931, upload-time = "2025-06-26T16:26:47.503Z" }, 
510 - { url = "https://files.pythonhosted.org/packages/b6/6e/cf03b412f3763d4ca23b25e70c96a74cfece64cec3addf1c4ec639586b13/lxml-6.0.0-cp313-cp313-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8a78d6c9168f5bcb20971bf3329c2b83078611fbe1f807baadc64afc70523b3a", size = 5645469, upload-time = "2025-07-03T19:19:13.32Z" }, 511 - { url = "https://files.pythonhosted.org/packages/d4/dd/39c8507c16db6031f8c1ddf70ed95dbb0a6d466a40002a3522c128aba472/lxml-6.0.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2ae06fbab4f1bb7db4f7c8ca9897dc8db4447d1a2b9bee78474ad403437bcc29", size = 5247467, upload-time = "2025-06-26T16:26:49.998Z" }, 512 - { url = "https://files.pythonhosted.org/packages/4d/56/732d49def0631ad633844cfb2664563c830173a98d5efd9b172e89a4800d/lxml-6.0.0-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:1fa377b827ca2023244a06554c6e7dc6828a10aaf74ca41965c5d8a4925aebb4", size = 4720601, upload-time = "2025-06-26T16:26:52.564Z" }, 513 - { url = "https://files.pythonhosted.org/packages/8f/7f/6b956fab95fa73462bca25d1ea7fc8274ddf68fb8e60b78d56c03b65278e/lxml-6.0.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1676b56d48048a62ef77a250428d1f31f610763636e0784ba67a9740823988ca", size = 5060227, upload-time = "2025-06-26T16:26:55.054Z" }, 514 - { url = "https://files.pythonhosted.org/packages/97/06/e851ac2924447e8b15a294855caf3d543424364a143c001014d22c8ca94c/lxml-6.0.0-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:0e32698462aacc5c1cf6bdfebc9c781821b7e74c79f13e5ffc8bfe27c42b1abf", size = 4790637, upload-time = "2025-06-26T16:26:57.384Z" }, 515 - { url = "https://files.pythonhosted.org/packages/06/d4/fd216f3cd6625022c25b336c7570d11f4a43adbaf0a56106d3d496f727a7/lxml-6.0.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:4d6036c3a296707357efb375cfc24bb64cd955b9ec731abf11ebb1e40063949f", size = 5662049, upload-time = "2025-07-03T19:19:16.409Z" }, 516 - { url = 
"https://files.pythonhosted.org/packages/52/03/0e764ce00b95e008d76b99d432f1807f3574fb2945b496a17807a1645dbd/lxml-6.0.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7488a43033c958637b1a08cddc9188eb06d3ad36582cebc7d4815980b47e27ef", size = 5272430, upload-time = "2025-06-26T16:27:00.031Z" }, 517 - { url = "https://files.pythonhosted.org/packages/5f/01/d48cc141bc47bc1644d20fe97bbd5e8afb30415ec94f146f2f76d0d9d098/lxml-6.0.0-cp313-cp313-win32.whl", hash = "sha256:5fcd7d3b1d8ecb91445bd71b9c88bdbeae528fefee4f379895becfc72298d181", size = 3612896, upload-time = "2025-06-26T16:27:04.251Z" }, 518 - { url = "https://files.pythonhosted.org/packages/f4/87/6456b9541d186ee7d4cb53bf1b9a0d7f3b1068532676940fdd594ac90865/lxml-6.0.0-cp313-cp313-win_amd64.whl", hash = "sha256:2f34687222b78fff795feeb799a7d44eca2477c3d9d3a46ce17d51a4f383e32e", size = 4013132, upload-time = "2025-06-26T16:27:06.415Z" }, 519 - { url = "https://files.pythonhosted.org/packages/b7/42/85b3aa8f06ca0d24962f8100f001828e1f1f1a38c954c16e71154ed7d53a/lxml-6.0.0-cp313-cp313-win_arm64.whl", hash = "sha256:21db1ec5525780fd07251636eb5f7acb84003e9382c72c18c542a87c416ade03", size = 3672642, upload-time = "2025-06-26T16:27:09.888Z" }, 520 - { url = "https://files.pythonhosted.org/packages/dc/04/a53941fb0d7c60eed08301942c70aa63650a59308d15e05eb823acbce41d/lxml-6.0.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:85b14a4689d5cff426c12eefe750738648706ea2753b20c2f973b2a000d3d261", size = 8407699, upload-time = "2025-06-26T16:27:28.167Z" }, 521 - { url = "https://files.pythonhosted.org/packages/44/d2/e1d4526e903afebe147f858322f1c0b36e44969d5c87e5d243c23f81987f/lxml-6.0.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:f64ccf593916e93b8d36ed55401bb7fe9c7d5de3180ce2e10b08f82a8f397316", size = 4574678, upload-time = "2025-06-26T16:27:30.888Z" }, 522 - { url = 
"https://files.pythonhosted.org/packages/61/aa/b0a8ee233c00f2f437dbb6e7bd2df115a996d8211b7d03f4ab029b8e3378/lxml-6.0.0-cp39-cp39-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:b372d10d17a701b0945f67be58fae4664fd056b85e0ff0fbc1e6c951cdbc0512", size = 5292694, upload-time = "2025-06-26T16:27:34.037Z" }, 523 - { url = "https://files.pythonhosted.org/packages/53/7f/e6f377489b2ac4289418b879c34ed664e5a1174b2a91590936ec4174e773/lxml-6.0.0-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:a674c0948789e9136d69065cc28009c1b1874c6ea340253db58be7622ce6398f", size = 5009177, upload-time = "2025-06-28T18:47:39.377Z" }, 524 - { url = "https://files.pythonhosted.org/packages/c6/05/ae239e997374680741b768044545251a29abc21ada42248638dbed749a0a/lxml-6.0.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:edf6e4c8fe14dfe316939711e3ece3f9a20760aabf686051b537a7562f4da91a", size = 5163787, upload-time = "2025-06-28T18:47:42.452Z" }, 525 - { url = "https://files.pythonhosted.org/packages/2a/da/4f27222570d008fd2386e19d6923af6e64c317ee6116bbb2b98247f98f31/lxml-6.0.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:048a930eb4572829604982e39a0c7289ab5dc8abc7fc9f5aabd6fbc08c154e93", size = 5075755, upload-time = "2025-06-26T16:27:36.611Z" }, 526 - { url = "https://files.pythonhosted.org/packages/1f/65/12552caf7b3e3b9b9aba12349370dc53a36d4058e4ed482811f1d262deee/lxml-6.0.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c0b5fa5eda84057a4f1bbb4bb77a8c28ff20ae7ce211588d698ae453e13c6281", size = 5297070, upload-time = "2025-06-26T16:27:39.232Z" }, 527 - { url = "https://files.pythonhosted.org/packages/3e/6a/f053a8369fdf4e3b8127a6ffb079c519167e684e956a1281392c5c3679b6/lxml-6.0.0-cp39-cp39-manylinux_2_31_armv7l.whl", hash = "sha256:c352fc8f36f7e9727db17adbf93f82499457b3d7e5511368569b4c5bd155a922", size = 4779864, upload-time = 
"2025-06-26T16:27:41.713Z" }, 528 - { url = "https://files.pythonhosted.org/packages/df/7b/b2a392ad34ce37a17d1cf3aec303e15125768061cf0e355a92d292d20d37/lxml-6.0.0-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:8db5dc617cb937ae17ff3403c3a70a7de9df4852a046f93e71edaec678f721d0", size = 5122039, upload-time = "2025-06-26T16:27:44.252Z" }, 529 - { url = "https://files.pythonhosted.org/packages/80/0e/6459ff8ae7d87188e1f99f11691d0f32831caa6429599c3b289de9f08b21/lxml-6.0.0-cp39-cp39-musllinux_1_2_armv7l.whl", hash = "sha256:2181e4b1d07dde53986023482673c0f1fba5178ef800f9ab95ad791e8bdded6a", size = 4805117, upload-time = "2025-06-26T16:27:46.769Z" }, 530 - { url = "https://files.pythonhosted.org/packages/ca/78/4186f573805ff623d28a8736788a3b29eeaf589afdcf0233de2c9bb9fc50/lxml-6.0.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:b3c98d5b24c6095e89e03d65d5c574705be3d49c0d8ca10c17a8a4b5201b72f5", size = 5322300, upload-time = "2025-06-26T16:27:49.278Z" }, 531 - { url = "https://files.pythonhosted.org/packages/e8/97/352e07992901473529c8e19dbfdba6430ba6a37f6b46a4d0fa93321f8fee/lxml-6.0.0-cp39-cp39-win32.whl", hash = "sha256:04d67ceee6db4bcb92987ccb16e53bef6b42ced872509f333c04fb58a3315256", size = 3615832, upload-time = "2025-06-26T16:27:51.728Z" }, 532 - { url = "https://files.pythonhosted.org/packages/71/93/8f3b880e2618e548fb0ca157349abb526d81cb4f01ef5ea3a0f22bd4d0df/lxml-6.0.0-cp39-cp39-win_amd64.whl", hash = "sha256:e0b1520ef900e9ef62e392dd3d7ae4f5fa224d1dd62897a792cf353eb20b6cae", size = 4038551, upload-time = "2025-06-26T16:27:54.193Z" }, 533 - { url = "https://files.pythonhosted.org/packages/e7/8a/046cbf5b262dd2858c6e65833339100fd5f1c017b37b26bc47c92d4584d7/lxml-6.0.0-cp39-cp39-win_arm64.whl", hash = "sha256:e35e8aaaf3981489f42884b59726693de32dabfc438ac10ef4eb3409961fd402", size = 3684237, upload-time = "2025-06-26T16:27:57.117Z" }, 534 - { url = 
"https://files.pythonhosted.org/packages/66/e1/2c22a3cff9e16e1d717014a1e6ec2bf671bf56ea8716bb64466fcf820247/lxml-6.0.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:dbdd7679a6f4f08152818043dbb39491d1af3332128b3752c3ec5cebc0011a72", size = 3898804, upload-time = "2025-06-26T16:27:59.751Z" }, 535 - { url = "https://files.pythonhosted.org/packages/2b/3a/d68cbcb4393a2a0a867528741fafb7ce92dac5c9f4a1680df98e5e53e8f5/lxml-6.0.0-pp310-pypy310_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:40442e2a4456e9910875ac12951476d36c0870dcb38a68719f8c4686609897c4", size = 4216406, upload-time = "2025-06-28T18:47:45.518Z" }, 536 - { url = "https://files.pythonhosted.org/packages/15/8f/d9bfb13dff715ee3b2a1ec2f4a021347ea3caf9aba93dea0cfe54c01969b/lxml-6.0.0-pp310-pypy310_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:db0efd6bae1c4730b9c863fc4f5f3c0fa3e8f05cae2c44ae141cb9dfc7d091dc", size = 4326455, upload-time = "2025-06-28T18:47:48.411Z" }, 537 - { url = "https://files.pythonhosted.org/packages/01/8b/fde194529ee8a27e6f5966d7eef05fa16f0567e4a8e8abc3b855ef6b3400/lxml-6.0.0-pp310-pypy310_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9ab542c91f5a47aaa58abdd8ea84b498e8e49fe4b883d67800017757a3eb78e8", size = 4268788, upload-time = "2025-06-26T16:28:02.776Z" }, 538 - { url = "https://files.pythonhosted.org/packages/99/a8/3b8e2581b4f8370fc9e8dc343af4abdfadd9b9229970fc71e67bd31c7df1/lxml-6.0.0-pp310-pypy310_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:013090383863b72c62a702d07678b658fa2567aa58d373d963cca245b017e065", size = 4411394, upload-time = "2025-06-26T16:28:05.179Z" }, 539 - { url = "https://files.pythonhosted.org/packages/e7/a5/899a4719e02ff4383f3f96e5d1878f882f734377f10dfb69e73b5f223e44/lxml-6.0.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:c86df1c9af35d903d2b52d22ea3e66db8058d21dc0f59842ca5deb0595921141", size = 3517946, upload-time = "2025-06-26T16:28:07.665Z" }, 540 - 
{ url = "https://files.pythonhosted.org/packages/93/e3/ef14f1d23aea1dec1eccbe2c07a93b6d0be693fd9d5f248a47155e436701/lxml-6.0.0-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:4337e4aec93b7c011f7ee2e357b0d30562edd1955620fdd4aeab6aacd90d43c5", size = 3892325, upload-time = "2025-06-26T16:28:10.024Z" }, 541 - { url = "https://files.pythonhosted.org/packages/09/8a/1410b9e1ec43f606f9aac0661d09892509d86032e229711798906e1b5e7a/lxml-6.0.0-pp39-pypy39_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:ae74f7c762270196d2dda56f8dd7309411f08a4084ff2dfcc0b095a218df2e06", size = 4210839, upload-time = "2025-06-28T18:47:50.768Z" }, 542 - { url = "https://files.pythonhosted.org/packages/79/cb/6696ce0d1712c5ae94b18bdf225086a5fb04b23938ac4d2011b323b3860b/lxml-6.0.0-pp39-pypy39_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:059c4cbf3973a621b62ea3132934ae737da2c132a788e6cfb9b08d63a0ef73f9", size = 4321235, upload-time = "2025-06-28T18:47:53.338Z" }, 543 - { url = "https://files.pythonhosted.org/packages/f3/98/04997f61d720cf320a0daee66b3096e3a3b57453e15549c14b87058c2acd/lxml-6.0.0-pp39-pypy39_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:17f090a9bc0ce8da51a5632092f98a7e7f84bca26f33d161a98b57f7fb0004ca", size = 4265071, upload-time = "2025-06-26T16:28:12.367Z" }, 544 - { url = "https://files.pythonhosted.org/packages/e6/86/e5f6fa80154a5f5bf2c1e89d6265892299942edeb115081ca72afe7c7199/lxml-6.0.0-pp39-pypy39_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9da022c14baeec36edfcc8daf0e281e2f55b950249a455776f0d1adeeada4734", size = 4406816, upload-time = "2025-06-26T16:28:14.744Z" }, 545 - { url = "https://files.pythonhosted.org/packages/18/a6/ae69e0e6f5fb6293eb8cbfbf8a259e37d71608bbae3658a768dd26b69f3e/lxml-6.0.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:a55da151d0b0c6ab176b4e761670ac0e2667817a1e0dadd04a01d0561a219349", size = 3515499, upload-time = "2025-06-26T16:28:17.035Z" }, 345 + { 
url = "https://files.pythonhosted.org/packages/62/a1/3d680cbfd5f4b8f15abc1d571870c5fc3e594bb582bc3b64ea099db13e56/jinja2-3.1.6-py3-none-any.whl", hash = "sha256:85ece4451f492d0c13c5dd7c13a64681a86afae63a5f347908daf103ce6d2f67", size = 134899, upload-time = "2025-03-05T20:05:00.369Z" }, 546 346 ] 547 347 548 348 [[package]] ··· 558 358 ] 559 359 560 360 [[package]] 561 - name = "markdownify" 562 - version = "1.2.0" 361 + name = "markupsafe" 362 + version = "3.0.2" 563 363 source = { registry = "https://pypi.org/simple" } 564 - dependencies = [ 565 - { name = "beautifulsoup4" }, 566 - { name = "six" }, 567 - ] 568 - sdist = { url = "https://files.pythonhosted.org/packages/83/1b/6f2697b51eaca81f08852fd2734745af15718fea10222a1d40f8a239c4ea/markdownify-1.2.0.tar.gz", hash = "sha256:f6c367c54eb24ee953921804dfe6d6575c5e5b42c643955e7242034435de634c", size = 18771, upload-time = "2025-08-09T17:44:15.302Z" } 364 + sdist = { url = "https://files.pythonhosted.org/packages/b2/97/5d42485e71dfc078108a86d6de8fa46db44a1a9295e89c5d6d4a06e23a62/markupsafe-3.0.2.tar.gz", hash = "sha256:ee55d3edf80167e48ea11a923c7386f4669df67d7994554387f84e7d8b0a2bf0", size = 20537, upload-time = "2024-10-18T15:21:54.129Z" } 569 365 wheels = [ 570 - { url = "https://files.pythonhosted.org/packages/6a/e2/7af643acb4cae0741dffffaa7f3f7c9e7ab4046724543ba1777c401d821c/markdownify-1.2.0-py3-none-any.whl", hash = "sha256:48e150a1c4993d4d50f282f725c0111bd9eb25645d41fa2f543708fd44161351", size = 15561, upload-time = "2025-08-09T17:44:14.074Z" }, 366 + { url = "https://files.pythonhosted.org/packages/04/90/d08277ce111dd22f77149fd1a5d4653eeb3b3eaacbdfcbae5afb2600eebd/MarkupSafe-3.0.2-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:7e94c425039cde14257288fd61dcfb01963e658efbc0ff54f5306b06054700f8", size = 14357, upload-time = "2024-10-18T15:20:51.44Z" }, 367 + { url = 
"https://files.pythonhosted.org/packages/04/e1/6e2194baeae0bca1fae6629dc0cbbb968d4d941469cbab11a3872edff374/MarkupSafe-3.0.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:9e2d922824181480953426608b81967de705c3cef4d1af983af849d7bd619158", size = 12393, upload-time = "2024-10-18T15:20:52.426Z" }, 368 + { url = "https://files.pythonhosted.org/packages/1d/69/35fa85a8ece0a437493dc61ce0bb6d459dcba482c34197e3efc829aa357f/MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:38a9ef736c01fccdd6600705b09dc574584b89bea478200c5fbf112a6b0d5579", size = 21732, upload-time = "2024-10-18T15:20:53.578Z" }, 369 + { url = "https://files.pythonhosted.org/packages/22/35/137da042dfb4720b638d2937c38a9c2df83fe32d20e8c8f3185dbfef05f7/MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bbcb445fa71794da8f178f0f6d66789a28d7319071af7a496d4d507ed566270d", size = 20866, upload-time = "2024-10-18T15:20:55.06Z" }, 370 + { url = "https://files.pythonhosted.org/packages/29/28/6d029a903727a1b62edb51863232152fd335d602def598dade38996887f0/MarkupSafe-3.0.2-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:57cb5a3cf367aeb1d316576250f65edec5bb3be939e9247ae594b4bcbc317dfb", size = 20964, upload-time = "2024-10-18T15:20:55.906Z" }, 371 + { url = "https://files.pythonhosted.org/packages/cc/cd/07438f95f83e8bc028279909d9c9bd39e24149b0d60053a97b2bc4f8aa51/MarkupSafe-3.0.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:3809ede931876f5b2ec92eef964286840ed3540dadf803dd570c3b7e13141a3b", size = 21977, upload-time = "2024-10-18T15:20:57.189Z" }, 372 + { url = "https://files.pythonhosted.org/packages/29/01/84b57395b4cc062f9c4c55ce0df7d3108ca32397299d9df00fedd9117d3d/MarkupSafe-3.0.2-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:e07c3764494e3776c602c1e78e298937c3315ccc9043ead7e685b7f2b8d47b3c", size = 21366, upload-time = "2024-10-18T15:20:58.235Z" }, 373 + { url 
= "https://files.pythonhosted.org/packages/bd/6e/61ebf08d8940553afff20d1fb1ba7294b6f8d279df9fd0c0db911b4bbcfd/MarkupSafe-3.0.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:b424c77b206d63d500bcb69fa55ed8d0e6a3774056bdc4839fc9298a7edca171", size = 21091, upload-time = "2024-10-18T15:20:59.235Z" }, 374 + { url = "https://files.pythonhosted.org/packages/11/23/ffbf53694e8c94ebd1e7e491de185124277964344733c45481f32ede2499/MarkupSafe-3.0.2-cp310-cp310-win32.whl", hash = "sha256:fcabf5ff6eea076f859677f5f0b6b5c1a51e70a376b0579e0eadef8db48c6b50", size = 15065, upload-time = "2024-10-18T15:21:00.307Z" }, 375 + { url = "https://files.pythonhosted.org/packages/44/06/e7175d06dd6e9172d4a69a72592cb3f7a996a9c396eee29082826449bbc3/MarkupSafe-3.0.2-cp310-cp310-win_amd64.whl", hash = "sha256:6af100e168aa82a50e186c82875a5893c5597a0c1ccdb0d8b40240b1f28b969a", size = 15514, upload-time = "2024-10-18T15:21:01.122Z" }, 376 + { url = "https://files.pythonhosted.org/packages/6b/28/bbf83e3f76936960b850435576dd5e67034e200469571be53f69174a2dfd/MarkupSafe-3.0.2-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:9025b4018f3a1314059769c7bf15441064b2207cb3f065e6ea1e7359cb46db9d", size = 14353, upload-time = "2024-10-18T15:21:02.187Z" }, 377 + { url = "https://files.pythonhosted.org/packages/6c/30/316d194b093cde57d448a4c3209f22e3046c5bb2fb0820b118292b334be7/MarkupSafe-3.0.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:93335ca3812df2f366e80509ae119189886b0f3c2b81325d39efdb84a1e2ae93", size = 12392, upload-time = "2024-10-18T15:21:02.941Z" }, 378 + { url = "https://files.pythonhosted.org/packages/f2/96/9cdafba8445d3a53cae530aaf83c38ec64c4d5427d975c974084af5bc5d2/MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2cb8438c3cbb25e220c2ab33bb226559e7afb3baec11c4f218ffa7308603c832", size = 23984, upload-time = "2024-10-18T15:21:03.953Z" }, 379 + { url = 
"https://files.pythonhosted.org/packages/f1/a4/aefb044a2cd8d7334c8a47d3fb2c9f328ac48cb349468cc31c20b539305f/MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a123e330ef0853c6e822384873bef7507557d8e4a082961e1defa947aa59ba84", size = 23120, upload-time = "2024-10-18T15:21:06.495Z" }, 380 + { url = "https://files.pythonhosted.org/packages/8d/21/5e4851379f88f3fad1de30361db501300d4f07bcad047d3cb0449fc51f8c/MarkupSafe-3.0.2-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:1e084f686b92e5b83186b07e8a17fc09e38fff551f3602b249881fec658d3eca", size = 23032, upload-time = "2024-10-18T15:21:07.295Z" }, 381 + { url = "https://files.pythonhosted.org/packages/00/7b/e92c64e079b2d0d7ddf69899c98842f3f9a60a1ae72657c89ce2655c999d/MarkupSafe-3.0.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d8213e09c917a951de9d09ecee036d5c7d36cb6cb7dbaece4c71a60d79fb9798", size = 24057, upload-time = "2024-10-18T15:21:08.073Z" }, 382 + { url = "https://files.pythonhosted.org/packages/f9/ac/46f960ca323037caa0a10662ef97d0a4728e890334fc156b9f9e52bcc4ca/MarkupSafe-3.0.2-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:5b02fb34468b6aaa40dfc198d813a641e3a63b98c2b05a16b9f80b7ec314185e", size = 23359, upload-time = "2024-10-18T15:21:09.318Z" }, 383 + { url = "https://files.pythonhosted.org/packages/69/84/83439e16197337b8b14b6a5b9c2105fff81d42c2a7c5b58ac7b62ee2c3b1/MarkupSafe-3.0.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:0bff5e0ae4ef2e1ae4fdf2dfd5b76c75e5c2fa4132d05fc1b0dabcd20c7e28c4", size = 23306, upload-time = "2024-10-18T15:21:10.185Z" }, 384 + { url = "https://files.pythonhosted.org/packages/9a/34/a15aa69f01e2181ed8d2b685c0d2f6655d5cca2c4db0ddea775e631918cd/MarkupSafe-3.0.2-cp311-cp311-win32.whl", hash = "sha256:6c89876f41da747c8d3677a2b540fb32ef5715f97b66eeb0c6b66f5e3ef6f59d", size = 15094, upload-time = "2024-10-18T15:21:11.005Z" }, 385 + { url = 
"https://files.pythonhosted.org/packages/da/b8/3a3bd761922d416f3dc5d00bfbed11f66b1ab89a0c2b6e887240a30b0f6b/MarkupSafe-3.0.2-cp311-cp311-win_amd64.whl", hash = "sha256:70a87b411535ccad5ef2f1df5136506a10775d267e197e4cf531ced10537bd6b", size = 15521, upload-time = "2024-10-18T15:21:12.911Z" }, 386 + { url = "https://files.pythonhosted.org/packages/22/09/d1f21434c97fc42f09d290cbb6350d44eb12f09cc62c9476effdb33a18aa/MarkupSafe-3.0.2-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:9778bd8ab0a994ebf6f84c2b949e65736d5575320a17ae8984a77fab08db94cf", size = 14274, upload-time = "2024-10-18T15:21:13.777Z" }, 387 + { url = "https://files.pythonhosted.org/packages/6b/b0/18f76bba336fa5aecf79d45dcd6c806c280ec44538b3c13671d49099fdd0/MarkupSafe-3.0.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:846ade7b71e3536c4e56b386c2a47adf5741d2d8b94ec9dc3e92e5e1ee1e2225", size = 12348, upload-time = "2024-10-18T15:21:14.822Z" }, 388 + { url = "https://files.pythonhosted.org/packages/e0/25/dd5c0f6ac1311e9b40f4af06c78efde0f3b5cbf02502f8ef9501294c425b/MarkupSafe-3.0.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1c99d261bd2d5f6b59325c92c73df481e05e57f19837bdca8413b9eac4bd8028", size = 24149, upload-time = "2024-10-18T15:21:15.642Z" }, 389 + { url = "https://files.pythonhosted.org/packages/f3/f0/89e7aadfb3749d0f52234a0c8c7867877876e0a20b60e2188e9850794c17/MarkupSafe-3.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e17c96c14e19278594aa4841ec148115f9c7615a47382ecb6b82bd8fea3ab0c8", size = 23118, upload-time = "2024-10-18T15:21:17.133Z" }, 390 + { url = "https://files.pythonhosted.org/packages/d5/da/f2eeb64c723f5e3777bc081da884b414671982008c47dcc1873d81f625b6/MarkupSafe-3.0.2-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:88416bd1e65dcea10bc7569faacb2c20ce071dd1f87539ca2ab364bf6231393c", size = 22993, upload-time = "2024-10-18T15:21:18.064Z" }, 391 + { url = 
"https://files.pythonhosted.org/packages/da/0e/1f32af846df486dce7c227fe0f2398dc7e2e51d4a370508281f3c1c5cddc/MarkupSafe-3.0.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:2181e67807fc2fa785d0592dc2d6206c019b9502410671cc905d132a92866557", size = 24178, upload-time = "2024-10-18T15:21:18.859Z" }, 392 + { url = "https://files.pythonhosted.org/packages/c4/f6/bb3ca0532de8086cbff5f06d137064c8410d10779c4c127e0e47d17c0b71/MarkupSafe-3.0.2-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:52305740fe773d09cffb16f8ed0427942901f00adedac82ec8b67752f58a1b22", size = 23319, upload-time = "2024-10-18T15:21:19.671Z" }, 393 + { url = "https://files.pythonhosted.org/packages/a2/82/8be4c96ffee03c5b4a034e60a31294daf481e12c7c43ab8e34a1453ee48b/MarkupSafe-3.0.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:ad10d3ded218f1039f11a75f8091880239651b52e9bb592ca27de44eed242a48", size = 23352, upload-time = "2024-10-18T15:21:20.971Z" }, 394 + { url = "https://files.pythonhosted.org/packages/51/ae/97827349d3fcffee7e184bdf7f41cd6b88d9919c80f0263ba7acd1bbcb18/MarkupSafe-3.0.2-cp312-cp312-win32.whl", hash = "sha256:0f4ca02bea9a23221c0182836703cbf8930c5e9454bacce27e767509fa286a30", size = 15097, upload-time = "2024-10-18T15:21:22.646Z" }, 395 + { url = "https://files.pythonhosted.org/packages/c1/80/a61f99dc3a936413c3ee4e1eecac96c0da5ed07ad56fd975f1a9da5bc630/MarkupSafe-3.0.2-cp312-cp312-win_amd64.whl", hash = "sha256:8e06879fc22a25ca47312fbe7c8264eb0b662f6db27cb2d3bbbc74b1df4b9b87", size = 15601, upload-time = "2024-10-18T15:21:23.499Z" }, 396 + { url = "https://files.pythonhosted.org/packages/83/0e/67eb10a7ecc77a0c2bbe2b0235765b98d164d81600746914bebada795e97/MarkupSafe-3.0.2-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:ba9527cdd4c926ed0760bc301f6728ef34d841f405abf9d4f959c478421e4efd", size = 14274, upload-time = "2024-10-18T15:21:24.577Z" }, 397 + { url = 
"https://files.pythonhosted.org/packages/2b/6d/9409f3684d3335375d04e5f05744dfe7e9f120062c9857df4ab490a1031a/MarkupSafe-3.0.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:f8b3d067f2e40fe93e1ccdd6b2e1d16c43140e76f02fb1319a05cf2b79d99430", size = 12352, upload-time = "2024-10-18T15:21:25.382Z" }, 398 + { url = "https://files.pythonhosted.org/packages/d2/f5/6eadfcd3885ea85fe2a7c128315cc1bb7241e1987443d78c8fe712d03091/MarkupSafe-3.0.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:569511d3b58c8791ab4c2e1285575265991e6d8f8700c7be0e88f86cb0672094", size = 24122, upload-time = "2024-10-18T15:21:26.199Z" }, 399 + { url = "https://files.pythonhosted.org/packages/0c/91/96cf928db8236f1bfab6ce15ad070dfdd02ed88261c2afafd4b43575e9e9/MarkupSafe-3.0.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:15ab75ef81add55874e7ab7055e9c397312385bd9ced94920f2802310c930396", size = 23085, upload-time = "2024-10-18T15:21:27.029Z" }, 400 + { url = "https://files.pythonhosted.org/packages/c2/cf/c9d56af24d56ea04daae7ac0940232d31d5a8354f2b457c6d856b2057d69/MarkupSafe-3.0.2-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f3818cb119498c0678015754eba762e0d61e5b52d34c8b13d770f0719f7b1d79", size = 22978, upload-time = "2024-10-18T15:21:27.846Z" }, 401 + { url = "https://files.pythonhosted.org/packages/2a/9f/8619835cd6a711d6272d62abb78c033bda638fdc54c4e7f4272cf1c0962b/MarkupSafe-3.0.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:cdb82a876c47801bb54a690c5ae105a46b392ac6099881cdfb9f6e95e4014c6a", size = 24208, upload-time = "2024-10-18T15:21:28.744Z" }, 402 + { url = "https://files.pythonhosted.org/packages/f9/bf/176950a1792b2cd2102b8ffeb5133e1ed984547b75db47c25a67d3359f77/MarkupSafe-3.0.2-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:cabc348d87e913db6ab4aa100f01b08f481097838bdddf7c7a84b7575b7309ca", size = 23357, upload-time = "2024-10-18T15:21:29.545Z" }, 403 + { 
url = "https://files.pythonhosted.org/packages/ce/4f/9a02c1d335caabe5c4efb90e1b6e8ee944aa245c1aaaab8e8a618987d816/MarkupSafe-3.0.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:444dcda765c8a838eaae23112db52f1efaf750daddb2d9ca300bcae1039adc5c", size = 23344, upload-time = "2024-10-18T15:21:30.366Z" }, 404 + { url = "https://files.pythonhosted.org/packages/ee/55/c271b57db36f748f0e04a759ace9f8f759ccf22b4960c270c78a394f58be/MarkupSafe-3.0.2-cp313-cp313-win32.whl", hash = "sha256:bcf3e58998965654fdaff38e58584d8937aa3096ab5354d493c77d1fdd66d7a1", size = 15101, upload-time = "2024-10-18T15:21:31.207Z" }, 405 + { url = "https://files.pythonhosted.org/packages/29/88/07df22d2dd4df40aba9f3e402e6dc1b8ee86297dddbad4872bd5e7b0094f/MarkupSafe-3.0.2-cp313-cp313-win_amd64.whl", hash = "sha256:e6a2a455bd412959b57a172ce6328d2dd1f01cb2135efda2e4576e8a23fa3b0f", size = 15603, upload-time = "2024-10-18T15:21:32.032Z" }, 406 + { url = "https://files.pythonhosted.org/packages/62/6a/8b89d24db2d32d433dffcd6a8779159da109842434f1dd2f6e71f32f738c/MarkupSafe-3.0.2-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:b5a6b3ada725cea8a5e634536b1b01c30bcdcd7f9c6fff4151548d5bf6b3a36c", size = 14510, upload-time = "2024-10-18T15:21:33.625Z" }, 407 + { url = "https://files.pythonhosted.org/packages/7a/06/a10f955f70a2e5a9bf78d11a161029d278eeacbd35ef806c3fd17b13060d/MarkupSafe-3.0.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:a904af0a6162c73e3edcb969eeeb53a63ceeb5d8cf642fade7d39e7963a22ddb", size = 12486, upload-time = "2024-10-18T15:21:34.611Z" }, 408 + { url = "https://files.pythonhosted.org/packages/34/cf/65d4a571869a1a9078198ca28f39fba5fbb910f952f9dbc5220afff9f5e6/MarkupSafe-3.0.2-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4aa4e5faecf353ed117801a068ebab7b7e09ffb6e1d5e412dc852e0da018126c", size = 25480, upload-time = "2024-10-18T15:21:35.398Z" }, 409 + { url = 
"https://files.pythonhosted.org/packages/0c/e3/90e9651924c430b885468b56b3d597cabf6d72be4b24a0acd1fa0e12af67/MarkupSafe-3.0.2-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c0ef13eaeee5b615fb07c9a7dadb38eac06a0608b41570d8ade51c56539e509d", size = 23914, upload-time = "2024-10-18T15:21:36.231Z" }, 410 + { url = "https://files.pythonhosted.org/packages/66/8c/6c7cf61f95d63bb866db39085150df1f2a5bd3335298f14a66b48e92659c/MarkupSafe-3.0.2-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d16a81a06776313e817c951135cf7340a3e91e8c1ff2fac444cfd75fffa04afe", size = 23796, upload-time = "2024-10-18T15:21:37.073Z" }, 411 + { url = "https://files.pythonhosted.org/packages/bb/35/cbe9238ec3f47ac9a7c8b3df7a808e7cb50fe149dc7039f5f454b3fba218/MarkupSafe-3.0.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:6381026f158fdb7c72a168278597a5e3a5222e83ea18f543112b2662a9b699c5", size = 25473, upload-time = "2024-10-18T15:21:37.932Z" }, 412 + { url = "https://files.pythonhosted.org/packages/e6/32/7621a4382488aa283cc05e8984a9c219abad3bca087be9ec77e89939ded9/MarkupSafe-3.0.2-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:3d79d162e7be8f996986c064d1c7c817f6df3a77fe3d6859f6f9e7be4b8c213a", size = 24114, upload-time = "2024-10-18T15:21:39.799Z" }, 413 + { url = "https://files.pythonhosted.org/packages/0d/80/0985960e4b89922cb5a0bac0ed39c5b96cbc1a536a99f30e8c220a996ed9/MarkupSafe-3.0.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:131a3c7689c85f5ad20f9f6fb1b866f402c445b220c19fe4308c0b147ccd2ad9", size = 24098, upload-time = "2024-10-18T15:21:40.813Z" }, 414 + { url = "https://files.pythonhosted.org/packages/82/78/fedb03c7d5380df2427038ec8d973587e90561b2d90cd472ce9254cf348b/MarkupSafe-3.0.2-cp313-cp313t-win32.whl", hash = "sha256:ba8062ed2cf21c07a9e295d5b8a2a5ce678b913b45fdf68c32d95d6c1291e0b6", size = 15208, upload-time = "2024-10-18T15:21:41.814Z" }, 415 + { url = 
"https://files.pythonhosted.org/packages/4f/65/6079a46068dfceaeabb5dcad6d674f5f5c61a6fa5673746f42a9f4c233b3/MarkupSafe-3.0.2-cp313-cp313t-win_amd64.whl", hash = "sha256:e444a31f8db13eb18ada366ab3cf45fd4b31e4db1236a4448f68778c1d1a5a2f", size = 15739, upload-time = "2024-10-18T15:21:42.784Z" }, 416 + { url = "https://files.pythonhosted.org/packages/a7/ea/9b1530c3fdeeca613faeb0fb5cbcf2389d816072fab72a71b45749ef6062/MarkupSafe-3.0.2-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:eaa0a10b7f72326f1372a713e73c3f739b524b3af41feb43e4921cb529f5929a", size = 14344, upload-time = "2024-10-18T15:21:43.721Z" }, 417 + { url = "https://files.pythonhosted.org/packages/4b/c2/fbdbfe48848e7112ab05e627e718e854d20192b674952d9042ebd8c9e5de/MarkupSafe-3.0.2-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:48032821bbdf20f5799ff537c7ac3d1fba0ba032cfc06194faffa8cda8b560ff", size = 12389, upload-time = "2024-10-18T15:21:44.666Z" }, 418 + { url = "https://files.pythonhosted.org/packages/f0/25/7a7c6e4dbd4f867d95d94ca15449e91e52856f6ed1905d58ef1de5e211d0/MarkupSafe-3.0.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1a9d3f5f0901fdec14d8d2f66ef7d035f2157240a433441719ac9a3fba440b13", size = 21607, upload-time = "2024-10-18T15:21:45.452Z" }, 419 + { url = "https://files.pythonhosted.org/packages/53/8f/f339c98a178f3c1e545622206b40986a4c3307fe39f70ccd3d9df9a9e425/MarkupSafe-3.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:88b49a3b9ff31e19998750c38e030fc7bb937398b1f78cfa599aaef92d693144", size = 20728, upload-time = "2024-10-18T15:21:46.295Z" }, 420 + { url = "https://files.pythonhosted.org/packages/1a/03/8496a1a78308456dbd50b23a385c69b41f2e9661c67ea1329849a598a8f9/MarkupSafe-3.0.2-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:cfad01eed2c2e0c01fd0ecd2ef42c492f7f93902e39a42fc9ee1692961443a29", size = 20826, upload-time = "2024-10-18T15:21:47.134Z" }, 421 + { url = 
"https://files.pythonhosted.org/packages/e6/cf/0a490a4bd363048c3022f2f475c8c05582179bb179defcee4766fb3dcc18/MarkupSafe-3.0.2-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:1225beacc926f536dc82e45f8a4d68502949dc67eea90eab715dea3a21c1b5f0", size = 21843, upload-time = "2024-10-18T15:21:48.334Z" }, 422 + { url = "https://files.pythonhosted.org/packages/19/a3/34187a78613920dfd3cdf68ef6ce5e99c4f3417f035694074beb8848cd77/MarkupSafe-3.0.2-cp39-cp39-musllinux_1_2_i686.whl", hash = "sha256:3169b1eefae027567d1ce6ee7cae382c57fe26e82775f460f0b2778beaad66c0", size = 21219, upload-time = "2024-10-18T15:21:49.587Z" }, 423 + { url = "https://files.pythonhosted.org/packages/17/d8/5811082f85bb88410ad7e452263af048d685669bbbfb7b595e8689152498/MarkupSafe-3.0.2-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:eb7972a85c54febfb25b5c4b4f3af4dcc731994c7da0d8a0b4a6eb0640e1d178", size = 20946, upload-time = "2024-10-18T15:21:50.441Z" }, 424 + { url = "https://files.pythonhosted.org/packages/7c/31/bd635fb5989440d9365c5e3c47556cfea121c7803f5034ac843e8f37c2f2/MarkupSafe-3.0.2-cp39-cp39-win32.whl", hash = "sha256:8c4e8c3ce11e1f92f6536ff07154f9d49677ebaaafc32db9db4620bc11ed480f", size = 15063, upload-time = "2024-10-18T15:21:51.385Z" }, 425 + { url = "https://files.pythonhosted.org/packages/b3/73/085399401383ce949f727afec55ec3abd76648d04b9f22e1c0e99cb4bec3/MarkupSafe-3.0.2-cp39-cp39-win_amd64.whl", hash = "sha256:6e296a513ca3d94054c2c881cc913116e90fd030ad1c656b3869762b754f5f8a", size = 15506, upload-time = "2024-10-18T15:21:52.974Z" }, 571 426 ] 572 427 573 428 [[package]] ··· 724 579 { url = "https://files.pythonhosted.org/packages/0b/c7/d3654a790129684d0e8dc04707cb6d75633d7b102a962c6dc0f862c64c25/pendulum-3.1.0-pp39-pypy39_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:e4cbd933a40c915ed5c41b083115cca15c7afa8179363b2a61db167c64fa0670", size = 526685, upload-time = "2025-04-19T14:02:31.523Z" }, 725 580 { url = 
"https://files.pythonhosted.org/packages/50/d9/4a166256386b7973e36ff44135e8d009f4afb25d6c72df5380ccfd6fbb89/pendulum-3.1.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:3363a470b5d67dbf8d9fd1bf77dcdbf720788bc3be4a10bdcd28ae5d7dbd26c4", size = 261170, upload-time = "2025-04-19T14:02:33.099Z" }, 726 581 { url = "https://files.pythonhosted.org/packages/6e/23/e98758924d1b3aac11a626268eabf7f3cf177e7837c28d47bf84c64532d0/pendulum-3.1.0-py3-none-any.whl", hash = "sha256:f9178c2a8e291758ade1e8dd6371b1d26d08371b4c7730a6e9a3ef8b16ebae0f", size = 111799, upload-time = "2025-04-19T14:02:34.739Z" }, 727 - ] 728 - 729 - [[package]] 730 - name = "pip" 731 - version = "25.2" 732 - source = { registry = "https://pypi.org/simple" } 733 - sdist = { url = "https://files.pythonhosted.org/packages/20/16/650289cd3f43d5a2fadfd98c68bd1e1e7f2550a1a5326768cddfbcedb2c5/pip-25.2.tar.gz", hash = "sha256:578283f006390f85bb6282dffb876454593d637f5d1be494b5202ce4877e71f2", size = 1840021, upload-time = "2025-07-30T21:50:15.401Z" } 734 - wheels = [ 735 - { url = "https://files.pythonhosted.org/packages/b7/3f/945ef7ab14dc4f9d7f40288d2df998d1837ee0888ec3659c813487572faa/pip-25.2-py3-none-any.whl", hash = "sha256:6d67a2b4e7f14d8b31b8b52648866fa717f45a1eb70e83002f4331d07e953717", size = 1752557, upload-time = "2025-07-30T21:50:13.323Z" }, 736 582 ] 737 583 738 584 [[package]] ··· 1020 866 ] 1021 867 1022 868 [[package]] 1023 - name = "requests" 1024 - version = "2.32.4" 1025 - source = { registry = "https://pypi.org/simple" } 1026 - dependencies = [ 1027 - { name = "certifi" }, 1028 - { name = "charset-normalizer" }, 1029 - { name = "idna" }, 1030 - { name = "urllib3" }, 1031 - ] 1032 - sdist = { url = "https://files.pythonhosted.org/packages/e1/0a/929373653770d8a0d7ea76c37de6e41f11eb07559b103b1c02cafb3f7cf8/requests-2.32.4.tar.gz", hash = "sha256:27d0316682c8a29834d3264820024b62a36942083d52caf2f14c0591336d3422", size = 135258, upload-time = "2025-06-09T16:43:07.34Z" } 1033 - wheels = [ 1034 - { url = 
"https://files.pythonhosted.org/packages/7c/e4/56027c4a6b4ae70ca9de302488c5ca95ad4a39e190093d6c1a8ace08341b/requests-2.32.4-py3-none-any.whl", hash = "sha256:27babd3cda2a6d50b30443204ee89830707d396671944c998b5975b031ac2b2c", size = 64847, upload-time = "2025-06-09T16:43:05.728Z" }, 1035 - ] 1036 - 1037 - [[package]] 1038 869 name = "rich" 1039 870 version = "14.0.0" 1040 871 source = { registry = "https://pypi.org/simple" } ··· 1116 947 ] 1117 948 1118 949 [[package]] 1119 - name = "soupsieve" 1120 - version = "2.7" 1121 - source = { registry = "https://pypi.org/simple" } 1122 - sdist = { url = "https://files.pythonhosted.org/packages/3f/f4/4a80cd6ef364b2e8b65b15816a843c0980f7a5a2b4dc701fc574952aa19f/soupsieve-2.7.tar.gz", hash = "sha256:ad282f9b6926286d2ead4750552c8a6142bc4c783fd66b0293547c8fe6ae126a", size = 103418, upload-time = "2025-04-20T18:50:08.518Z" } 1123 - wheels = [ 1124 - { url = "https://files.pythonhosted.org/packages/e7/9c/0e6afc12c269578be5c0c1c9f4b49a8d32770a080260c333ac04cc1c832d/soupsieve-2.7-py3-none-any.whl", hash = "sha256:6e60cc5c1ffaf1cebcc12e8188320b72071e922c2e897f737cadce79ad5d30c4", size = 36677, upload-time = "2025-04-20T18:50:07.196Z" }, 1125 - ] 1126 - 1127 - [[package]] 1128 950 name = "thicket" 1129 951 source = { editable = "." 
}
1130 952 dependencies = [
···
1133 955 { name = "feedparser" },
1134 956 { name = "gitpython" },
1135 957 { name = "httpx" },
1136 - { name = "importlib-metadata" },
1137 - { name = "markdownify" },
958 + { name = "jinja2" },
1138 959 { name = "pendulum" },
1139 960 { name = "platformdirs" },
1140 961 { name = "pydantic" },
···
1142 963 { name = "pyyaml" },
1143 964 { name = "rich" },
1144 965 { name = "typer" },
1145 - { name = "typesense" },
1146 - { name = "zulip" },
1147 - { name = "zulip-bots" },
1148 966 ]
1149 967
1150 968 [package.optional-dependencies]
···
1156 974 { name = "pytest-cov" },
1157 975 { name = "ruff" },
1158 976 { name = "types-pyyaml" },
1159 - ]
1160 -
1161 - [package.dev-dependencies]
1162 - dev = [
1163 - { name = "mypy" },
1164 - { name = "pytest" },
1165 977 ]
1166 978
1167 979 [package.metadata]
···
1172 984 { name = "feedparser", specifier = ">=6.0.11" },
1173 985 { name = "gitpython", specifier = ">=3.1.40" },
1174 986 { name = "httpx", specifier = ">=0.28.0" },
1175 - { name = "importlib-metadata", specifier = ">=8.7.0" },
1176 - { name = "markdownify", specifier = ">=1.2.0" },
987 + { name = "jinja2", specifier = ">=3.1.6" },
1177 988 { name = "mypy", marker = "extra == 'dev'", specifier = ">=1.13.0" },
1178 989 { name = "pendulum", specifier = ">=3.0.0" },
1179 990 { name = "platformdirs", specifier = ">=4.0.0" },
···
1187 998 { name = "ruff", marker = "extra == 'dev'", specifier = ">=0.8.0" },
1188 999 { name = "typer", specifier = ">=0.15.0" },
1189 1000 { name = "types-pyyaml", marker = "extra == 'dev'", specifier = ">=6.0.0" },
1190 - { name = "typesense", specifier = ">=1.1.1" },
1191 - { name = "zulip", specifier = ">=0.9.0" },
1192 - { name = "zulip-bots", specifier = ">=0.9.0" },
1193 1001 ]
1194 1002 provides-extras = ["dev"]
1195 -
1196 - [package.metadata.requires-dev]
1197 - dev = [
1198 - { name = "mypy", specifier = ">=1.17.0" },
1199 - { name = "pytest", specifier = ">=8.4.1" },
1200 - ]
1201 1003
1202 1004
[[package]] 1203 1005 name = "tomli" ··· 1264 1066 ] 1265 1067 1266 1068 [[package]] 1267 - name = "typesense" 1268 - version = "1.1.1" 1269 - source = { registry = "https://pypi.org/simple" } 1270 - dependencies = [ 1271 - { name = "requests" }, 1272 - ] 1273 - sdist = { url = "https://files.pythonhosted.org/packages/9b/2c/6f012a17934d50f73d20f1138b3bc42cfb7ec465052bd8e56c0dcf8ce92d/typesense-1.1.1.tar.gz", hash = "sha256:876280e5f2bb8a4a24ae427863ee8216d2e9e76cfe96e0a87a379e66078dc591", size = 45214, upload-time = "2025-05-20T18:13:32.865Z" } 1274 - wheels = [ 1275 - { url = "https://files.pythonhosted.org/packages/1b/8f/6306446e5ce28ddddd8babf407597b9afa3fff521794fe2dcfb16f12e16a/typesense-1.1.1-py3-none-any.whl", hash = "sha256:633aeb26c24e17be654ea22f20d3f76f87c804f259d0a560b7e0ae817f24077a", size = 70604, upload-time = "2025-05-20T18:13:30.975Z" }, 1276 - ] 1277 - 1278 - [[package]] 1279 1069 name = "typing-extensions" 1280 1070 version = "4.14.1" 1281 1071 source = { registry = "https://pypi.org/simple" } ··· 1306 1096 ] 1307 1097 1308 1098 [[package]] 1309 - name = "urllib3" 1310 - version = "2.5.0" 1311 - source = { registry = "https://pypi.org/simple" } 1312 - sdist = { url = "https://files.pythonhosted.org/packages/15/22/9ee70a2574a4f4599c47dd506532914ce044817c7752a79b6a51286319bc/urllib3-2.5.0.tar.gz", hash = "sha256:3fc47733c7e419d4bc3f6b3dc2b4f890bb743906a30d56ba4a5bfa4bbff92760", size = 393185, upload-time = "2025-06-18T14:07:41.644Z" } 1313 - wheels = [ 1314 - { url = "https://files.pythonhosted.org/packages/a7/c2/fe1e52489ae3122415c51f387e221dd0773709bad6c6cdaa599e8a2c5185/urllib3-2.5.0-py3-none-any.whl", hash = "sha256:e6b01673c0fa6a13e374b50871808eb3bf7046c4b125b216f6bf1cc604cff0dc", size = 129795, upload-time = "2025-06-18T14:07:40.39Z" }, 1315 - ] 1316 - 1317 - [[package]] 1318 1099 name = "webencodings" 1319 1100 version = "0.5.1" 1320 1101 source = { registry = "https://pypi.org/simple" } ··· 1322 1103 wheels = [ 1323 1104 { url = 
"https://files.pythonhosted.org/packages/f4/24/2a3e3df732393fed8b3ebf2ec078f05546de641fe1b667ee316ec1dcf3b7/webencodings-0.5.1-py2.py3-none-any.whl", hash = "sha256:a0af1213f3c2226497a97e2b3aa01a7e4bee4f403f95be16fc9acd2947514a78", size = 11774, upload-time = "2017-04-05T20:21:32.581Z" }, 1324 1105 ] 1325 - 1326 - [[package]] 1327 - name = "zipp" 1328 - version = "3.23.0" 1329 - source = { registry = "https://pypi.org/simple" } 1330 - sdist = { url = "https://files.pythonhosted.org/packages/e3/02/0f2892c661036d50ede074e376733dca2ae7c6eb617489437771209d4180/zipp-3.23.0.tar.gz", hash = "sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166", size = 25547, upload-time = "2025-06-08T17:06:39.4Z" } 1331 - wheels = [ 1332 - { url = "https://files.pythonhosted.org/packages/2e/54/647ade08bf0db230bfea292f893923872fd20be6ac6f53b2b936ba839d75/zipp-3.23.0-py3-none-any.whl", hash = "sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e", size = 10276, upload-time = "2025-06-08T17:06:38.034Z" }, 1333 - ] 1334 - 1335 - [[package]] 1336 - name = "zulip" 1337 - version = "0.9.0" 1338 - source = { registry = "https://pypi.org/simple" } 1339 - dependencies = [ 1340 - { name = "click", version = "8.1.8", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.10'" }, 1341 - { name = "click", version = "8.2.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.10'" }, 1342 - { name = "distro" }, 1343 - { name = "requests" }, 1344 - { name = "typing-extensions" }, 1345 - ] 1346 - sdist = { url = "https://files.pythonhosted.org/packages/7e/85/754c025bf7e5ff2622b89c555ff3e1ecc3dd501874745a7ec2c3b59fc743/zulip-0.9.0.tar.gz", hash = "sha256:7a14149e5d9e3fcc53b13e998719fd1f6ccb8289bc60fccbaa1aafcd0a9d0843", size = 134624, upload-time = "2023-11-15T00:28:39.338Z" } 1347 - wheels = [ 1348 - { url = 
"https://files.pythonhosted.org/packages/db/ed/81e42dbfe0dd538f60514d0e4849b872d949a1caa7a2c80bbe6aa4c1bae9/zulip-0.9.0-py3-none-any.whl", hash = "sha256:a315db3e990c6b94aef323540b7f386485e8fc359dbd26af526c20dbe9068217", size = 289297, upload-time = "2023-11-15T00:28:33.172Z" }, 1349 - ] 1350 - 1351 - [[package]] 1352 - name = "zulip-bots" 1353 - version = "0.9.0" 1354 - source = { registry = "https://pypi.org/simple" } 1355 - dependencies = [ 1356 - { name = "beautifulsoup4" }, 1357 - { name = "html2text" }, 1358 - { name = "importlib-metadata", marker = "python_full_version < '3.10'" }, 1359 - { name = "lxml" }, 1360 - { name = "pip" }, 1361 - { name = "typing-extensions" }, 1362 - { name = "zulip" }, 1363 - ] 1364 - sdist = { url = "https://files.pythonhosted.org/packages/a5/39/6e60bea336fbfd4ad55dbdbb5fbd6d62dc32b08ad240688f119d145a29b3/zulip_bots-0.9.0.tar.gz", hash = "sha256:94925a4bd7c3558bf0e0cc3e83021d6a2f2139824745081abaa605a3d012e37a", size = 2268775, upload-time = "2023-11-15T00:28:36.507Z" } 1365 - wheels = [ 1366 - { url = "https://files.pythonhosted.org/packages/e6/c9/c242abc63de86d1a20b02e5d8e507c38d4889b9c01f663a5b80eb050effd/zulip_bots-0.9.0-py3-none-any.whl", hash = "sha256:1c46b011002fdf375f27fbf0c17394149e77ea36b33aa762b58368db14229e37", size = 2317628, upload-time = "2023-11-15T00:28:26.312Z" }, 1367 - ]
