rudimentary _redirects support, incremental uploading for cli #3

closed
opened by nekomimi.pet targeting main

TODO:
- `_headers` file
- `place.wisp.settings` lexicon as a lexicon-based way of configuring this

+31 -1
README.md
···
 cargo build
 ```
 
+## Features
+
+### URL Redirects and Rewrites
+
+The hosting service supports Netlify-style `_redirects` files for managing URLs. Place a `_redirects` file in your site root to enable:
+
+- **301/302 Redirects**: Permanent and temporary URL redirects
+- **200 Rewrites**: Serve different content without changing the URL
+- **404 Custom Pages**: Custom error pages for specific paths
+- **Splats & Placeholders**: Dynamic path matching (`/blog/:year/:month/:day`, `/news/*`)
+- **Query Parameter Matching**: Redirect based on URL parameters
+- **Conditional Redirects**: Route by country, language, or cookie presence
+- **Force Redirects**: Override existing files with redirects
+
+Example `_redirects`:
+```
+# Single-page app routing (React, Vue, etc.)
+/* /index.html 200
+
+# Simple redirects
+/home /
+/old-blog/* /blog/:splat
+
+# API proxy
+/api/* https://api.example.com/:splat 200
+
+# Country-based routing
+/ /us/ 302 Country=us
+/ /uk/ 302 Country=gb
+```
+
 ## Limits
 
 - Max file size: 100MB (PDS limit)
-- Max site size: 300MB
 - Max files: 2000
 
 ## Tech Stack
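To make the "Splats & Placeholders" feature concrete, here is a minimal sketch of how a pattern such as `/blog/:year/:month/:day` or `/news/*` could be matched and its captures filled into the destination. `applyPattern` is a hypothetical helper written for illustration only; it is not the parser added in this PR, and it ignores status codes, conditions, and query parameters.

```typescript
// Hypothetical helper: match `path` against a rule pattern such as
// "/blog/:year/:month/:day" or "/news/*" and, on success, fill the
// captured values into `target`. Returns null when the path does not match.
function applyPattern(pattern: string, target: string, path: string): string | null {
  const names: string[] = [];
  // Escape regex metacharacters, leaving ":" and "*" for the steps below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const source = escaped
    .replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (_m: string, name: string) => {
      names.push(name);
      return '([^/]+)'; // a placeholder matches exactly one path segment
    })
    .replace(/\*$/, () => {
      names.push('splat');
      return '(.*)'; // a trailing splat matches the rest of the path
    });

  const match = new RegExp(`^${source}$`).exec(path);
  if (match === null) return null;

  let result = target;
  names.forEach((name, i) => {
    // split/join replaces every occurrence of the placeholder in the target
    result = result.split(`:${name}`).join(match[i + 1] ?? '');
  });
  return result;
}
```

For instance, `applyPattern('/news/*', '/blog/:splat', '/news/2024/my-post')` yields `/blog/2024/my-post`, while a path outside `/news/` yields `null`.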
-123
hosting-service/EXAMPLE.md
···
-# HTML Path Rewriting Example
-
-This document demonstrates how HTML path rewriting works when serving sites via the `/s/:identifier/:site/*` route.
-
-## Problem
-
-When you create a static site with absolute paths like `/style.css` or `/images/logo.png`, these paths work fine when served from the root domain. However, when served from a subdirectory like `/s/alice.bsky.social/mysite/`, these absolute paths break because they resolve to the server root instead of the site root.
-
-## Solution
-
-The hosting service automatically rewrites absolute paths in HTML files to work correctly in the subdirectory context.
-
-## Example
-
-**Original HTML file (index.html):**
-```html
-<!DOCTYPE html>
-<html>
-<head>
-  <meta charset="UTF-8">
-  <title>My Site</title>
-  <link rel="stylesheet" href="/style.css">
-  <link rel="icon" href="/favicon.ico">
-  <script src="/app.js"></script>
-</head>
-<body>
-  <header>
-    <img src="/images/logo.png" alt="Logo">
-    <nav>
-      <a href="/">Home</a>
-      <a href="/about">About</a>
-      <a href="/contact">Contact</a>
-    </nav>
-  </header>
-
-  <main>
-    <h1>Welcome</h1>
-    <img src="/images/hero.jpg"
-         srcset="/images/hero.jpg 1x, /images/hero@2x.jpg 2x"
-         alt="Hero">
-
-    <form action="/submit" method="post">
-      <input type="text" name="email">
-      <button>Submit</button>
-    </form>
-  </main>
-
-  <footer>
-    <a href="https://example.com">External Link</a>
-    <a href="#top">Back to Top</a>
-  </footer>
-</body>
-</html>
-```
-
-**When accessed via `/s/alice.bsky.social/mysite/`, the HTML is rewritten to:**
-```html
-<!DOCTYPE html>
-<html>
-<head>
-  <meta charset="UTF-8">
-  <title>My Site</title>
-  <link rel="stylesheet" href="/s/alice.bsky.social/mysite/style.css">
-  <link rel="icon" href="/s/alice.bsky.social/mysite/favicon.ico">
-  <script src="/s/alice.bsky.social/mysite/app.js"></script>
-</head>
-<body>
-  <header>
-    <img src="/s/alice.bsky.social/mysite/images/logo.png" alt="Logo">
-    <nav>
-      <a href="/s/alice.bsky.social/mysite/">Home</a>
-      <a href="/s/alice.bsky.social/mysite/about">About</a>
-      <a href="/s/alice.bsky.social/mysite/contact">Contact</a>
-    </nav>
-  </header>
-
-  <main>
-    <h1>Welcome</h1>
-    <img src="/s/alice.bsky.social/mysite/images/hero.jpg"
-         srcset="/s/alice.bsky.social/mysite/images/hero.jpg 1x, /s/alice.bsky.social/mysite/images/hero@2x.jpg 2x"
-         alt="Hero">
-
-    <form action="/s/alice.bsky.social/mysite/submit" method="post">
-      <input type="text" name="email">
-      <button>Submit</button>
-    </form>
-  </main>
-
-  <footer>
-    <a href="https://example.com">External Link</a>
-    <a href="#top">Back to Top</a>
-  </footer>
-</body>
-</html>
-```
-
-## What's Preserved
-
-Notice that:
-- ✅ Absolute paths are rewritten: `/style.css` → `/s/alice.bsky.social/mysite/style.css`
-- ✅ External URLs are preserved: `https://example.com` stays the same
-- ✅ Anchors are preserved: `#top` stays the same
-- ✅ The rewriting is safe and won't break your site
-
-## Supported Attributes
-
-The rewriter handles these HTML attributes:
-- `src` - images, scripts, iframes, videos, audio
-- `href` - links, stylesheets
-- `action` - forms
-- `data` - objects
-- `poster` - video posters
-- `srcset` - responsive images
-
-## Testing Your Site
-
-To test if your site works with path rewriting:
-
-1. Upload your site to your PDS as a `place.wisp.fs` record
-2. Access it via: `https://hosting.wisp.place/s/YOUR_HANDLE/SITE_NAME/`
-3. Check that all resources load correctly
-
-If you're using relative paths already (like `./style.css` or `../images/logo.png`), they'll work without any rewriting.
+134
hosting-service/example-_redirects
···
+# Example _redirects file for Wisp hosting
+# Place this file in the root directory of your site as "_redirects"
+# Lines starting with # are comments
+
+# ===================================
+# SIMPLE REDIRECTS
+# ===================================
+
+# Redirect home page
+# /home /
+
+# Redirect old URLs to new ones
+# /old-blog /blog
+# /about-us /about
+
+# ===================================
+# SPLAT REDIRECTS (WILDCARDS)
+# ===================================
+
+# Redirect entire directories
+# /news/* /blog/:splat
+# /old-site/* /new-site/:splat
+
+# ===================================
+# PLACEHOLDER REDIRECTS
+# ===================================
+
+# Restructure blog URLs
+# /blog/:year/:month/:day/:slug /posts/:year-:month-:day/:slug
+
+# Capture multiple parameters
+# /products/:category/:id /shop/:category/item/:id
+
+# ===================================
+# STATUS CODES
+# ===================================
+
+# Permanent redirect (301) - default if not specified
+# /permanent-move /new-location 301
+
+# Temporary redirect (302)
+# /temp-redirect /temp-location 302
+
+# Rewrite (200) - serves different content, URL stays the same
+# /api/* /functions/:splat 200
+
+# Custom 404 page
+# /shop/* /shop-closed.html 404
+
+# ===================================
+# FORCE REDIRECTS
+# ===================================
+
+# Force redirect even if file exists (note the ! after status code)
+# /override-file /other-file.html 200!
+
+# ===================================
+# CONDITIONAL REDIRECTS
+# ===================================
+
+# Country-based redirects (ISO 3166-1 alpha-2 codes)
+# / /us/ 302 Country=us
+# / /uk/ 302 Country=gb
+# / /anz/ 302 Country=au,nz
+
+# Language-based redirects
+# /products /en/products 301 Language=en
+# /products /de/products 301 Language=de
+# /products /fr/products 301 Language=fr
+
+# Cookie-based redirects (checks if cookie exists)
+# /* /legacy/:splat 200 Cookie=is_legacy
+
+# ===================================
+# QUERY PARAMETERS
+# ===================================
+
+# Match specific query parameters
+# /store id=:id /blog/:id 301
+
+# Multiple parameters
+# /search q=:query category=:cat /find/:cat/:query 301
+
+# ===================================
+# DOMAIN-LEVEL REDIRECTS
+# ===================================
+
+# Redirect to different domain (must include protocol)
+# /external https://example.com/path
+
+# Redirect entire subdomain
+# http://blog.example.com/* https://example.com/blog/:splat 301!
+# https://blog.example.com/* https://example.com/blog/:splat 301!
+
+# ===================================
+# COMMON PATTERNS
+# ===================================
+
+# Remove .html extensions
+# /page.html /page
+
+# Add trailing slash
+# /about /about/
+
+# Single-page app fallback (serve index.html for all paths)
+# /* /index.html 200
+
+# API proxy
+# /api/* https://api.example.com/:splat 200
+
+# ===================================
+# CUSTOM ERROR PAGES
+# ===================================
+
+# Language-specific 404 pages
+# /en/* /en/404.html 404
+# /de/* /de/404.html 404
+
+# Section-specific 404 pages
+# /shop/* /shop/not-found.html 404
+# /blog/* /blog/404.html 404
+
+# ===================================
+# NOTES
+# ===================================
+#
+# - Rules are processed in order (first match wins)
+# - More specific rules should come before general ones
+# - Splats (*) can only be used at the end of a path
+# - Query parameters are automatically preserved for 200, 301, 302
+# - Trailing slashes are normalized (/ and no / are treated the same)
+# - Default status code is 301 if not specified
+#
+
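Three of the notes at the end of the example file (rules are processed in order, trailing slashes are normalized, and the default status is 301) can be sketched in a few lines. The `SimpleRule` type and the `stripTrailingSlash` and `firstMatch` functions are hypothetical names for illustration; the sketch handles exact paths only and is not the matcher from this PR.

```typescript
// Hypothetical, simplified rule shape: exact paths only, no splats or conditions.
interface SimpleRule {
  from: string;
  to: string;
  status?: number; // omitted in the file means 301
}

// Treat "/about" and "/about/" as the same path, but leave "/" alone.
function stripTrailingSlash(path: string): string {
  return path.length > 1 && path.endsWith('/') ? path.slice(0, -1) : path;
}

// Walk the rules in file order and return the first one that matches.
function firstMatch(path: string, rules: SimpleRule[]): { to: string; status: number } | null {
  const wanted = stripTrailingSlash(path);
  for (const rule of rules) {
    if (stripTrailingSlash(rule.from) === wanted) {
      return { to: rule.to, status: rule.status ?? 301 };
    }
  }
  return null;
}

// Two rules for the same path: only the first can ever be selected.
const rules: SimpleRule[] = [
  { from: '/path', to: '/first' },
  { from: '/path', to: '/second', status: 302 },
];
```

Because the loop returns on the first hit, a general rule placed above a specific one would shadow it, which is why the notes recommend listing specific rules first.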
+215
hosting-service/src/lib/redirects.test.ts
···
+import { describe, it, expect } from 'bun:test'
+import { parseRedirectsFile, matchRedirectRule } from './redirects';
+
+describe('parseRedirectsFile', () => {
+  it('should parse simple redirects', () => {
+    const content = `
+# Comment line
+/old-path /new-path
+/home / 301
+`;
+    const rules = parseRedirectsFile(content);
+    expect(rules).toHaveLength(2);
+    expect(rules[0]).toMatchObject({
+      from: '/old-path',
+      to: '/new-path',
+      status: 301,
+      force: false,
+    });
+    expect(rules[1]).toMatchObject({
+      from: '/home',
+      to: '/',
+      status: 301,
+      force: false,
+    });
+  });
+
+  it('should parse redirects with different status codes', () => {
+    const content = `
+/temp-redirect /target 302
+/rewrite /content 200
+/not-found /404 404
+`;
+    const rules = parseRedirectsFile(content);
+    expect(rules).toHaveLength(3);
+    expect(rules[0]?.status).toBe(302);
+    expect(rules[1]?.status).toBe(200);
+    expect(rules[2]?.status).toBe(404);
+  });
+
+  it('should parse force redirects', () => {
+    const content = `/force-path /target 301!`;
+    const rules = parseRedirectsFile(content);
+    expect(rules[0]?.force).toBe(true);
+    expect(rules[0]?.status).toBe(301);
+  });
+
+  it('should parse splat redirects', () => {
+    const content = `/news/* /blog/:splat`;
+    const rules = parseRedirectsFile(content);
+    expect(rules[0]?.from).toBe('/news/*');
+    expect(rules[0]?.to).toBe('/blog/:splat');
+  });
+
+  it('should parse placeholder redirects', () => {
+    const content = `/blog/:year/:month/:day /posts/:year-:month-:day`;
+    const rules = parseRedirectsFile(content);
+    expect(rules[0]?.from).toBe('/blog/:year/:month/:day');
+    expect(rules[0]?.to).toBe('/posts/:year-:month-:day');
+  });
+
+  it('should parse country-based redirects', () => {
+    const content = `/ /anz 302 Country=au,nz`;
+    const rules = parseRedirectsFile(content);
+    expect(rules[0]?.conditions?.country).toEqual(['au', 'nz']);
+  });
+
+  it('should parse language-based redirects', () => {
+    const content = `/products /en/products 301 Language=en`;
+    const rules = parseRedirectsFile(content);
+    expect(rules[0]?.conditions?.language).toEqual(['en']);
+  });
+
+  it('should parse cookie-based redirects', () => {
+    const content = `/* /legacy/:splat 200 Cookie=is_legacy,my_cookie`;
+    const rules = parseRedirectsFile(content);
+    expect(rules[0]?.conditions?.cookie).toEqual(['is_legacy', 'my_cookie']);
+  });
+});
+
+describe('matchRedirectRule', () => {
+  it('should match exact paths', () => {
+    const rules = parseRedirectsFile('/old-path /new-path');
+    const match = matchRedirectRule('/old-path', rules);
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/new-path');
+    expect(match?.status).toBe(301);
+  });
+
+  it('should match paths with trailing slash', () => {
+    const rules = parseRedirectsFile('/old-path /new-path');
+    const match = matchRedirectRule('/old-path/', rules);
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/new-path');
+  });
+
+  it('should match splat patterns', () => {
+    const rules = parseRedirectsFile('/news/* /blog/:splat');
+    const match = matchRedirectRule('/news/2024/01/15/my-post', rules);
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/blog/2024/01/15/my-post');
+  });
+
+  it('should match placeholder patterns', () => {
+    const rules = parseRedirectsFile('/blog/:year/:month/:day /posts/:year-:month-:day');
+    const match = matchRedirectRule('/blog/2024/01/15', rules);
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/posts/2024-01-15');
+  });
+
+  it('should preserve query strings for 301/302 redirects', () => {
+    const rules = parseRedirectsFile('/old /new 301');
+    const match = matchRedirectRule('/old', rules, {
+      queryParams: { foo: 'bar', baz: 'qux' },
+    });
+    expect(match?.targetPath).toContain('?');
+    expect(match?.targetPath).toContain('foo=bar');
+    expect(match?.targetPath).toContain('baz=qux');
+  });
+
+  it('should match based on query parameters', () => {
+    const rules = parseRedirectsFile('/store id=:id /blog/:id 301');
+    const match = matchRedirectRule('/store', rules, {
+      queryParams: { id: 'my-post' },
+    });
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toContain('/blog/my-post');
+  });
+
+  it('should not match when query params are missing', () => {
+    const rules = parseRedirectsFile('/store id=:id /blog/:id 301');
+    const match = matchRedirectRule('/store', rules, {
+      queryParams: {},
+    });
+    expect(match).toBeNull();
+  });
+
+  it('should match based on country header', () => {
+    const rules = parseRedirectsFile('/ /aus 302 Country=au');
+    const match = matchRedirectRule('/', rules, {
+      headers: { 'cf-ipcountry': 'AU' },
+    });
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/aus');
+  });
+
+  it('should not match wrong country', () => {
+    const rules = parseRedirectsFile('/ /aus 302 Country=au');
+    const match = matchRedirectRule('/', rules, {
+      headers: { 'cf-ipcountry': 'US' },
+    });
+    expect(match).toBeNull();
+  });
+
+  it('should match based on language header', () => {
+    const rules = parseRedirectsFile('/products /en/products 301 Language=en');
+    const match = matchRedirectRule('/products', rules, {
+      headers: { 'accept-language': 'en-US,en;q=0.9' },
+    });
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/en/products');
+  });
+
+  it('should match based on cookie presence', () => {
+    const rules = parseRedirectsFile('/* /legacy/:splat 200 Cookie=is_legacy');
+    const match = matchRedirectRule('/some-path', rules, {
+      cookies: { is_legacy: 'true' },
+    });
+    expect(match).toBeTruthy();
+    expect(match?.targetPath).toBe('/legacy/some-path');
+  });
+
+  it('should return first matching rule', () => {
+    const content = `
+/path /first
+/path /second
+`;
+    const rules = parseRedirectsFile(content);
+    const match = matchRedirectRule('/path', rules);
+    expect(match?.targetPath).toBe('/first');
+  });
+
+  it('should match more specific rules before general ones', () => {
+    const content = `
+/jobs/customer-ninja /careers/support
+/jobs/* /careers/:splat
+`;
+    const rules = parseRedirectsFile(content);
+
+    const match1 = matchRedirectRule('/jobs/customer-ninja', rules);
+    expect(match1?.targetPath).toBe('/careers/support');
+
+    const match2 = matchRedirectRule('/jobs/developer', rules);
+    expect(match2?.targetPath).toBe('/careers/developer');
+  });
+
+  it('should handle SPA routing pattern', () => {
+    const rules = parseRedirectsFile('/* /index.html 200');
+
+    // Should match any path
+    const match1 = matchRedirectRule('/about', rules);
+    expect(match1).toBeTruthy();
+    expect(match1?.targetPath).toBe('/index.html');
+    expect(match1?.status).toBe(200);
+
+    const match2 = matchRedirectRule('/users/123/profile', rules);
+    expect(match2).toBeTruthy();
+    expect(match2?.targetPath).toBe('/index.html');
+    expect(match2?.status).toBe(200);
+
+    const match3 = matchRedirectRule('/', rules);
+    expect(match3).toBeTruthy();
+    expect(match3?.targetPath).toBe('/index.html');
+  });
+});
+
+413
hosting-service/src/lib/redirects.ts
···
+import { readFile } from 'fs/promises';
+import { existsSync } from 'fs';
+
+export interface RedirectRule {
+  from: string;
+  to: string;
+  status: number;
+  force: boolean;
+  conditions?: {
+    country?: string[];
+    language?: string[];
+    role?: string[];
+    cookie?: string[];
+  };
+  // For pattern matching
+  fromPattern?: RegExp;
+  fromParams?: string[]; // Named parameters from the pattern
+  queryParams?: Record<string, string>; // Expected query parameters
+}
+
+export interface RedirectMatch {
+  rule: RedirectRule;
+  targetPath: string;
+  status: number;
+}
+
+/**
+ * Parse a _redirects file into an array of redirect rules
+ */
+export function parseRedirectsFile(content: string): RedirectRule[] {
+  const lines = content.split('\n');
+  const rules: RedirectRule[] = [];
+
+  for (let lineNum = 0; lineNum < lines.length; lineNum++) {
+    const lineRaw = lines[lineNum];
+    if (!lineRaw) continue;
+
+    const line = lineRaw.trim();
+
+    // Skip empty lines and comments
+    if (!line || line.startsWith('#')) {
+      continue;
+    }
+
+    try {
+      const rule = parseRedirectLine(line);
+      if (rule && rule.fromPattern) {
+        rules.push(rule);
+      }
+    } catch (err) {
+      console.warn(`Failed to parse redirect rule on line ${lineNum + 1}: ${line}`, err);
+    }
+  }
+
+  return rules;
+}
+
+/**
+ * Parse a single redirect rule line
+ * Format: /from [query_params] /to [status] [conditions]
+ */
+function parseRedirectLine(line: string): RedirectRule | null {
+  // Split by whitespace, but respect quoted strings (though not commonly used)
+  const parts = line.split(/\s+/);
+
+  if (parts.length < 2) {
+    return null;
+  }
+
+  let idx = 0;
+  const from = parts[idx++];
+
+  if (!from) {
+    return null;
+  }
+
+  let status = 301; // Default status
+  let force = false;
+  const conditions: NonNullable<RedirectRule['conditions']> = {};
+  const queryParams: Record<string, string> = {};
+
+  // Parse query parameters that come before the destination path
+  // They look like: key=:value (and don't start with /)
+  while (idx < parts.length) {
+    const part = parts[idx];
+    if (!part) {
+      idx++;
+      continue;
+    }
+
+    // If it starts with / or http, it's the destination path
+    if (part.startsWith('/') || part.startsWith('http://') || part.startsWith('https://')) {
+      break;
+    }
+
+    // If it contains = and comes before the destination, it's a query param
+    if (part.includes('=')) {
+      const splitIndex = part.indexOf('=');
+      const key = part.slice(0, splitIndex);
+      const value = part.slice(splitIndex + 1);
+
+      if (key && value) {
+        queryParams[key] = value;
+      }
+      idx++;
+    } else {
+      // Not a query param, must be destination or something else
+      break;
+    }
+  }
+
+  // Next part should be the destination
+  if (idx >= parts.length) {
+    return null;
+  }
+
+  const to = parts[idx++];
+  if (!to) {
+    return null;
+  }
+
+  // Parse remaining parts for status code and conditions
+  for (let i = idx; i < parts.length; i++) {
+    const part = parts[i];
+
+    if (!part) continue;
+
+    // Check for status code (with optional ! for force)
+    if (/^\d+!?$/.test(part)) {
+      if (part.endsWith('!')) {
+        force = true;
+        status = parseInt(part.slice(0, -1));
+      } else {
+        status = parseInt(part);
+      }
+      continue;
+    }
+
+    // Check for condition parameters (Country=, Language=, Role=, Cookie=)
+    if (part.includes('=')) {
+      const splitIndex = part.indexOf('=');
+      const key = part.slice(0, splitIndex);
+      const value = part.slice(splitIndex + 1);
+
+      if (!key || !value) continue;
+
+      const keyLower = key.toLowerCase();
+
+      if (keyLower === 'country') {
+        conditions.country = value.split(',').map(v => v.trim().toLowerCase());
+      } else if (keyLower === 'language') {
+        conditions.language = value.split(',').map(v => v.trim().toLowerCase());
+      } else if (keyLower === 'role') {
+        conditions.role = value.split(',').map(v => v.trim());
+      } else if (keyLower === 'cookie') {
+        conditions.cookie = value.split(',').map(v => v.trim().toLowerCase());
+      }
+    }
+  }
+
+  // Parse the 'from' pattern
+  const { pattern, params } = convertPathToRegex(from);
+
+  return {
+    from,
+    to,
+    status,
+    force,
+    conditions: Object.keys(conditions).length > 0 ? conditions : undefined,
+    queryParams: Object.keys(queryParams).length > 0 ? queryParams : undefined,
+    fromPattern: pattern,
+    fromParams: params,
+  };
+}
+
+/**
+ * Convert a path pattern with placeholders and splats to a regex
+ * Examples:
+ * /blog/:year/:month/:day -> captures year, month, day
+ * /news/* -> captures splat
+ */
+function convertPathToRegex(pattern: string): { pattern: RegExp; params: string[] } {
+  const params: string[] = [];
+  let regexStr = '^';
+
+  // Split by query string if present
+  const pathPart = pattern.split('?')[0] || pattern;
+
+  // Escape special regex characters except * and :
+  let escaped = pathPart.replace(/[.+^${}()|[\]\\]/g, '\\$&');
+
+  // Replace :param with named capture groups
+  escaped = escaped.replace(/:([a-zA-Z_][a-zA-Z0-9_]*)/g, (match, paramName) => {
+    params.push(paramName);
+    // Match path segment (everything except / and ?)
+    return '([^/?]+)';
+  });
+
+  // Replace * with splat capture (matches everything including /)
+  if (escaped.includes('*')) {
+    escaped = escaped.replace(/\*/g, '(.*)');
+    params.push('splat');
+  }
+
+  regexStr += escaped;
+
+  // Make trailing slash optional
+  if (!regexStr.endsWith('.*')) {
+    regexStr += '/?';
+  }
+
+  regexStr += '$';
+
+  return {
+    pattern: new RegExp(regexStr),
+    params,
+  };
+}
+
+/**
+ * Match a request path against redirect rules
+ */
+export function matchRedirectRule(
+  requestPath: string,
+  rules: RedirectRule[],
+  context?: {
+    queryParams?: Record<string, string>;
+    headers?: Record<string, string>;
+    cookies?: Record<string, string>;
+  }
+): RedirectMatch | null {
+  // Normalize path: ensure leading slash, remove trailing slash (except for root)
+  let normalizedPath = requestPath.startsWith('/') ? requestPath : `/${requestPath}`;
+
+  for (const rule of rules) {
+    // Check query parameter conditions first (if any)
+    if (rule.queryParams) {
+      // If rule requires query params but none provided, skip this rule
+      if (!context?.queryParams) {
+        continue;
+      }
+
+      const queryMatches = Object.entries(rule.queryParams).every(([key, value]) => {
+        const actualValue = context.queryParams?.[key];
+        return actualValue !== undefined;
+      });
+
+      if (!queryMatches) {
+        continue;
+      }
+    }
+
+    // Check conditional redirects (country, language, role, cookie)
+    if (rule.conditions) {
+      if (rule.conditions.country && context?.headers) {
+        const cfCountry = context.headers['cf-ipcountry'];
+        const xCountry = context.headers['x-country'];
+        const country = (cfCountry?.toLowerCase() || xCountry?.toLowerCase());
+        if (!country || !rule.conditions.country.includes(country)) {
+          continue;
+        }
+      }
+
+      if (rule.conditions.language && context?.headers) {
+        const acceptLang = context.headers['accept-language'];
+        if (!acceptLang) {
+          continue;
+        }
+        // Parse accept-language header (simplified)
+        const langs = acceptLang.split(',').map(l => {
+          const langPart = l.split(';')[0];
+          return langPart ? langPart.trim().toLowerCase() : '';
+        }).filter(l => l !== '');
+        const hasMatch = rule.conditions.language.some(lang =>
+          langs.some(l => l === lang || l.startsWith(lang + '-'))
+        );
+        if (!hasMatch) {
+          continue;
+        }
+      }
+
+      if (rule.conditions.cookie && context?.cookies) {
+        const hasCookie = rule.conditions.cookie.some(cookieName =>
+          context.cookies && cookieName in context.cookies
+        );
+        if (!hasCookie) {
+          continue;
+        }
+      }
+
+      // Role-based redirects would need JWT verification - skip for now
+      if (rule.conditions.role) {
+        continue;
+      }
+    }
+
+    // Match the path pattern
+    const match = rule.fromPattern?.exec(normalizedPath);
+    if (!match) {
+      continue;
+    }
+
+    // Build the target path by replacing placeholders
+    let targetPath = rule.to;
+
+    // Replace captured parameters
+    if (rule.fromParams && match.length > 1) {
+      for (let i = 0; i < rule.fromParams.length; i++) {
+        const paramName = rule.fromParams[i];
+        const paramValue = match[i + 1];
+
+        if (!paramName || !paramValue) continue;
+
+        if (paramName === 'splat') {
+          targetPath = targetPath.replace(':splat', paramValue);
+        } else {
+          targetPath = targetPath.replace(`:${paramName}`, paramValue);
+        }
+      }
+    }
+
+    // Handle query parameter replacements
+    if (rule.queryParams && context?.queryParams) {
+      for (const [key, placeholder] of Object.entries(rule.queryParams)) {
+        const actualValue = context.queryParams[key];
+        if (actualValue && placeholder && placeholder.startsWith(':')) {
+          const paramName = placeholder.slice(1);
+          if (paramName) {
+            targetPath = targetPath.replace(`:${paramName}`, actualValue);
+          }
+        }
+      }
+    }
+
+    // Preserve query string for 200, 301, 302 redirects (unless target already has one)
+    if ([200, 301, 302].includes(rule.status) && context?.queryParams && !targetPath.includes('?')) {
+      const queryString = Object.entries(context.queryParams)
+        .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
+        .join('&');
+      if (queryString) {
+        targetPath += `?${queryString}`;
+      }
+    }
+
+    return {
+      rule,
+      targetPath,
+      status: rule.status,
+    };
+  }
+
+  return null;
+}
+
+/**
+ * Load redirect rules from a cached site
+ */
+export async function loadRedirectRules(did: string, rkey: string): Promise<RedirectRule[]> {
+  const CACHE_DIR = process.env.CACHE_DIR || './cache/sites';
+  const redirectsPath = `${CACHE_DIR}/${did}/${rkey}/_redirects`;
+
+  if (!existsSync(redirectsPath)) {
+    return [];
+  }
+
+  try {
+    const content = await readFile(redirectsPath, 'utf-8');
+    return parseRedirectsFile(content);
+  } catch (err) {
+    console.error('Failed to load _redirects file', err);
+    return [];
+  }
+}
+
+/**
+ * Parse cookies from Cookie header
+ */
+export function parseCookies(cookieHeader?: string): Record<string, string> {
+  if (!cookieHeader) return {};
+
+  const cookies: Record<string, string> = {};
+  const parts = cookieHeader.split(';');
+
+  for (const part of parts) {
+    const [key, ...valueParts] = part.split('=');
+    if (key && valueParts.length > 0) {
+      cookies[key.trim()] = valueParts.join('=').trim();
+    }
+  }
+
+  return cookies;
+}
+
+/**
+ * Parse query string into object
+ */
+export function parseQueryString(url: string): Record<string, string> {
+  const queryStart = url.indexOf('?');
+  if (queryStart === -1) return {};
+
+  const queryString = url.slice(queryStart + 1);
+  const params: Record<string, string> = {};
+
+  for (const pair of queryString.split('&')) {
+    const [key, value] = pair.split('=');
+    if (key) {
+      params[decodeURIComponent(key)] = value ? decodeURIComponent(value) : '';
+    }
+  }
+
+  return params;
+}
+
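As a point of comparison for `parseQueryString`, the standard `URLSearchParams` class can do the splitting and decoding. The sketch below is one possible alternative, assuming the runtime exposes `URLSearchParams` as a global (Bun and current Node.js do). `queryToRecord` is a hypothetical name, and when a key repeats the last value wins, as in the loop above.

```typescript
// Hypothetical alternative to a hand-rolled query parser.
// URLSearchParams splits each pair on the first "=", so a value such as
// "a=b" survives intact, and it decodes "+" as a space.
function queryToRecord(url: string): Record<string, string> {
  const queryStart = url.indexOf('?');
  if (queryStart === -1) return {};

  // Drop any #fragment so it is not read as part of the last value.
  const hashStart = url.indexOf('#', queryStart);
  const query = url.slice(queryStart + 1, hashStart === -1 ? undefined : hashStart);

  const record: Record<string, string> = {};
  new URLSearchParams(query).forEach((value, key) => {
    record[key] = value; // a repeated key keeps its last value
  });
  return record;
}
```

One behavioral difference worth noting: a value that contains `=` (for example `token=a=b`) comes back whole here, whereas destructuring `pair.split('=')` into two names keeps only the part before the second `=`.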
+168 -6
hosting-service/src/server.ts
···
 import { lookup } from 'mime-types';
 import { logger, observabilityMiddleware, observabilityErrorHandler, logCollector, errorTracker, metricsCollector } from './lib/observability';
 import { fileCache, metadataCache, rewrittenHtmlCache, getCacheKey, type FileMetadata } from './lib/cache';
+import { loadRedirectRules, matchRedirectRule, parseCookies, parseQueryString, type RedirectRule } from './lib/redirects';
 
 const BASE_HOST = process.env.BASE_HOST || 'wisp.place';
 
···
   }
 }
 
+// Cache for redirect rules (per site)
+const redirectRulesCache = new Map<string, RedirectRule[]>();
+
+/**
+ * Clear redirect rules cache for a specific site
+ * Should be called when a site is updated/recached
+ */
+export function clearRedirectRulesCache(did: string, rkey: string) {
+  const cacheKey = `${did}:${rkey}`;
+  redirectRulesCache.delete(cacheKey);
+}
+
 // Helper to serve files from cache
-async function serveFromCache(did: string, rkey: string, filePath: string) {
+async function serveFromCache(
+  did: string,
+  rkey: string,
+  filePath: string,
+  fullUrl?: string,
+  headers?: Record<string, string>
+) {
+  // Check for redirect rules first
+  const redirectCacheKey = `${did}:${rkey}`;
+  let redirectRules = redirectRulesCache.get(redirectCacheKey);
+
+  if (redirectRules === undefined) {
+    // Load rules for the first time
+    redirectRules = await loadRedirectRules(did, rkey);
+    redirectRulesCache.set(redirectCacheKey, redirectRules);
+  }
+
+  // Apply redirect rules if any exist
+  if (redirectRules.length > 0) {
+    const requestPath = '/' + (filePath || '');
+    const queryParams = fullUrl ? parseQueryString(fullUrl) : {};
+    const cookies = parseCookies(headers?.['cookie']);
+
+    const redirectMatch = matchRedirectRule(requestPath, redirectRules, {
+      queryParams,
+      headers,
+      cookies,
+    });
+
+    if (redirectMatch) {
+      const { targetPath, status } = redirectMatch;
+
+      // Handle different status codes
+      if (status === 200) {
+        // Rewrite: serve different content but keep URL the same
+        // Remove leading slash for internal path resolution
+        const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
+        return serveFileInternal(did, rkey, rewritePath);
+      } else if (status === 301 || status === 302) {
+        // External redirect: change the URL
+        return new Response(null, {
+          status,
+          headers: {
+            'Location': targetPath,
+            'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0',
+          },
+        });
+      } else if (status === 404) {
+        // Custom 404 page
+        const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
+        const response = await serveFileInternal(did, rkey, custom404Path);
+        // Override status to 404
+        return new Response(response.body, {
+          status: 404,
+          headers: response.headers,
+        });
+      }
+    }
+  }
+
+  // No redirect matched, serve normally
+  return serveFileInternal(did, rkey, filePath);
+}
+
+// Internal function to serve a file (used by both normal serving and rewrites)
+async function serveFileInternal(did: string, rkey: string, filePath: string) {
   // Default to index.html if path is empty or ends with /
   let requestPath = filePath || 'index.html';
   if (requestPath.endsWith('/')) {
···
   did: string,
   rkey: string,
   filePath: string,
-  basePath: string
+  basePath: string,
+  fullUrl?: string,
+  headers?: Record<string, string>
 ) {
+  // Check for redirect rules first
+  const redirectCacheKey = `${did}:${rkey}`;
+  let redirectRules = redirectRulesCache.get(redirectCacheKey);
+
+  if (redirectRules === undefined) {
+    // Load rules for the first time
+    redirectRules = await loadRedirectRules(did, rkey);
+    redirectRulesCache.set(redirectCacheKey, redirectRules);
+  }
+
+  // Apply redirect rules if any exist
+  if (redirectRules.length > 0) {
+    const requestPath = '/' + (filePath || '');
+    const queryParams = fullUrl ? parseQueryString(fullUrl) : {};
+    const cookies = parseCookies(headers?.['cookie']);
+
+    const redirectMatch = matchRedirectRule(requestPath, redirectRules, {
+      queryParams,
+      headers,
+      cookies,
+    });
+
+    if (redirectMatch) {
+      const { targetPath, status } = redirectMatch;
+
+      // Handle different status codes
+      if (status === 200) {
+        // Rewrite: serve different content but keep URL the same
+        const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
+        return serveFileInternalWithRewrite(did, rkey, rewritePath, basePath);
+      } else if (status === 301 || status === 302) {
+        // External redirect: change the URL
+        // For sites.wisp.place, we need to adjust the target path to include the base path
+        // unless it's an absolute URL
+        let redirectTarget = targetPath;
+        if (!targetPath.startsWith('http://') && !targetPath.startsWith('https://')) {
+          redirectTarget = basePath + (targetPath.startsWith('/') ? targetPath.slice(1) : targetPath);
+        }
+        return new Response(null, {
+          status,
+          headers: {
+            'Location': redirectTarget,
+            'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0',
+          },
+        });
+      } else if (status === 404) {
+        // Custom 404 page
+        const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
+        const response = await serveFileInternalWithRewrite(did, rkey, custom404Path, basePath);
+        // Override status to 404
+        return new Response(response.body, {
+          status: 404,
+          headers: response.headers,
+        });
+      }
+    }
+  }
+
+  // No redirect matched, serve normally
+  return serveFileInternalWithRewrite(did, rkey, filePath, basePath);
+}
+
+// Internal function to serve a file with rewriting
+async function serveFileInternalWithRewrite(did: string, rkey: string, filePath: string, basePath: string) {
   // Default to index.html if path is empty or ends with /
   let requestPath = filePath || 'index.html';
   if (requestPath.endsWith('/')) {
···
 
   try {
     await downloadAndCacheSite(did, rkey, siteData.record, pdsEndpoint, siteData.cid);
+    // Clear redirect rules cache since the site was updated
+    clearRedirectRulesCache(did, rkey);
     logger.info('Site cached successfully', { did, rkey });
     return true;
   } catch (err) {
···
531 // Serve with HTML path rewriting to handle absolute paths 386 532 const basePath = `/${identifier}/${site}/`; 387 - return serveFromCacheWithRewrite(did, site, filePath, basePath); 533 + const headers: Record<string, string> = {}; 534 + c.req.raw.headers.forEach((value, key) => { 535 + headers[key.toLowerCase()] = value; 536 + }); 537 + return serveFromCacheWithRewrite(did, site, filePath, basePath, c.req.url, headers); 388 538 } 389 539 390 540 // Check if this is a DNS hash subdomain ··· 420 570 return c.text('Site not found', 404); 421 571 } 422 572 423 - return serveFromCache(customDomain.did, rkey, path); 573 + const headers: Record<string, string> = {}; 574 + c.req.raw.headers.forEach((value, key) => { 575 + headers[key.toLowerCase()] = value; 576 + }); 577 + return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 424 578 } 425 579 426 580 // Route 2: Registered subdomains - /*.wisp.place/* ··· 444 598 return c.text('Site not found', 404); 445 599 } 446 600 447 - return serveFromCache(domainInfo.did, rkey, path); 601 + const headers: Record<string, string> = {}; 602 + c.req.raw.headers.forEach((value, key) => { 603 + headers[key.toLowerCase()] = value; 604 + }); 605 + return serveFromCache(domainInfo.did, rkey, path, c.req.url, headers); 448 606 } 449 607 450 608 // Route 1: Custom domains - /* ··· 467 625 return c.text('Site not found', 404); 468 626 } 469 627 470 - return serveFromCache(customDomain.did, rkey, path); 628 + const headers: Record<string, string> = {}; 629 + c.req.raw.headers.forEach((value, key) => { 630 + headers[key.toLowerCase()] = value; 631 + }); 632 + return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 471 633 }); 472 634 473 635 // Internal observability endpoints (for admin panel)
+1
cli/.gitignore
···
+ test/
  .DS_STORE
  jacquard/
  binaries/
+3
cli/Cargo.lock
···
   "jacquard-oauth",
   "miette",
   "mime_guess",
+  "multibase",
+  "multihash",
   "reqwest",
   "rustversion",
   "serde",
   "serde_json",
+  "sha2",
   "shellexpand",
   "tokio",
   "walkdir",
+3
cli/Cargo.toml
···
  mime_guess = "2.0"
  bytes = "1.10"
  futures = "0.3.31"
+ multihash = "0.19.3"
+ multibase = "0.9"
+ sha2 = "0.10"
+92
cli/src/blob_map.rs
···
+ use jacquard_common::types::blob::BlobRef;
+ use jacquard_common::IntoStatic;
+ use std::collections::HashMap;
+
+ use crate::place_wisp::fs::{Directory, EntryNode};
+
+ /// Extract blob information from a directory tree
+ /// Returns a map of file paths to their blob refs and CIDs
+ ///
+ /// This mirrors the TypeScript implementation in src/lib/wisp-utils.ts lines 275-302
+ pub fn extract_blob_map(
+     directory: &Directory,
+ ) -> HashMap<String, (BlobRef<'static>, String)> {
+     extract_blob_map_recursive(directory, String::new())
+ }
+
+ fn extract_blob_map_recursive(
+     directory: &Directory,
+     current_path: String,
+ ) -> HashMap<String, (BlobRef<'static>, String)> {
+     let mut blob_map = HashMap::new();
+
+     for entry in &directory.entries {
+         let full_path = if current_path.is_empty() {
+             entry.name.to_string()
+         } else {
+             format!("{}/{}", current_path, entry.name)
+         };
+
+         match &entry.node {
+             EntryNode::File(file_node) => {
+                 // Extract CID from blob ref
+                 // BlobRef is an enum with Blob variant, which has a ref field (CidLink)
+                 let blob_ref = &file_node.blob;
+                 let cid_string = blob_ref.blob().r#ref.to_string();
+
+                 // Store both normalized and full paths
+                 // Normalize by removing base folder prefix (e.g., "cobblemon/index.html" -> "index.html")
+                 let normalized_path = normalize_path(&full_path);
+
+                 blob_map.insert(
+                     normalized_path.clone(),
+                     (blob_ref.clone().into_static(), cid_string.clone())
+                 );
+
+                 // Also store the full path for matching
+                 if normalized_path != full_path {
+                     blob_map.insert(
+                         full_path,
+                         (blob_ref.clone().into_static(), cid_string)
+                     );
+                 }
+             }
+             EntryNode::Directory(subdir) => {
+                 let sub_map = extract_blob_map_recursive(subdir, full_path);
+                 blob_map.extend(sub_map);
+             }
+             EntryNode::Unknown(_) => {
+                 // Skip unknown node types
+             }
+         }
+     }
+
+     blob_map
+ }
+
+ /// Normalize file path by removing base folder prefix
+ /// Example: "cobblemon/index.html" -> "index.html"
+ ///
+ /// Mirrors TypeScript implementation at src/routes/wisp.ts line 291
+ pub fn normalize_path(path: &str) -> String {
+     // Remove base folder prefix (everything before first /)
+     if let Some(idx) = path.find('/') {
+         path[idx + 1..].to_string()
+     } else {
+         path.to_string()
+     }
+ }
+
+ #[cfg(test)]
+ mod tests {
+     use super::*;
+
+     #[test]
+     fn test_normalize_path() {
+         assert_eq!(normalize_path("index.html"), "index.html");
+         assert_eq!(normalize_path("cobblemon/index.html"), "index.html");
+         assert_eq!(normalize_path("folder/subfolder/file.txt"), "subfolder/file.txt");
+         assert_eq!(normalize_path("a/b/c/d.txt"), "b/c/d.txt");
+     }
+ }
+
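As a reading aid for the lookup keys this file produces, here is a minimal std-only sketch. It is not part of the PR: plain `String` CIDs stand in for the real `(BlobRef, String)` values, and `insert_both` is a hypothetical helper that imitates what the recursive extraction does for each file node. It suggests that every file is reachable under its full path and under its path minus the first segment, and that two files in different top-level folders can share a normalized key.

```rust
use std::collections::HashMap;

/// Same logic as `normalize_path` in the diff: drop everything up to the first '/'.
fn normalize_path(path: &str) -> String {
    match path.find('/') {
        Some(idx) => path[idx + 1..].to_string(),
        None => path.to_string(),
    }
}

/// Hypothetical helper: store a CID under the normalized key and, when it
/// differs, under the full path as well.
fn insert_both(map: &mut HashMap<String, String>, full_path: &str, cid: &str) {
    let normalized = normalize_path(full_path);
    map.insert(normalized.clone(), cid.to_string());
    if normalized != full_path {
        map.insert(full_path.to_string(), cid.to_string());
    }
}

fn main() {
    let mut map = HashMap::new();
    insert_both(&mut map, "index.html", "cid-a");
    insert_both(&mut map, "css/site.css", "cid-b");
    insert_both(&mut map, "js/site.css", "cid-c"); // same normalized key as the line above

    assert_eq!(map.get("index.html").map(String::as_str), Some("cid-a"));
    assert_eq!(map.get("css/site.css").map(String::as_str), Some("cid-b"));
    // In this sketch the later insert wins on the shared normalized key.
    assert_eq!(map.get("site.css").map(String::as_str), Some("cid-c"));
    println!("{} keys", map.len());
}
```

A collision on the normalized key should only cost a missed reuse, because the CLI still compares CIDs before reusing a blob.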
+66
cli/src/cid.rs
···
+ use jacquard_common::types::cid::IpldCid;
+ use sha2::{Digest, Sha256};
+
+ /// Compute CID (Content Identifier) for blob content
+ /// Uses the same algorithm as AT Protocol: CIDv1 with raw codec (0x55) and SHA-256
+ ///
+ /// CRITICAL: This must be called on BASE64-ENCODED GZIPPED content, not just gzipped content
+ ///
+ /// Based on @atproto/common/src/ipld.ts sha256RawToCid implementation
+ pub fn compute_cid(content: &[u8]) -> String {
+     // Compute the SHA-256 digest (same hash function AT Protocol uses)
+     let hash = Sha256::digest(content);
+
+     // Create multihash (code 0x12 = sha2-256)
+     let multihash = multihash::Multihash::wrap(0x12, &hash)
+         .expect("SHA-256 hash should always fit in multihash");
+
+     // Create CIDv1 with raw codec (0x55)
+     let cid = IpldCid::new_v1(0x55, multihash);
+
+     // Convert to base32 string representation
+     cid.to_string_of_base(multibase::Base::Base32Lower)
+         .unwrap_or_else(|_| cid.to_string())
+ }
+
+ #[cfg(test)]
+ mod tests {
+     use super::*;
+     use base64::Engine;
+
+     #[test]
+     fn test_compute_cid() {
+         // Test with a simple string: "hello"
+         let content = b"hello";
+         let cid = compute_cid(content);
+
+         // CID should start with 'baf' for raw codec base32
+         assert!(cid.starts_with("baf"));
+     }
+
+     #[test]
+     fn test_compute_cid_base64_encoded() {
+         // Simulate the actual use case: gzipped then base64 encoded
+         use flate2::write::GzEncoder;
+         use flate2::Compression;
+         use std::io::Write;
+
+         let original = b"hello world";
+
+         // Gzip compress
+         let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
+         encoder.write_all(original).unwrap();
+         let gzipped = encoder.finish().unwrap();
+
+         // Base64 encode the gzipped data
+         let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes();
+
+         // Compute CID on the base64 bytes
+         let cid = compute_cid(&base64_bytes);
+
+         // Should be a valid CID
+         assert!(cid.starts_with("baf"));
+         assert!(cid.len() > 10);
+     }
+ }
+
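To make the `starts_with("baf")` assertions less opaque, the sketch below spells out the byte layout behind a CIDv1 with raw codec and SHA-256, using only the standard library. It is an illustration, not the PR's code path: `base32_lower` and `cid_from_sha256_digest` are hypothetical names, and the digest is a placeholder rather than a real hash. The four header bytes (version, codec, hash code, digest length) fix the first characters of the base32 string.

```rust
/// RFC 4648 base32 alphabet in lowercase, as used with the multibase prefix 'b'.
const ALPHABET: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";

/// Encode bytes as unpadded lowercase base32.
fn base32_lower(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut buffer: u32 = 0; // bit accumulator; only the low `bits` bits are meaningful
    let mut bits: u32 = 0;
    for &byte in bytes {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
        }
    }
    if bits > 0 {
        // Left-align the leftover bits in a final 5-bit group.
        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Hypothetical helper: string form of a CIDv1 (raw codec, sha2-256)
/// built from an already computed 32-byte digest.
fn cid_from_sha256_digest(digest: &[u8; 32]) -> String {
    let mut bytes = Vec::with_capacity(36);
    bytes.push(0x01); // CID version 1
    bytes.push(0x55); // multicodec: raw
    bytes.push(0x12); // multihash code: sha2-256
    bytes.push(0x20); // digest length: 32 bytes
    bytes.extend_from_slice(digest);
    format!("b{}", base32_lower(&bytes)) // 'b' marks base32 lowercase
}

fn main() {
    // Placeholder digest; a real one would come from hashing the blob bytes.
    let cid = cid_from_sha256_digest(&[0u8; 32]);
    assert!(cid.starts_with("bafkrei"));
    assert_eq!(cid.len(), 59);
    println!("{cid}");
}
```

Under these assumptions the tests could check for the longer `bafkrei` prefix and a length of 59, which would be a tighter check than `baf` and `len() > 10`.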
+121 -38
cli/src/main.rs
···
  mod builder_types;
  mod place_wisp;
+ mod cid;
+ mod blob_map;

  use clap::Parser;
  use jacquard::CowStr;
- use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession};
+ use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession, AgentSession};
  use jacquard::oauth::client::OAuthClient;
  use jacquard::oauth::loopback::LoopbackConfig;
  use jacquard::prelude::IdentityResolver;
···
  use jacquard_common::types::blob::MimeType;
  use miette::IntoDiagnostic;
  use std::path::{Path, PathBuf};
+ use std::collections::HashMap;
  use flate2::Compression;
  use flate2::write::GzEncoder;
  use std::io::Write;
···

      println!("Deploying site '{}'...", site_name);

-     // Build directory tree
-     let root_dir = build_directory(agent, &path).await?;
+     // Try to fetch existing manifest for incremental updates
+     let existing_blob_map: HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)> = {
+         use jacquard_common::types::string::AtUri;
+
+         // Get the DID for this session
+         let session_info = agent.session_info().await;
+         if let Some((did, _)) = session_info {
+             // Construct the AT URI for the record
+             let uri_string = format!("at://{}/place.wisp.fs/{}", did, site_name);
+             if let Ok(uri) = AtUri::new(&uri_string) {
+                 match agent.get_record::<Fs>(&uri).await {
+                     Ok(response) => {
+                         match response.into_output() {
+                             Ok(record_output) => {
+                                 let existing_manifest = record_output.value;
+                                 let blob_map = blob_map::extract_blob_map(&existing_manifest.root);
+                                 println!("Found existing manifest with {} files, checking for changes...", blob_map.len());
+                                 blob_map
+                             }
+                             Err(_) => {
+                                 println!("No existing manifest found, uploading all files...");
+                                 HashMap::new()
+                             }
+                         }
+                     }
+                     Err(_) => {
+                         // Record doesn't exist yet - this is a new site
+                         println!("No existing manifest found, uploading all files...");
+                         HashMap::new()
+                     }
+                 }
+             } else {
+                 println!("No existing manifest found (invalid URI), uploading all files...");
+                 HashMap::new()
+             }
+         } else {
+             println!("No existing manifest found (could not get DID), uploading all files...");
+             HashMap::new()
+         }
+     };

-     // Count total files
-     let file_count = count_files(&root_dir);
+     // Build directory tree
+     let (root_dir, total_files, reused_count) = build_directory(agent, &path, &existing_blob_map).await?;
+     let uploaded_count = total_files - reused_count;

      // Create the Fs record
      let fs_record = Fs::new()
          .site(CowStr::from(site_name.clone()))
          .root(root_dir)
-         .file_count(file_count as i64)
+         .file_count(total_files as i64)
          .created_at(Datetime::now())
          .build();

···
          .and_then(|s| s.split('/').next())
          .ok_or_else(|| miette::miette!("Failed to parse DID from URI"))?;

-     println!("Deployed site '{}': {}", site_name, output.uri);
-     println!("Available at: https://sites.wisp.place/{}/{}", did, site_name);
+     println!("\n✓ Deployed site '{}': {}", site_name, output.uri);
+     println!(" Total files: {} ({} reused, {} uploaded)", total_files, reused_count, uploaded_count);
+     println!(" Available at: https://sites.wisp.place/{}/{}", did, site_name);

      Ok(())
  }
···
  fn build_directory<'a>(
      agent: &'a Agent<impl jacquard::client::AgentSession + IdentityResolver + 'a>,
      dir_path: &'a Path,
- ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<Directory<'static>>> + 'a>>
+     existing_blobs: &'a HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>,
+ ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<(Directory<'static>, usize, usize)>> + 'a>>
  {
      Box::pin(async move {
          // Collect all directory entries first
···
          }

          // Process files concurrently with a limit of 5
-         let file_entries: Vec<Entry> = stream::iter(file_tasks)
+         let file_results: Vec<(Entry<'static>, bool)> = stream::iter(file_tasks)
              .map(|(name, path)| async move {
-                 let file_node = process_file(agent, &path).await?;
-                 Ok::<_, miette::Report>(Entry::new()
+                 let (file_node, reused) = process_file(agent, &path, &name, existing_blobs).await?;
+                 let entry = Entry::new()
                      .name(CowStr::from(name))
                      .node(EntryNode::File(Box::new(file_node)))
-                     .build())
+                     .build();
+                 Ok::<_, miette::Report>((entry, reused))
              })
              .buffer_unordered(5)
              .collect::<Vec<_>>()
              .await
              .into_iter()
              .collect::<miette::Result<Vec<_>>>()?;
+
+         let mut file_entries = Vec::new();
+         let mut reused_count = 0;
+         let mut total_files = 0;
+
+         for (entry, reused) in file_results {
+             file_entries.push(entry);
+             total_files += 1;
+             if reused {
+                 reused_count += 1;
+             }
+         }

          // Process directories recursively (sequentially to avoid too much nesting)
          let mut dir_entries = Vec::new();
          for (name, path) in dir_tasks {
-             let subdir = build_directory(agent, &path).await?;
+             let (subdir, sub_total, sub_reused) = build_directory(agent, &path, existing_blobs).await?;
              dir_entries.push(Entry::new()
                  .name(CowStr::from(name))
                  .node(EntryNode::Directory(Box::new(subdir)))
                  .build());
+             total_files += sub_total;
+             reused_count += sub_reused;
          }

          // Combine file and directory entries
          let mut entries = file_entries;
          entries.extend(dir_entries);

-         Ok(Directory::new()
+         let directory = Directory::new()
              .r#type(CowStr::from("directory"))
              .entries(entries)
-             .build())
+             .build();
+
+         Ok((directory, total_files, reused_count))
      })
  }

- /// Process a single file: gzip -> base64 -> upload blob
+ /// Process a single file: gzip -> base64 -> upload blob (or reuse existing)
+ /// Returns (File, reused: bool)
  async fn process_file(
      agent: &Agent<impl jacquard::client::AgentSession + IdentityResolver>,
      file_path: &Path,
- ) -> miette::Result<File<'static>>
+     file_name: &str,
+     existing_blobs: &HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>,
+ ) -> miette::Result<(File<'static>, bool)>
  {
      // Read file
      let file_data = std::fs::read(file_path).into_diagnostic()?;
···
      // Base64 encode the gzipped data
      let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes();

-     // Upload blob as octet-stream
+     // Compute CID for this file (CRITICAL: on base64-encoded gzipped content)
+     let file_cid = cid::compute_cid(&base64_bytes);
+
+     // Normalize the file path for comparison
+     let normalized_path = blob_map::normalize_path(file_name);
+
+     // Check if we have an existing blob with the same CID
+     let existing_blob = existing_blobs.get(&normalized_path)
+         .or_else(|| existing_blobs.get(file_name));
+
+     if let Some((existing_blob_ref, existing_cid)) = existing_blob {
+         if existing_cid == &file_cid {
+             // CIDs match - reuse existing blob
+             println!(" ✓ Reusing blob for {} (CID: {})", file_name, file_cid);
+             return Ok((
+                 File::new()
+                     .r#type(CowStr::from("file"))
+                     .blob(existing_blob_ref.clone())
+                     .encoding(CowStr::from("gzip"))
+                     .mime_type(CowStr::from(original_mime))
+                     .base64(true)
+                     .build(),
+                 true
+             ));
+         }
+     }
+
+     // File is new or changed - upload it
+     println!(" ↑ Uploading {} ({} bytes, CID: {})", file_name, base64_bytes.len(), file_cid);
      let blob = agent.upload_blob(
          base64_bytes,
          MimeType::new_static("application/octet-stream"),
      ).await?;

-     Ok(File::new()
-         .r#type(CowStr::from("file"))
-         .blob(blob)
-         .encoding(CowStr::from("gzip"))
-         .mime_type(CowStr::from(original_mime))
-         .base64(true)
-         .build())
+     Ok((
+         File::new()
+             .r#type(CowStr::from("file"))
+             .blob(blob)
+             .encoding(CowStr::from("gzip"))
+             .mime_type(CowStr::from(original_mime))
+             .base64(true)
+             .build(),
+         false
+     ))
  }

- /// Count total files in a directory tree
- fn count_files(dir: &Directory) -> usize {
-     let mut count = 0;
-     for entry in &dir.entries {
-         match &entry.node {
-             EntryNode::File(_) => count += 1,
-             EntryNode::Directory(subdir) => count += count_files(subdir),
-             _ => {} // Unknown variants
-         }
-     }
-     count
- }
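The reuse-or-upload decision in `process_file` can be reduced to a small pure function, sketched here with the standard library only. The `Action` enum, `decide`, and the string CIDs are illustrative stand-ins rather than names from the PR, and the map value is simplified from `(BlobRef, String)` to the CID alone.

```rust
use std::collections::HashMap;

/// Outcome of comparing a local file against the previous manifest.
#[derive(Debug, PartialEq)]
enum Action {
    Reuse,
    Upload,
}

/// Reuse only when the key exists and the stored CID equals the locally
/// computed one; a missing key or a changed CID means upload.
fn decide(existing: &HashMap<String, String>, key: &str, local_cid: &str) -> Action {
    match existing.get(key) {
        Some(old_cid) if old_cid.as_str() == local_cid => Action::Reuse,
        _ => Action::Upload,
    }
}

fn main() {
    let mut existing = HashMap::new();
    existing.insert("style.css".to_string(), "cid-1".to_string());

    // (lookup key, locally computed CID) for three hypothetical files
    let files = [("style.css", "cid-1"), ("app.js", "cid-2"), ("index.html", "cid-3")];
    let total = files.len();
    let reused = files
        .iter()
        .filter(|(key, cid)| decide(&existing, key, cid) == Action::Reuse)
        .count();

    assert_eq!(reused, 1);
    assert_eq!(total - reused, 2); // uploaded = total - reused, as in the diff
    println!("{total} files ({reused} reused, {} uploaded)", total - reused);
}
```

The code that builds `file_tasks` is elided in this diff, so it isn't visible whether `name` is a bare file name or a relative path. If it is a bare name, a file nested more than one directory deep would likely miss both map keys and be re-uploaded; that would cost bandwidth but not correctness, since reuse still requires equal CIDs.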