interdiff of round #1 and #0

rudimentary _redirects support, incremental uploading for cli #3

closed
opened by nekomimi.pet targeting main

TODO: `_headers` file; `place.wisp.settings` lexicon as a lexicon-based way of configuring this

REVERTED
README.md
···
  cargo build
  ```

- ## Features
-
- ### URL Redirects and Rewrites
-
- The hosting service supports Netlify-style `_redirects` files for managing URLs. Place a `_redirects` file in your site root to enable:
-
- - **301/302 Redirects**: Permanent and temporary URL redirects
- - **200 Rewrites**: Serve different content without changing the URL
- - **404 Custom Pages**: Custom error pages for specific paths
- - **Splats & Placeholders**: Dynamic path matching (`/blog/:year/:month/:day`, `/news/*`)
- - **Query Parameter Matching**: Redirect based on URL parameters
- - **Conditional Redirects**: Route by country, language, or cookie presence
- - **Force Redirects**: Override existing files with redirects
-
- Example `_redirects`:
- ```
- # Single-page app routing (React, Vue, etc.)
- /* /index.html 200
-
- # Simple redirects
- /home /
- /old-blog/* /blog/:splat
-
- # API proxy
- /api/* https://api.example.com/:splat 200
-
- # Country-based routing
- / /us/ 302 Country=us
- / /uk/ 302 Country=gb
- ```
-
-
  ## Limits

  - Max file size: 100MB (PDS limit)
+ - Max site size: 300MB
  - Max files: 2000

  ## Tech Stack
REVERTED
hosting-service/EXAMPLE.md
···
+ # HTML Path Rewriting Example
+
+ This document demonstrates how HTML path rewriting works when serving sites via the `/s/:identifier/:site/*` route.
+
+ ## Problem
+
+ When you create a static site with absolute paths like `/style.css` or `/images/logo.png`, these paths work fine when served from the root domain. However, when served from a subdirectory like `/s/alice.bsky.social/mysite/`, these absolute paths break because they resolve to the server root instead of the site root.
+
+ ## Solution
+
+ The hosting service automatically rewrites absolute paths in HTML files to work correctly in the subdirectory context.
+
+ ## Example
+
+ **Original HTML file (index.html):**
+ ```html
+ <!DOCTYPE html>
+ <html>
+ <head>
+   <meta charset="UTF-8">
+   <title>My Site</title>
+   <link rel="stylesheet" href="/style.css">
+   <link rel="icon" href="/favicon.ico">
+   <script src="/app.js"></script>
+ </head>
+ <body>
+   <header>
+     <img src="/images/logo.png" alt="Logo">
+     <nav>
+       <a href="/">Home</a>
+       <a href="/about">About</a>
+       <a href="/contact">Contact</a>
+     </nav>
+   </header>
+
+   <main>
+     <h1>Welcome</h1>
+     <img src="/images/hero.jpg"
+          srcset="/images/hero.jpg 1x, /images/hero@2x.jpg 2x"
+          alt="Hero">
+
+     <form action="/submit" method="post">
+       <input type="text" name="email">
+       <button>Submit</button>
+     </form>
+   </main>
+
+   <footer>
+     <a href="https://example.com">External Link</a>
+     <a href="#top">Back to Top</a>
+   </footer>
+ </body>
+ </html>
+ ```
+
+ **When accessed via `/s/alice.bsky.social/mysite/`, the HTML is rewritten to:**
+ ```html
+ <!DOCTYPE html>
+ <html>
+ <head>
+   <meta charset="UTF-8">
+   <title>My Site</title>
+   <link rel="stylesheet" href="/s/alice.bsky.social/mysite/style.css">
+   <link rel="icon" href="/s/alice.bsky.social/mysite/favicon.ico">
+   <script src="/s/alice.bsky.social/mysite/app.js"></script>
+ </head>
+ <body>
+   <header>
+     <img src="/s/alice.bsky.social/mysite/images/logo.png" alt="Logo">
+     <nav>
+       <a href="/s/alice.bsky.social/mysite/">Home</a>
+       <a href="/s/alice.bsky.social/mysite/about">About</a>
+       <a href="/s/alice.bsky.social/mysite/contact">Contact</a>
+     </nav>
+   </header>
+
+   <main>
+     <h1>Welcome</h1>
+     <img src="/s/alice.bsky.social/mysite/images/hero.jpg"
+          srcset="/s/alice.bsky.social/mysite/images/hero.jpg 1x, /s/alice.bsky.social/mysite/images/hero@2x.jpg 2x"
+          alt="Hero">
+
+     <form action="/s/alice.bsky.social/mysite/submit" method="post">
+       <input type="text" name="email">
+       <button>Submit</button>
+     </form>
+   </main>
+
+   <footer>
+     <a href="https://example.com">External Link</a>
+     <a href="#top">Back to Top</a>
+   </footer>
+ </body>
+ </html>
+ ```
+
+ ## What's Preserved
+
+ Notice that:
+ - ✅ Absolute paths are rewritten: `/style.css` → `/s/alice.bsky.social/mysite/style.css`
+ - ✅ External URLs are preserved: `https://example.com` stays the same
+ - ✅ Anchors are preserved: `#top` stays the same
+ - ✅ The rewriting is safe and won't break your site
+
+ ## Supported Attributes
+
+ The rewriter handles these HTML attributes:
+ - `src` - images, scripts, iframes, videos, audio
+ - `href` - links, stylesheets
+ - `action` - forms
+ - `data` - objects
+ - `poster` - video posters
+ - `srcset` - responsive images
+
+ ## Testing Your Site
+
+ To test if your site works with path rewriting:
+
+ 1. Upload your site to your PDS as a `place.wisp.fs` record
+ 2. Access it via: `https://hosting.wisp.place/s/YOUR_HANDLE/SITE_NAME/`
+ 3. Check that all resources load correctly
+
+ If you're using relative paths already (like `./style.css` or `../images/logo.png`), they'll work without any rewriting.
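The EXAMPLE.md above describes what the rewriter does, but this interdiff does not include the rewriter itself. As a rough illustration only, here is a minimal sketch of the idea, assuming a regex-based pass over the listed attributes (the actual hosting service may well use a real HTML parser). The names `rewriteHtmlPaths`, `rewritePath`, and `rewriteSrcset` are hypothetical and not taken from the repository.

```typescript
// Hypothetical sketch, NOT the hosting service's actual implementation.
// Prefixes root-relative URLs ("/x") in common HTML attributes with a base path.

// Rewrite a single URL value. Only root-relative paths are touched;
// absolute URLs, protocol-relative URLs ("//cdn..."), anchors and
// relative paths are returned unchanged.
function rewritePath(value: string, basePath: string): string {
  if (!value.startsWith('/') || value.startsWith('//')) return value;
  // Drop a trailing slash on the base so "/base/" + "/x" becomes "/base/x".
  return basePath.replace(/\/$/, '') + value;
}

// srcset holds comma-separated "url descriptor" candidates.
// Assumption: URLs themselves contain no commas.
function rewriteSrcset(value: string, basePath: string): string {
  return value
    .split(',')
    .map((candidate) => {
      const [url = '', ...descriptor] = candidate.trim().split(/\s+/);
      return [rewritePath(url, basePath), ...descriptor].join(' ');
    })
    .join(', ');
}

function rewriteHtmlPaths(html: string, basePath: string): string {
  // Matches quoted values of the attributes listed in EXAMPLE.md.
  // The leading whitespace keeps attributes such as data-foo from matching "data".
  const attrPattern = /(\s)(src|href|action|data|poster|srcset)=(["'])(.*?)\3/gi;
  return html.replace(
    attrPattern,
    (_match: string, ws: string, attr: string, quote: string, value: string) => {
      const rewritten =
        attr.toLowerCase() === 'srcset'
          ? rewriteSrcset(value, basePath)
          : rewritePath(value, basePath);
      return `${ws}${attr}=${quote}${rewritten}${quote}`;
    }
  );
}
```

A call such as `rewriteHtmlPaths(html, '/s/alice.bsky.social/mysite/')` would leave `https://example.com` and `#top` untouched while prefixing `/style.css`. A regex pass like this ignores unquoted attribute values, inline scripts, and CSS `url(...)` references, so it is only suitable as an illustration of the transformation.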
REVERTED
hosting-service/example-_redirects
···
- # Example _redirects file for Wisp hosting
- # Place this file in the root directory of your site as "_redirects"
- # Lines starting with # are comments
-
- # ===================================
- # SIMPLE REDIRECTS
- # ===================================
-
- # Redirect home page
- # /home /
-
- # Redirect old URLs to new ones
- # /old-blog /blog
- # /about-us /about
-
- # ===================================
- # SPLAT REDIRECTS (WILDCARDS)
- # ===================================
-
- # Redirect entire directories
- # /news/* /blog/:splat
- # /old-site/* /new-site/:splat
-
- # ===================================
- # PLACEHOLDER REDIRECTS
- # ===================================
-
- # Restructure blog URLs
- # /blog/:year/:month/:day/:slug /posts/:year-:month-:day/:slug
-
- # Capture multiple parameters
- # /products/:category/:id /shop/:category/item/:id
-
- # ===================================
- # STATUS CODES
- # ===================================
-
- # Permanent redirect (301) - default if not specified
- # /permanent-move /new-location 301
-
- # Temporary redirect (302)
- # /temp-redirect /temp-location 302
-
- # Rewrite (200) - serves different content, URL stays the same
- # /api/* /functions/:splat 200
-
- # Custom 404 page
- # /shop/* /shop-closed.html 404
-
- # ===================================
- # FORCE REDIRECTS
- # ===================================
-
- # Force redirect even if file exists (note the ! after status code)
- # /override-file /other-file.html 200!
-
- # ===================================
- # CONDITIONAL REDIRECTS
- # ===================================
-
- # Country-based redirects (ISO 3166-1 alpha-2 codes)
- # / /us/ 302 Country=us
- # / /uk/ 302 Country=gb
- # / /anz/ 302 Country=au,nz
-
- # Language-based redirects
- # /products /en/products 301 Language=en
- # /products /de/products 301 Language=de
- # /products /fr/products 301 Language=fr
-
- # Cookie-based redirects (checks if cookie exists)
- # /* /legacy/:splat 200 Cookie=is_legacy
-
- # ===================================
- # QUERY PARAMETERS
- # ===================================
-
- # Match specific query parameters
- # /store id=:id /blog/:id 301
-
- # Multiple parameters
- # /search q=:query category=:cat /find/:cat/:query 301
-
- # ===================================
- # DOMAIN-LEVEL REDIRECTS
- # ===================================
-
- # Redirect to different domain (must include protocol)
- # /external https://example.com/path
-
- # Redirect entire subdomain
- # http://blog.example.com/* https://example.com/blog/:splat 301!
- # https://blog.example.com/* https://example.com/blog/:splat 301!
-
- # ===================================
- # COMMON PATTERNS
- # ===================================
-
- # Remove .html extensions
- # /page.html /page
-
- # Add trailing slash
- # /about /about/
-
- # Single-page app fallback (serve index.html for all paths)
- # /* /index.html 200
-
- # API proxy
- # /api/* https://api.example.com/:splat 200
-
- # ===================================
- # CUSTOM ERROR PAGES
- # ===================================
-
- # Language-specific 404 pages
- # /en/* /en/404.html 404
- # /de/* /de/404.html 404
-
- # Section-specific 404 pages
- # /shop/* /shop/not-found.html 404
- # /blog/* /blog/404.html 404
-
- # ===================================
- # NOTES
- # ===================================
- #
- # - Rules are processed in order (first match wins)
- # - More specific rules should come before general ones
- # - Splats (*) can only be used at the end of a path
- # - Query parameters are automatically preserved for 200, 301, 302
- # - Trailing slashes are normalized (/ and no / are treated the same)
- # - Default status code is 301 if not specified
- #
-
REVERTED
hosting-service/src/lib/redirects.test.ts
···
- import { describe, it, expect } from 'bun:test'
- import { parseRedirectsFile, matchRedirectRule } from './redirects';
-
- describe('parseRedirectsFile', () => {
-   it('should parse simple redirects', () => {
-     const content = `
- # Comment line
- /old-path /new-path
- /home / 301
- `;
-     const rules = parseRedirectsFile(content);
-     expect(rules).toHaveLength(2);
-     expect(rules[0]).toMatchObject({
-       from: '/old-path',
-       to: '/new-path',
-       status: 301,
-       force: false,
-     });
-     expect(rules[1]).toMatchObject({
-       from: '/home',
-       to: '/',
-       status: 301,
-       force: false,
-     });
-   });
-
-   it('should parse redirects with different status codes', () => {
-     const content = `
- /temp-redirect /target 302
- /rewrite /content 200
- /not-found /404 404
- `;
-     const rules = parseRedirectsFile(content);
-     expect(rules).toHaveLength(3);
-     expect(rules[0]?.status).toBe(302);
-     expect(rules[1]?.status).toBe(200);
-     expect(rules[2]?.status).toBe(404);
-   });
-
-   it('should parse force redirects', () => {
-     const content = `/force-path /target 301!`;
-     const rules = parseRedirectsFile(content);
-     expect(rules[0]?.force).toBe(true);
-     expect(rules[0]?.status).toBe(301);
-   });
-
-   it('should parse splat redirects', () => {
-     const content = `/news/* /blog/:splat`;
-     const rules = parseRedirectsFile(content);
-     expect(rules[0]?.from).toBe('/news/*');
-     expect(rules[0]?.to).toBe('/blog/:splat');
-   });
-
-   it('should parse placeholder redirects', () => {
-     const content = `/blog/:year/:month/:day /posts/:year-:month-:day`;
-     const rules = parseRedirectsFile(content);
-     expect(rules[0]?.from).toBe('/blog/:year/:month/:day');
-     expect(rules[0]?.to).toBe('/posts/:year-:month-:day');
-   });
-
-   it('should parse country-based redirects', () => {
-     const content = `/ /anz 302 Country=au,nz`;
-     const rules = parseRedirectsFile(content);
-     expect(rules[0]?.conditions?.country).toEqual(['au', 'nz']);
-   });
-
-   it('should parse language-based redirects', () => {
-     const content = `/products /en/products 301 Language=en`;
-     const rules = parseRedirectsFile(content);
-     expect(rules[0]?.conditions?.language).toEqual(['en']);
-   });
-
-   it('should parse cookie-based redirects', () => {
-     const content = `/* /legacy/:splat 200 Cookie=is_legacy,my_cookie`;
-     const rules = parseRedirectsFile(content);
-     expect(rules[0]?.conditions?.cookie).toEqual(['is_legacy', 'my_cookie']);
-   });
- });
-
- describe('matchRedirectRule', () => {
-   it('should match exact paths', () => {
-     const rules = parseRedirectsFile('/old-path /new-path');
-     const match = matchRedirectRule('/old-path', rules);
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/new-path');
-     expect(match?.status).toBe(301);
-   });
-
-   it('should match paths with trailing slash', () => {
-     const rules = parseRedirectsFile('/old-path /new-path');
-     const match = matchRedirectRule('/old-path/', rules);
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/new-path');
-   });
-
-   it('should match splat patterns', () => {
-     const rules = parseRedirectsFile('/news/* /blog/:splat');
-     const match = matchRedirectRule('/news/2024/01/15/my-post', rules);
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/blog/2024/01/15/my-post');
-   });
-
-   it('should match placeholder patterns', () => {
-     const rules = parseRedirectsFile('/blog/:year/:month/:day /posts/:year-:month-:day');
-     const match = matchRedirectRule('/blog/2024/01/15', rules);
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/posts/2024-01-15');
-   });
-
-   it('should preserve query strings for 301/302 redirects', () => {
-     const rules = parseRedirectsFile('/old /new 301');
-     const match = matchRedirectRule('/old', rules, {
-       queryParams: { foo: 'bar', baz: 'qux' },
-     });
-     expect(match?.targetPath).toContain('?');
-     expect(match?.targetPath).toContain('foo=bar');
-     expect(match?.targetPath).toContain('baz=qux');
-   });
-
-   it('should match based on query parameters', () => {
-     const rules = parseRedirectsFile('/store id=:id /blog/:id 301');
-     const match = matchRedirectRule('/store', rules, {
-       queryParams: { id: 'my-post' },
-     });
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toContain('/blog/my-post');
-   });
-
-   it('should not match when query params are missing', () => {
-     const rules = parseRedirectsFile('/store id=:id /blog/:id 301');
-     const match = matchRedirectRule('/store', rules, {
-       queryParams: {},
-     });
-     expect(match).toBeNull();
-   });
-
-   it('should match based on country header', () => {
-     const rules = parseRedirectsFile('/ /aus 302 Country=au');
-     const match = matchRedirectRule('/', rules, {
-       headers: { 'cf-ipcountry': 'AU' },
-     });
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/aus');
-   });
-
-   it('should not match wrong country', () => {
-     const rules = parseRedirectsFile('/ /aus 302 Country=au');
-     const match = matchRedirectRule('/', rules, {
-       headers: { 'cf-ipcountry': 'US' },
-     });
-     expect(match).toBeNull();
-   });
-
-   it('should match based on language header', () => {
-     const rules = parseRedirectsFile('/products /en/products 301 Language=en');
-     const match = matchRedirectRule('/products', rules, {
-       headers: { 'accept-language': 'en-US,en;q=0.9' },
-     });
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/en/products');
-   });
-
-   it('should match based on cookie presence', () => {
-     const rules = parseRedirectsFile('/* /legacy/:splat 200 Cookie=is_legacy');
-     const match = matchRedirectRule('/some-path', rules, {
-       cookies: { is_legacy: 'true' },
-     });
-     expect(match).toBeTruthy();
-     expect(match?.targetPath).toBe('/legacy/some-path');
-   });
-
-   it('should return first matching rule', () => {
-     const content = `
- /path /first
- /path /second
- `;
-     const rules = parseRedirectsFile(content);
-     const match = matchRedirectRule('/path', rules);
-     expect(match?.targetPath).toBe('/first');
-   });
-
-   it('should match more specific rules before general ones', () => {
-     const content = `
- /jobs/customer-ninja /careers/support
- /jobs/* /careers/:splat
- `;
-     const rules = parseRedirectsFile(content);
-
-     const match1 = matchRedirectRule('/jobs/customer-ninja', rules);
-     expect(match1?.targetPath).toBe('/careers/support');
-
-     const match2 = matchRedirectRule('/jobs/developer', rules);
-     expect(match2?.targetPath).toBe('/careers/developer');
-   });
-
-   it('should handle SPA routing pattern', () => {
-     const rules = parseRedirectsFile('/* /index.html 200');
-
-     // Should match any path
-     const match1 = matchRedirectRule('/about', rules);
-     expect(match1).toBeTruthy();
-     expect(match1?.targetPath).toBe('/index.html');
-     expect(match1?.status).toBe(200);
-
-     const match2 = matchRedirectRule('/users/123/profile', rules);
-     expect(match2).toBeTruthy();
-     expect(match2?.targetPath).toBe('/index.html');
-     expect(match2?.status).toBe(200);
-
-     const match3 = matchRedirectRule('/', rules);
-     expect(match3).toBeTruthy();
-     expect(match3?.targetPath).toBe('/index.html');
-   });
- });
-
REVERTED
hosting-service/src/lib/redirects.ts
···
- import { readFile } from 'fs/promises';
- import { existsSync } from 'fs';
-
- export interface RedirectRule {
-   from: string;
-   to: string;
-   status: number;
-   force: boolean;
-   conditions?: {
-     country?: string[];
-     language?: string[];
-     role?: string[];
-     cookie?: string[];
-   };
-   // For pattern matching
-   fromPattern?: RegExp;
-   fromParams?: string[]; // Named parameters from the pattern
-   queryParams?: Record<string, string>; // Expected query parameters
- }
-
- export interface RedirectMatch {
-   rule: RedirectRule;
-   targetPath: string;
-   status: number;
- }
-
- /**
-  * Parse a _redirects file into an array of redirect rules
-  */
- export function parseRedirectsFile(content: string): RedirectRule[] {
-   const lines = content.split('\n');
-   const rules: RedirectRule[] = [];
-
-   for (let lineNum = 0; lineNum < lines.length; lineNum++) {
-     const lineRaw = lines[lineNum];
-     if (!lineRaw) continue;
-
-     const line = lineRaw.trim();
-
-     // Skip empty lines and comments
-     if (!line || line.startsWith('#')) {
-       continue;
-     }
-
-     try {
-       const rule = parseRedirectLine(line);
-       if (rule && rule.fromPattern) {
-         rules.push(rule);
-       }
-     } catch (err) {
-       console.warn(`Failed to parse redirect rule on line ${lineNum + 1}: ${line}`, err);
-     }
-   }
-
-   return rules;
- }
-
- /**
-  * Parse a single redirect rule line
-  * Format: /from [query_params] /to [status] [conditions]
-  */
- function parseRedirectLine(line: string): RedirectRule | null {
-   // Split by whitespace, but respect quoted strings (though not commonly used)
-   const parts = line.split(/\s+/);
-
-   if (parts.length < 2) {
-     return null;
-   }
-
-   let idx = 0;
-   const from = parts[idx++];
-
-   if (!from) {
-     return null;
-   }
-
-   let status = 301; // Default status
-   let force = false;
-   const conditions: NonNullable<RedirectRule['conditions']> = {};
-   const queryParams: Record<string, string> = {};
-
-   // Parse query parameters that come before the destination path
-   // They look like: key=:value (and don't start with /)
-   while (idx < parts.length) {
-     const part = parts[idx];
-     if (!part) {
-       idx++;
-       continue;
-     }
-
-     // If it starts with / or http, it's the destination path
-     if (part.startsWith('/') || part.startsWith('http://') || part.startsWith('https://')) {
-       break;
-     }
-
-     // If it contains = and comes before the destination, it's a query param
-     if (part.includes('=')) {
-       const splitIndex = part.indexOf('=');
-       const key = part.slice(0, splitIndex);
-       const value = part.slice(splitIndex + 1);
-
-       if (key && value) {
-         queryParams[key] = value;
-       }
-       idx++;
-     } else {
-       // Not a query param, must be destination or something else
-       break;
-     }
-   }
-
-   // Next part should be the destination
-   if (idx >= parts.length) {
-     return null;
-   }
-
-   const to = parts[idx++];
-   if (!to) {
-     return null;
-   }
-
-   // Parse remaining parts for status code and conditions
-   for (let i = idx; i < parts.length; i++) {
-     const part = parts[i];
-
-     if (!part) continue;
-
-     // Check for status code (with optional ! for force)
-     if (/^\d+!?$/.test(part)) {
-       if (part.endsWith('!')) {
-         force = true;
-         status = parseInt(part.slice(0, -1));
-       } else {
-         status = parseInt(part);
-       }
-       continue;
-     }
-
-     // Check for condition parameters (Country=, Language=, Role=, Cookie=)
-     if (part.includes('=')) {
-       const splitIndex = part.indexOf('=');
-       const key = part.slice(0, splitIndex);
-       const value = part.slice(splitIndex + 1);
-
-       if (!key || !value) continue;
-
-       const keyLower = key.toLowerCase();
-
-       if (keyLower === 'country') {
-         conditions.country = value.split(',').map(v => v.trim().toLowerCase());
-       } else if (keyLower === 'language') {
-         conditions.language = value.split(',').map(v => v.trim().toLowerCase());
-       } else if (keyLower === 'role') {
-         conditions.role = value.split(',').map(v => v.trim());
-       } else if (keyLower === 'cookie') {
-         conditions.cookie = value.split(',').map(v => v.trim().toLowerCase());
-       }
-     }
-   }
-
-   // Parse the 'from' pattern
-   const { pattern, params } = convertPathToRegex(from);
-
-   return {
-     from,
-     to,
-     status,
-     force,
-     conditions: Object.keys(conditions).length > 0 ? conditions : undefined,
-     queryParams: Object.keys(queryParams).length > 0 ? queryParams : undefined,
-     fromPattern: pattern,
-     fromParams: params,
-   };
- }
-
- /**
-  * Convert a path pattern with placeholders and splats to a regex
-  * Examples:
-  *   /blog/:year/:month/:day -> captures year, month, day
-  *   /news/* -> captures splat
-  */
- function convertPathToRegex(pattern: string): { pattern: RegExp; params: string[] } {
-   const params: string[] = [];
-   let regexStr = '^';
-
-   // Split by query string if present
-   const pathPart = pattern.split('?')[0] || pattern;
-
-   // Escape special regex characters except * and :
-   let escaped = pathPart.replace(/[.+^${}()|[\]\\]/g, '\\$&');
-
-   // Replace :param with named capture groups
-   escaped = escaped.replace(/:([a-zA-Z_][a-zA-Z0-9_]*)/g, (match, paramName) => {
-     params.push(paramName);
-     // Match path segment (everything except / and ?)
-     return '([^/?]+)';
-   });
-
-   // Replace * with splat capture (matches everything including /)
-   if (escaped.includes('*')) {
-     escaped = escaped.replace(/\*/g, '(.*)');
-     params.push('splat');
-   }
-
-   regexStr += escaped;
-
-   // Make trailing slash optional
-   if (!regexStr.endsWith('.*')) {
-     regexStr += '/?';
-   }
-
-   regexStr += '$';
-
-   return {
-     pattern: new RegExp(regexStr),
-     params,
-   };
- }
-
- /**
-  * Match a request path against redirect rules
-  */
- export function matchRedirectRule(
-   requestPath: string,
-   rules: RedirectRule[],
-   context?: {
-     queryParams?: Record<string, string>;
-     headers?: Record<string, string>;
-     cookies?: Record<string, string>;
-   }
- ): RedirectMatch | null {
-   // Normalize path: ensure leading slash, remove trailing slash (except for root)
-   let normalizedPath = requestPath.startsWith('/') ? requestPath : `/${requestPath}`;
-
-   for (const rule of rules) {
-     // Check query parameter conditions first (if any)
-     if (rule.queryParams) {
-       // If rule requires query params but none provided, skip this rule
-       if (!context?.queryParams) {
-         continue;
-       }
-
-       const queryMatches = Object.entries(rule.queryParams).every(([key, value]) => {
-         const actualValue = context.queryParams?.[key];
-         return actualValue !== undefined;
-       });
-
-       if (!queryMatches) {
-         continue;
-       }
-     }
-
-     // Check conditional redirects (country, language, role, cookie)
-     if (rule.conditions) {
-       if (rule.conditions.country && context?.headers) {
-         const cfCountry = context.headers['cf-ipcountry'];
-         const xCountry = context.headers['x-country'];
-         const country = (cfCountry?.toLowerCase() || xCountry?.toLowerCase());
-         if (!country || !rule.conditions.country.includes(country)) {
-           continue;
-         }
-       }
-
-       if (rule.conditions.language && context?.headers) {
-         const acceptLang = context.headers['accept-language'];
-         if (!acceptLang) {
-           continue;
-         }
-         // Parse accept-language header (simplified)
-         const langs = acceptLang.split(',').map(l => {
-           const langPart = l.split(';')[0];
-           return langPart ? langPart.trim().toLowerCase() : '';
-         }).filter(l => l !== '');
-         const hasMatch = rule.conditions.language.some(lang =>
-           langs.some(l => l === lang || l.startsWith(lang + '-'))
-         );
-         if (!hasMatch) {
-           continue;
-         }
-       }
-
-       if (rule.conditions.cookie && context?.cookies) {
-         const hasCookie = rule.conditions.cookie.some(cookieName =>
-           context.cookies && cookieName in context.cookies
-         );
-         if (!hasCookie) {
-           continue;
-         }
-       }
-
-       // Role-based redirects would need JWT verification - skip for now
-       if (rule.conditions.role) {
-         continue;
-       }
-     }
-
-     // Match the path pattern
-     const match = rule.fromPattern?.exec(normalizedPath);
-     if (!match) {
-       continue;
-     }
-
-     // Build the target path by replacing placeholders
-     let targetPath = rule.to;
-
-     // Replace captured parameters
-     if (rule.fromParams && match.length > 1) {
-       for (let i = 0; i < rule.fromParams.length; i++) {
-         const paramName = rule.fromParams[i];
-         const paramValue = match[i + 1];
-
-         if (!paramName || !paramValue) continue;
-
-         if (paramName === 'splat') {
-           targetPath = targetPath.replace(':splat', paramValue);
-         } else {
-           targetPath = targetPath.replace(`:${paramName}`, paramValue);
-         }
-       }
-     }
-
-     // Handle query parameter replacements
-     if (rule.queryParams && context?.queryParams) {
-       for (const [key, placeholder] of Object.entries(rule.queryParams)) {
-         const actualValue = context.queryParams[key];
-         if (actualValue && placeholder && placeholder.startsWith(':')) {
-           const paramName = placeholder.slice(1);
-           if (paramName) {
-             targetPath = targetPath.replace(`:${paramName}`, actualValue);
-           }
-         }
-       }
-     }
-
-     // Preserve query string for 200, 301, 302 redirects (unless target already has one)
-     if ([200, 301, 302].includes(rule.status) && context?.queryParams && !targetPath.includes('?')) {
-       const queryString = Object.entries(context.queryParams)
-         .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
-         .join('&');
-       if (queryString) {
-         targetPath += `?${queryString}`;
-       }
-     }
-
-     return {
-       rule,
-       targetPath,
-       status: rule.status,
-     };
-   }
-
-   return null;
- }
-
- /**
-  * Load redirect rules from a cached site
-  */
- export async function loadRedirectRules(did: string, rkey: string): Promise<RedirectRule[]> {
-   const CACHE_DIR = process.env.CACHE_DIR || './cache/sites';
-   const redirectsPath = `${CACHE_DIR}/${did}/${rkey}/_redirects`;
-
-   if (!existsSync(redirectsPath)) {
-     return [];
-   }
-
-   try {
-     const content = await readFile(redirectsPath, 'utf-8');
-     return parseRedirectsFile(content);
-   } catch (err) {
-     console.error('Failed to load _redirects file', err);
-     return [];
-   }
- }
-
- /**
-  * Parse cookies from Cookie header
-  */
- export function parseCookies(cookieHeader?: string): Record<string, string> {
-   if (!cookieHeader) return {};
-
-   const cookies: Record<string, string> = {};
-   const parts = cookieHeader.split(';');
-
-   for (const part of parts) {
-     const [key, ...valueParts] = part.split('=');
-     if (key && valueParts.length > 0) {
-       cookies[key.trim()] = valueParts.join('=').trim();
-     }
-   }
-
-   return cookies;
- }
-
- /**
-  * Parse query string into object
-  */
- export function parseQueryString(url: string): Record<string, string> {
-   const queryStart = url.indexOf('?');
-   if (queryStart === -1) return {};
-
-   const queryString = url.slice(queryStart + 1);
-   const params: Record<string, string> = {};
-
-   for (const pair of queryString.split('&')) {
-     const [key, value] = pair.split('=');
-     if (key) {
-       params[decodeURIComponent(key)] = value ? decodeURIComponent(value) : '';
-     }
-   }
-
-   return params;
- }
-
REVERTED
hosting-service/src/server.ts
···
  import { lookup } from 'mime-types';
  import { logger, observabilityMiddleware, observabilityErrorHandler, logCollector, errorTracker, metricsCollector } from './lib/observability';
  import { fileCache, metadataCache, rewrittenHtmlCache, getCacheKey, type FileMetadata } from './lib/cache';
- import { loadRedirectRules, matchRedirectRule, parseCookies, parseQueryString, type RedirectRule } from './lib/redirects';

  const BASE_HOST = process.env.BASE_HOST || 'wisp.place';

···
    }
  }

- // Cache for redirect rules (per site)
- const redirectRulesCache = new Map<string, RedirectRule[]>();
-
- /**
-  * Clear redirect rules cache for a specific site
-  * Should be called when a site is updated/recached
-  */
- export function clearRedirectRulesCache(did: string, rkey: string) {
-   const cacheKey = `${did}:${rkey}`;
-   redirectRulesCache.delete(cacheKey);
- }
-
  // Helper to serve files from cache
+ async function serveFromCache(did: string, rkey: string, filePath: string) {
- async function serveFromCache(
-   did: string,
-   rkey: string,
-   filePath: string,
-   fullUrl?: string,
-   headers?: Record<string, string>
- ) {
-   // Check for redirect rules first
-   const redirectCacheKey = `${did}:${rkey}`;
-   let redirectRules = redirectRulesCache.get(redirectCacheKey);
-
-   if (redirectRules === undefined) {
-     // Load rules for the first time
-     redirectRules = await loadRedirectRules(did, rkey);
-     redirectRulesCache.set(redirectCacheKey, redirectRules);
-   }
-
-   // Apply redirect rules if any exist
-   if (redirectRules.length > 0) {
-     const requestPath = '/' + (filePath || '');
-     const queryParams = fullUrl ? parseQueryString(fullUrl) : {};
-     const cookies = parseCookies(headers?.['cookie']);
-
-     const redirectMatch = matchRedirectRule(requestPath, redirectRules, {
-       queryParams,
-       headers,
-       cookies,
-     });
-
-     if (redirectMatch) {
-       const { targetPath, status } = redirectMatch;
-
-       // Handle different status codes
-       if (status === 200) {
-         // Rewrite: serve different content but keep URL the same
-         // Remove leading slash for internal path resolution
-         const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
-         return serveFileInternal(did, rkey, rewritePath);
-       } else if (status === 301 || status === 302) {
-         // External redirect: change the URL
-         return new Response(null, {
-           status,
-           headers: {
-             'Location': targetPath,
-             'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0',
-           },
-         });
-       } else if (status === 404) {
-         // Custom 404 page
-         const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
-         const response = await serveFileInternal(did, rkey, custom404Path);
-         // Override status to 404
-         return new Response(response.body, {
-           status: 404,
-           headers: response.headers,
-         });
-       }
-     }
-   }
-
-   // No redirect matched, serve normally
-   return serveFileInternal(did, rkey, filePath);
- }
-
- // Internal function to serve a file (used by both normal serving and rewrites)
- async function serveFileInternal(did: string, rkey: string, filePath: string) {
    // Default to index.html if path is empty or ends with /
    let requestPath = filePath || 'index.html';
    if (requestPath.endsWith('/')) {
···
    did: string,
    rkey: string,
    filePath: string,
+   basePath: string
-   basePath: string,
-   fullUrl?: string,
-   headers?: Record<string, string>
  ) {
-   // Check for redirect rules first
-   const redirectCacheKey = `${did}:${rkey}`;
-   let redirectRules = redirectRulesCache.get(redirectCacheKey);
-
-   if (redirectRules === undefined) {
-     // Load rules for the first time
-     redirectRules = await loadRedirectRules(did, rkey);
-     redirectRulesCache.set(redirectCacheKey, redirectRules);
-   }
-
-   // Apply redirect rules if any exist
-   if (redirectRules.length > 0) {
-     const requestPath = '/' + (filePath || '');
-     const queryParams = fullUrl ? parseQueryString(fullUrl) : {};
-     const cookies = parseCookies(headers?.['cookie']);
-
-     const redirectMatch = matchRedirectRule(requestPath, redirectRules, {
-       queryParams,
-       headers,
-       cookies,
-     });
-
-     if (redirectMatch) {
-       const { targetPath, status } = redirectMatch;
-
-       // Handle different status codes
-       if (status === 200) {
-         // Rewrite: serve different content but keep URL the same
-         const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
-         return serveFileInternalWithRewrite(did, rkey, rewritePath, basePath);
-       } else if (status === 301 || status === 302) {
-         // External redirect: change the URL
-         // For sites.wisp.place, we need to adjust the target path to include the base path
-         // unless it's an absolute URL
-         let redirectTarget = targetPath;
-         if (!targetPath.startsWith('http://') && !targetPath.startsWith('https://')) {
-           redirectTarget = basePath + (targetPath.startsWith('/') ? targetPath.slice(1) : targetPath);
-         }
-         return new Response(null, {
-           status,
-           headers: {
-             'Location': redirectTarget,
-             'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0',
-           },
-         });
-       } else if (status === 404) {
-         // Custom 404 page
-         const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath;
-         const response = await serveFileInternalWithRewrite(did, rkey, custom404Path, basePath);
-         // Override status to 404
-         return new Response(response.body, {
-           status: 404,
-           headers: response.headers,
-         });
-       }
-     }
-   }
-
-   // No redirect matched, serve normally
-   return serveFileInternalWithRewrite(did, rkey, filePath, basePath);
- }
-
- // Internal function to serve a file with rewriting
- async function serveFileInternalWithRewrite(did: string, rkey: string, filePath: string, basePath: string) {
    // Default to index.html if path is empty or ends with /
    let requestPath = filePath || 'index.html';
    if (requestPath.endsWith('/')) {
···

    try {
      await downloadAndCacheSite(did, rkey, siteData.record, pdsEndpoint, siteData.cid);
-     // Clear redirect rules cache since the site was updated
-     clearRedirectRulesCache(did, rkey);
      logger.info('Site cached successfully', { did, rkey });
      return true;
    } catch (err) {
···
385 // Serve with HTML path rewriting to handle absolute paths 532 386 const basePath = `/${identifier}/${site}/`; 387 + return serveFromCacheWithRewrite(did, site, filePath, basePath); 533 - const headers: Record<string, string> = {}; 534 - c.req.raw.headers.forEach((value, key) => { 535 - headers[key.toLowerCase()] = value; 536 - }); 537 - return serveFromCacheWithRewrite(did, site, filePath, basePath, c.req.url, headers); 538 388 } 539 389 540 390 // Check if this is a DNS hash subdomain ··· 570 420 return c.text('Site not found', 404); 571 421 } 572 422 423 + return serveFromCache(customDomain.did, rkey, path); 573 - const headers: Record<string, string> = {}; 574 - c.req.raw.headers.forEach((value, key) => { 575 - headers[key.toLowerCase()] = value; 576 - }); 577 - return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 578 424 } 579 425 580 426 // Route 2: Registered subdomains - /*.wisp.place/* ··· 598 444 return c.text('Site not found', 404); 599 445 } 600 446 447 + return serveFromCache(domainInfo.did, rkey, path); 601 - const headers: Record<string, string> = {}; 602 - c.req.raw.headers.forEach((value, key) => { 603 - headers[key.toLowerCase()] = value; 604 - }); 605 - return serveFromCache(domainInfo.did, rkey, path, c.req.url, headers); 606 448 } 607 449 608 450 // Route 1: Custom domains - /* ··· 625 467 return c.text('Site not found', 404); 626 468 } 627 469 470 + return serveFromCache(customDomain.did, rkey, path); 628 - const headers: Record<string, string> = {}; 629 - c.req.raw.headers.forEach((value, key) => { 630 - headers[key.toLowerCase()] = value; 631 - }); 632 - return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 633 471 }); 634 472 635 473 // Internal observability endpoints (for admin panel)
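Review note: the redirect handling removed above branches on the matched rule's status code (200 rewrite, 301/302 redirect, 404 custom page, anything else served normally). A minimal sketch of that branching follows, written in Rust to match the CLI side of this PR rather than the TypeScript it came from. `Outcome`, `dispatch`, and `strip_leading_slash` are hypothetical names for illustration only; `base_path` is `Some(..)` for the path-prefixed variant and `None` for the plain one.

```rust
/// Hypothetical result type: what the server would do for a matched rule.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Status 200: serve another file, URL unchanged.
    Rewrite { path: String },
    /// Status 301/302: send a Location header.
    Redirect { status: u16, location: String, cache_control: &'static str },
    /// Status 404: serve a custom page with a 404 status.
    NotFoundPage { path: String },
    /// Any other status: fall through to normal serving.
    PassThrough,
}

/// Drop one leading '/' so the target can be resolved as a site-relative path.
fn strip_leading_slash(p: &str) -> &str {
    p.strip_prefix('/').unwrap_or(p)
}

/// Map a matched rule (status + target) to an outcome, mirroring the removed branches.
pub fn dispatch(status: u16, target: &str, base_path: Option<&str>) -> Outcome {
    match status {
        200 => Outcome::Rewrite { path: strip_leading_slash(target).to_string() },
        301 | 302 => {
            let absolute = target.starts_with("http://") || target.starts_with("https://");
            let location = match base_path {
                // Relative targets are prefixed with the base path; absolute URLs are left alone.
                Some(base) if !absolute => format!("{}{}", base, strip_leading_slash(target)),
                _ => target.to_string(),
            };
            let cache_control = if status == 301 {
                "public, max-age=31536000"
            } else {
                "public, max-age=0"
            };
            Outcome::Redirect { status, location, cache_control }
        }
        404 => Outcome::NotFoundPage { path: strip_leading_slash(target).to_string() },
        _ => Outcome::PassThrough,
    }
}
```

Keeping the decision as a pure function of `(status, target, base_path)` would let both serving paths share it instead of duplicating the branches, which is one option if this feature is reintroduced.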
ERROR
cli/.gitignore

Failed to calculate interdiff for this file.

ERROR
cli/Cargo.lock

Failed to calculate interdiff for this file.

ERROR
cli/Cargo.toml

Failed to calculate interdiff for this file.

REVERTED
cli/src/blob_map.rs
···
-use jacquard_common::types::blob::BlobRef;
-use jacquard_common::IntoStatic;
-use std::collections::HashMap;
-
-use crate::place_wisp::fs::{Directory, EntryNode};
-
-/// Extract blob information from a directory tree
-/// Returns a map of file paths to their blob refs and CIDs
-///
-/// This mirrors the TypeScript implementation in src/lib/wisp-utils.ts lines 275-302
-pub fn extract_blob_map(
-    directory: &Directory,
-) -> HashMap<String, (BlobRef<'static>, String)> {
-    extract_blob_map_recursive(directory, String::new())
-}
-
-fn extract_blob_map_recursive(
-    directory: &Directory,
-    current_path: String,
-) -> HashMap<String, (BlobRef<'static>, String)> {
-    let mut blob_map = HashMap::new();
-
-    for entry in &directory.entries {
-        let full_path = if current_path.is_empty() {
-            entry.name.to_string()
-        } else {
-            format!("{}/{}", current_path, entry.name)
-        };
-
-        match &entry.node {
-            EntryNode::File(file_node) => {
-                // Extract CID from blob ref
-                // BlobRef is an enum with Blob variant, which has a ref field (CidLink)
-                let blob_ref = &file_node.blob;
-                let cid_string = blob_ref.blob().r#ref.to_string();
-
-                // Store both normalized and full paths
-                // Normalize by removing base folder prefix (e.g., "cobblemon/index.html" -> "index.html")
-                let normalized_path = normalize_path(&full_path);
-
-                blob_map.insert(
-                    normalized_path.clone(),
-                    (blob_ref.clone().into_static(), cid_string.clone())
-                );
-
-                // Also store the full path for matching
-                if normalized_path != full_path {
-                    blob_map.insert(
-                        full_path,
-                        (blob_ref.clone().into_static(), cid_string)
-                    );
-                }
-            }
-            EntryNode::Directory(subdir) => {
-                let sub_map = extract_blob_map_recursive(subdir, full_path);
-                blob_map.extend(sub_map);
-            }
-            EntryNode::Unknown(_) => {
-                // Skip unknown node types
-            }
-        }
-    }
-
-    blob_map
-}
-
-/// Normalize file path by removing base folder prefix
-/// Example: "cobblemon/index.html" -> "index.html"
-///
-/// Mirrors TypeScript implementation at src/routes/wisp.ts line 291
-pub fn normalize_path(path: &str) -> String {
-    // Remove base folder prefix (everything before first /)
-    if let Some(idx) = path.find('/') {
-        path[idx + 1..].to_string()
-    } else {
-        path.to_string()
-    }
-}
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-
-    #[test]
-    fn test_normalize_path() {
-        assert_eq!(normalize_path("index.html"), "index.html");
-        assert_eq!(normalize_path("cobblemon/index.html"), "index.html");
-        assert_eq!(normalize_path("folder/subfolder/file.txt"), "subfolder/file.txt");
-        assert_eq!(normalize_path("a/b/c/d.txt"), "b/c/d.txt");
-    }
-}
-
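Review note: `extract_blob_map` above flattens the manifest tree into a path-keyed map and stores every file under two keys, its full path and the path with the first component stripped. The std-only sketch below reproduces that shape so the resulting keys are easy to inspect. `Node` and `flatten` are hypothetical stand-ins (the real code walks `place_wisp::fs::Directory` and stores a `BlobRef` alongside the CID); `normalize_path` follows the removed function.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `EntryNode`; the real type carries a blob ref, not just a CID string.
pub enum Node {
    File { cid: String },
    Directory(Vec<(String, Node)>),
}

/// Drop everything up to and including the first '/', as the removed `normalize_path` does.
pub fn normalize_path(path: &str) -> String {
    match path.find('/') {
        Some(idx) => path[idx + 1..].to_string(),
        None => path.to_string(),
    }
}

/// Walk the tree and record each file's CID under both its normalized and its full path.
pub fn flatten(entries: &[(String, Node)], prefix: &str, out: &mut HashMap<String, String>) {
    for (name, node) in entries {
        let full = if prefix.is_empty() {
            name.clone()
        } else {
            format!("{}/{}", prefix, name)
        };
        match node {
            Node::File { cid } => {
                out.insert(normalize_path(&full), cid.clone());
                out.insert(full, cid.clone());
            }
            Node::Directory(children) => flatten(children, &full, out),
        }
    }
}
```

Note that only the first path component is stripped, so a file at `site/css/main.css` is keyed as `css/main.css` and `site/css/main.css`, never as the bare `main.css`.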
ERROR
cli/src/cid.rs

Failed to calculate interdiff for this file.

REVERTED
cli/src/main.rs
···
 mod builder_types;
 mod place_wisp;
-mod cid;
-mod blob_map;

 use clap::Parser;
 use jacquard::CowStr;
+use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession};
-use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession, AgentSession};
 use jacquard::oauth::client::OAuthClient;
 use jacquard::oauth::loopback::LoopbackConfig;
 use jacquard::prelude::IdentityResolver;
···
 use jacquard_common::types::blob::MimeType;
 use miette::IntoDiagnostic;
 use std::path::{Path, PathBuf};
-use std::collections::HashMap;
 use flate2::Compression;
 use flate2::write::GzEncoder;
 use std::io::Write;
···

     println!("Deploying site '{}'...", site_name);

+    // Build directory tree
+    let root_dir = build_directory(agent, &path).await?;
-    // Try to fetch existing manifest for incremental updates
-    let existing_blob_map: HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)> = {
-        use jacquard_common::types::string::AtUri;
-
-        // Get the DID for this session
-        let session_info = agent.session_info().await;
-        if let Some((did, _)) = session_info {
-            // Construct the AT URI for the record
-            let uri_string = format!("at://{}/place.wisp.fs/{}", did, site_name);
-            if let Ok(uri) = AtUri::new(&uri_string) {
-                match agent.get_record::<Fs>(&uri).await {
-                    Ok(response) => {
-                        match response.into_output() {
-                            Ok(record_output) => {
-                                let existing_manifest = record_output.value;
-                                let blob_map = blob_map::extract_blob_map(&existing_manifest.root);
-                                println!("Found existing manifest with {} files, checking for changes...", blob_map.len());
-                                blob_map
-                            }
-                            Err(_) => {
-                                println!("No existing manifest found, uploading all files...");
-                                HashMap::new()
-                            }
-                        }
-                    }
-                    Err(_) => {
-                        // Record doesn't exist yet - this is a new site
-                        println!("No existing manifest found, uploading all files...");
-                        HashMap::new()
-                    }
-                }
-            } else {
-                println!("No existing manifest found (invalid URI), uploading all files...");
-                HashMap::new()
-            }
-        } else {
-            println!("No existing manifest found (could not get DID), uploading all files...");
-            HashMap::new()
-        }
-    };

+    // Count total files
+    let file_count = count_files(&root_dir);
-    // Build directory tree
-    let (root_dir, total_files, reused_count) = build_directory(agent, &path, &existing_blob_map).await?;
-    let uploaded_count = total_files - reused_count;

     // Create the Fs record
     let fs_record = Fs::new()
         .site(CowStr::from(site_name.clone()))
         .root(root_dir)
+        .file_count(file_count as i64)
-        .file_count(total_files as i64)
         .created_at(Datetime::now())
         .build();

···
         .and_then(|s| s.split('/').next())
         .ok_or_else(|| miette::miette!("Failed to parse DID from URI"))?;

+    println!("Deployed site '{}': {}", site_name, output.uri);
+    println!("Available at: https://sites.wisp.place/{}/{}", did, site_name);
-    println!("\n✓ Deployed site '{}': {}", site_name, output.uri);
-    println!(" Total files: {} ({} reused, {} uploaded)", total_files, reused_count, uploaded_count);
-    println!(" Available at: https://sites.wisp.place/{}/{}", did, site_name);

     Ok(())
 }
···
 fn build_directory<'a>(
     agent: &'a Agent<impl jacquard::client::AgentSession + IdentityResolver + 'a>,
     dir_path: &'a Path,
+) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<Directory<'static>>> + 'a>>
-    existing_blobs: &'a HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>,
-) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<(Directory<'static>, usize, usize)>> + 'a>>
 {
     Box::pin(async move {
         // Collect all directory entries first
···
         }

         // Process files concurrently with a limit of 5
+        let file_entries: Vec<Entry> = stream::iter(file_tasks)
-        let file_results: Vec<(Entry<'static>, bool)> = stream::iter(file_tasks)
             .map(|(name, path)| async move {
+                let file_node = process_file(agent, &path).await?;
+                Ok::<_, miette::Report>(Entry::new()
-                let (file_node, reused) = process_file(agent, &path, &name, existing_blobs).await?;
-                let entry = Entry::new()
                     .name(CowStr::from(name))
                     .node(EntryNode::File(Box::new(file_node)))
+                    .build())
-                    .build();
-                Ok::<_, miette::Report>((entry, reused))
             })
             .buffer_unordered(5)
             .collect::<Vec<_>>()
             .await
             .into_iter()
             .collect::<miette::Result<Vec<_>>>()?;
-
-        let mut file_entries = Vec::new();
-        let mut reused_count = 0;
-        let mut total_files = 0;
-
-        for (entry, reused) in file_results {
-            file_entries.push(entry);
-            total_files += 1;
-            if reused {
-                reused_count += 1;
-            }
-        }

         // Process directories recursively (sequentially to avoid too much nesting)
         let mut dir_entries = Vec::new();
         for (name, path) in dir_tasks {
+            let subdir = build_directory(agent, &path).await?;
-            let (subdir, sub_total, sub_reused) = build_directory(agent, &path, existing_blobs).await?;
             dir_entries.push(Entry::new()
                 .name(CowStr::from(name))
                 .node(EntryNode::Directory(Box::new(subdir)))
                 .build());
-            total_files += sub_total;
-            reused_count += sub_reused;
         }

         // Combine file and directory entries
         let mut entries = file_entries;
         entries.extend(dir_entries);

+        Ok(Directory::new()
-        let directory = Directory::new()
             .r#type(CowStr::from("directory"))
             .entries(entries)
+            .build())
-            .build();
-
-        Ok((directory, total_files, reused_count))
     })
 }

+/// Process a single file: gzip -> base64 -> upload blob
-/// Process a single file: gzip -> base64 -> upload blob (or reuse existing)
-/// Returns (File, reused: bool)
 async fn process_file(
     agent: &Agent<impl jacquard::client::AgentSession + IdentityResolver>,
     file_path: &Path,
+) -> miette::Result<File<'static>>
-    file_name: &str,
-    existing_blobs: &HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>,
-) -> miette::Result<(File<'static>, bool)>
 {
     // Read file
     let file_data = std::fs::read(file_path).into_diagnostic()?;
···
     // Base64 encode the gzipped data
     let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes();

+    // Upload blob as octet-stream
-    // Compute CID for this file (CRITICAL: on base64-encoded gzipped content)
-    let file_cid = cid::compute_cid(&base64_bytes);
-
-    // Normalize the file path for comparison
-    let normalized_path = blob_map::normalize_path(file_name);
-
-    // Check if we have an existing blob with the same CID
-    let existing_blob = existing_blobs.get(&normalized_path)
-        .or_else(|| existing_blobs.get(file_name));
-
-    if let Some((existing_blob_ref, existing_cid)) = existing_blob {
-        if existing_cid == &file_cid {
-            // CIDs match - reuse existing blob
-            println!(" ✓ Reusing blob for {} (CID: {})", file_name, file_cid);
-            return Ok((
-                File::new()
-                    .r#type(CowStr::from("file"))
-                    .blob(existing_blob_ref.clone())
-                    .encoding(CowStr::from("gzip"))
-                    .mime_type(CowStr::from(original_mime))
-                    .base64(true)
-                    .build(),
-                true
-            ));
-        }
-    }
-
-    // File is new or changed - upload it
-    println!(" ↑ Uploading {} ({} bytes, CID: {})", file_name, base64_bytes.len(), file_cid);
     let blob = agent.upload_blob(
         base64_bytes,
         MimeType::new_static("application/octet-stream"),
     ).await?;

+    Ok(File::new()
+        .r#type(CowStr::from("file"))
+        .blob(blob)
+        .encoding(CowStr::from("gzip"))
+        .mime_type(CowStr::from(original_mime))
+        .base64(true)
+        .build())
-    Ok((
-        File::new()
-            .r#type(CowStr::from("file"))
-            .blob(blob)
-            .encoding(CowStr::from("gzip"))
-            .mime_type(CowStr::from(original_mime))
-            .base64(true)
-            .build(),
-        false
-    ))
 }

+/// Count total files in a directory tree
+fn count_files(dir: &Directory) -> usize {
+    let mut count = 0;
+    for entry in &dir.entries {
+        match &entry.node {
+            EntryNode::File(_) => count += 1,
+            EntryNode::Directory(subdir) => count += count_files(subdir),
+            _ => {} // Unknown variants
+        }
+    }
+    count
+}
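Review note: the incremental path removed from `process_file` comes down to one decision per file: look the file up in the existing manifest map and reuse the blob only when the stored CID equals the CID of the freshly encoded content. The sketch below isolates that decision and the reused/uploaded tally using std only. `Action`, `decide`, and `tally` are hypothetical names, CIDs are plain strings here, and the lookup is by a single key (the removed code tried the normalized path first and then the raw file name).

```rust
use std::collections::HashMap;

/// Hypothetical outcome of the per-file check.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Stored CID matches the new content: keep the existing blob ref.
    Reuse,
    /// File is new or its content changed: upload a new blob.
    Upload,
}

/// Compare the CID recorded for `name` (if any) with the CID of the new content.
pub fn decide(existing: &HashMap<String, String>, name: &str, new_cid: &str) -> Action {
    match existing.get(name) {
        Some(old_cid) if old_cid.as_str() == new_cid => Action::Reuse,
        _ => Action::Upload,
    }
}

/// Summarize a batch of (name, cid) pairs as (total, reused, uploaded).
pub fn tally(existing: &HashMap<String, String>, files: &[(&str, &str)]) -> (usize, usize, usize) {
    let mut reused = 0;
    for (name, cid) in files {
        if decide(existing, name, cid) == Action::Reuse {
            reused += 1;
        }
    }
    (files.len(), reused, files.len() - reused)
}
```

Because the comparison is on CIDs computed over the base64-encoded gzipped bytes, any change to the compression settings would change every CID and force a full re-upload even when the source files are identical.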