we (web engine): Experimental web browser project to understand the limits of Claude

Data URLs (data: scheme parsing and loading) #76

open opened by pierrelf.com

Phase 8: Resource Loading + Character Encoding + Real Page Loading

Implement data URL parsing and loading per RFC 2397.

Requirements

  • Parse data:[<mediatype>][;base64],<data> URLs
  • Extract MIME type (default text/plain;charset=US-ASCII if omitted)
  • Handle base64-encoded data: decode base64 payload
  • Handle percent-encoded data: decode percent-encoded payload
  • Integrate with ResourceLoader: data URLs should be handled locally without network fetch
  • Support data URLs in <img src="data:..."> and <link href="data:..."> contexts
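The percent-decoding requirement could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the name percent_decode is hypothetical, and passing malformed escapes through unchanged is an assumption rather than something this issue specifies.

```rust
/// Decode `%XX` escapes in a data URL payload into raw bytes.
/// Assumption: a '%' that is not followed by two hex digits is copied through as-is.
pub fn percent_decode(input: &str) -> Vec<u8> {
    // Map one ASCII hex digit to its numeric value.
    fn hex_val(b: u8) -> Option<u8> {
        match b {
            b'0'..=b'9' => Some(b - b'0'),
            b'a'..=b'f' => Some(b - b'a' + 10),
            b'A'..=b'F' => Some(b - b'A' + 10),
            _ => None,
        }
    }

    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A valid escape needs two more bytes after the '%'.
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push((hi << 4) | lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    out
}
```

Returning Vec<u8> rather than String keeps the decoder usable for binary payloads, since a decoded byte sequence is not guaranteed to be valid UTF-8.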

Implementation

Add to the url crate or a shared utility:

pub struct DataUrl {
    pub mime_type: String,
    pub charset: Option<String>,
    pub data: Vec<u8>,
}

pub fn parse_data_url(url: &str) -> Result<DataUrl, DataUrlError>;
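One possible first step inside parse_data_url is to split the URL into its header fields and the still-encoded payload. The sketch below covers only that step. DataUrlParts and split_data_url are hypothetical names, Option stands in for the DataUrlError type (which this issue does not define), and accepting the base64 token in any parameter position is a simplification.

```rust
/// Hypothetical intermediate result: header fields plus the undecoded payload.
#[derive(Debug, PartialEq)]
pub struct DataUrlParts<'a> {
    pub mime_type: String,
    pub charset: Option<String>,
    pub is_base64: bool,
    pub payload: &'a str,
}

/// Split `data:[<mediatype>][;base64],<data>` into its parts.
/// Returns None if the scheme is not `data:` or the comma separator is missing.
pub fn split_data_url(url: &str) -> Option<DataUrlParts<'_>> {
    // URL schemes are case-insensitive.
    let scheme = url.get(..5)?;
    if !scheme.eq_ignore_ascii_case("data:") {
        return None;
    }
    // Everything before the first comma is the header; the rest is the payload.
    let (header, payload) = url[5..].split_once(',')?;

    let mut params = header.split(';');
    let media = params.next().unwrap_or("").trim();
    let mut charset = None;
    let mut is_base64 = false;
    for param in params {
        let param = param.trim();
        if param.eq_ignore_ascii_case("base64") {
            is_base64 = true;
        } else if let Some((key, value)) = param.split_once('=') {
            if key.trim().eq_ignore_ascii_case("charset") {
                charset = Some(value.trim().to_string());
            }
        }
    }

    let mime_type = if media.is_empty() {
        // Default from the requirements: text/plain;charset=US-ASCII.
        // An explicit charset parameter is kept if one was supplied.
        if charset.is_none() {
            charset = Some("US-ASCII".to_string());
        }
        "text/plain".to_string()
    } else {
        media.to_ascii_lowercase()
    };

    Some(DataUrlParts { mime_type, charset, is_base64, payload })
}
```

With the parts separated, the payload can then be handed to either the base64 decoder or the percent-decoder depending on is_base64, and the result stored in DataUrl::data.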

Base64 decoder (RFC 4648):

  • Standard alphabet (A-Z, a-z, 0-9, +, /)
  • Padding with '='
  • Ignore whitespace in encoded data
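The three decoder properties listed above can be illustrated with a small dependency-free sketch. The name base64_decode and the Option return type are assumptions. The sketch is also deliberately lenient: it skips '=' wherever it appears at the end and does not check that the padding count matches the data length, which a stricter decoder might want to enforce.

```rust
/// Decode standard-alphabet base64 (RFC 4648), skipping ASCII whitespace.
/// Returns None on an invalid character, on data after '=', or on a dangling sextet.
pub fn base64_decode(input: &str) -> Option<Vec<u8>> {
    // Map one alphabet character to its 6-bit value.
    fn sextet(b: u8) -> Option<u32> {
        match b {
            b'A'..=b'Z' => Some((b - b'A') as u32),
            b'a'..=b'z' => Some((b - b'a') as u32 + 26),
            b'0'..=b'9' => Some((b - b'0') as u32 + 52),
            b'+' => Some(62),
            b'/' => Some(63),
            _ => None,
        }
    }

    let mut out = Vec::with_capacity(input.len() / 4 * 3);
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `acc`
    let mut seen_padding = false;

    for &b in input.as_bytes() {
        if b.is_ascii_whitespace() {
            continue;
        }
        if b == b'=' {
            seen_padding = true;
            continue;
        }
        if seen_padding {
            return None; // alphabet characters after padding are rejected
        }
        acc = (acc << 6) | sextet(b)?;
        bits += 6;
        if bits >= 8 {
            // Emit the top 8 bits and keep the remainder.
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1;
        }
    }

    // Six leftover bits mean a lone trailing character, which cannot encode a byte.
    if bits >= 6 {
        return None;
    }
    Some(out)
}
```

Accumulating bits instead of processing fixed 4-character groups makes whitespace skipping straightforward, because line breaks may fall anywhere in the encoded data.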

Acceptance Criteria

  • Parse data URLs with and without base64 encoding
  • Correct MIME type extraction with defaults
  • Base64 decoder handles standard alphabet and padding
  • Percent-decoding works for non-base64 data URLs
  • Integration with ResourceLoader returns decoded data
  • No external dependencies, no unsafe
  • Unit tests: various data URL formats, edge cases (empty data, missing MIME type)

Dependencies

Depends on: Resource loader
