Resource loader: fetch and decode HTML pages #73

Phase 8: Resource Loading + Character Encoding + Real Page Loading

Implement the core resource loading infrastructure that fetches URLs and decodes their content.

Requirements

ResourceLoader struct that wraps the net crate's HttpClient
Fetch a URL: given a URL string, resolve it, fetch via HTTP/HTTPS, and return the decoded body
Content-Type handling: parse the response Content-Type to determine MIME type and charset
Text decoding: use the encoding crate to decode response bytes to text (for HTML, CSS, etc.)
Binary resources: return raw bytes for images and other binary content
Error handling: network errors, DNS failures, TLS errors, HTTP errors (4xx, 5xx)
Base URL tracking: track the document's base URL for resolving relative URLs
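
As a rough illustration of the Content-Type requirement, the sketch below splits a header value into a MIME type and an optional charset. The `ContentType` struct and `parse_content_type` function are hypothetical names, not part of the API in this issue. The parser is deliberately simple: it does not handle semicolons inside quoted parameter values.

```rust
/// Parsed pieces of a Content-Type header value (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct ContentType {
    /// Lowercased MIME type, e.g. "text/html".
    pub mime_type: String,
    /// Lowercased charset label, if a `charset` parameter was present.
    pub charset: Option<String>,
}

/// Split a value such as `text/html; charset=UTF-8` into its parts.
pub fn parse_content_type(header: &str) -> ContentType {
    let mut parts = header.split(';');
    // The first segment is the MIME type; any parameters follow it.
    let mime_type = parts.next().unwrap_or("").trim().to_ascii_lowercase();
    let mut charset = None;
    for param in parts {
        if let Some((name, value)) = param.split_once('=') {
            if name.trim().eq_ignore_ascii_case("charset") {
                // Values may be quoted, as in charset="utf-8".
                let value = value.trim().trim_matches('"').to_ascii_lowercase();
                // Keep the first non-empty charset parameter only.
                if !value.is_empty() && charset.is_none() {
                    charset = Some(value);
                }
            }
        }
    }
    ContentType { mime_type, charset }
}
```

When no charset parameter is present, `charset` is `None`, which is the point where the encoding sniffing dependency would presumably take over.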

API

pub struct ResourceLoader {
    client: HttpClient,
}

pub enum Resource {
    Html { text: String, base_url: Url, encoding: Encoding },
    Css { text: String, url: Url },
    Image { data: Vec<u8>, mime_type: String, url: Url },
    Other { data: Vec<u8>, mime_type: String, url: Url },
}

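impl ResourceLoader {
    /// Create a loader that owns its HttpClient.
    pub fn new() -> Self;
    /// Fetch `url` and return decoded text or raw bytes, depending on the response Content-Type.
    pub fn fetch(&mut self, url: &Url) -> Result<Resource, LoadError>;
}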
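
One possible way `fetch` could map a response onto the `Resource` variants is sketched below. This is not the intended implementation: `Url` and `Encoding` are replaced by plain strings so the snippet compiles on its own, `classify` is a hypothetical helper, and text decoding is stubbed with lossy UTF-8 where the real loader would call the encoding crate.

```rust
/// Simplified stand-in for the issue's `Resource`; `Url` and `Encoding`
/// are plain strings here so the sketch is self-contained.
#[derive(Debug, PartialEq)]
pub enum Resource {
    Html { text: String, base_url: String, encoding: String },
    Css { text: String, url: String },
    Image { data: Vec<u8>, mime_type: String, url: String },
    Other { data: Vec<u8>, mime_type: String, url: String },
}

/// Hypothetical helper: build a `Resource` from an already-fetched body.
/// `mime_type` is assumed to be lowercased with parameters stripped.
pub fn classify(mime_type: &str, body: Vec<u8>, url: &str) -> Resource {
    match mime_type {
        "text/html" => Resource::Html {
            // Assumption: lossy UTF-8 stands in for real charset decoding.
            text: String::from_utf8_lossy(&body).into_owned(),
            // The fetched URL serves as the document's base URL.
            base_url: url.to_string(),
            encoding: "utf-8".to_string(),
        },
        "text/css" => Resource::Css {
            text: String::from_utf8_lossy(&body).into_owned(),
            url: url.to_string(),
        },
        // Images are returned as raw bytes, untouched.
        m if m.starts_with("image/") => Resource::Image {
            data: body,
            mime_type: m.to_string(),
            url: url.to_string(),
        },
        // Everything else is also passed through as raw bytes.
        m => Resource::Other {
            data: body,
            mime_type: m.to_string(),
            url: url.to_string(),
        },
    }
}
```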

Acceptance Criteria

Can fetch HTML pages over HTTP and HTTPS
Content-Type parsing extracts MIME type and charset
Text resources are decoded using detected/specified encoding
Binary resources (images) returned as raw bytes
Relative URL resolution works with base URL
Error types cover network, protocol, and HTTP status errors
No external dependencies and no unsafe code
Integration tests (can be skipped in CI if no network)
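
For the error-coverage criterion, a `LoadError` might look roughly like the sketch below. The variant names, the `String` payloads, and the `check_status` helper are all assumptions made for illustration; the issue only names the categories (network, DNS, TLS, protocol, HTTP status).

```rust
use std::fmt;

/// Hypothetical shape for `LoadError`; variant names are illustrative only.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    /// Connection-level failure (refused, reset, timed out).
    Network(String),
    /// Host name could not be resolved.
    Dns(String),
    /// TLS handshake or certificate failure.
    Tls(String),
    /// Malformed or unexpected HTTP response.
    Protocol(String),
    /// Response arrived but carried a 4xx or 5xx status.
    HttpStatus(u16),
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::Network(msg) => write!(f, "network error: {}", msg),
            LoadError::Dns(msg) => write!(f, "DNS failure: {}", msg),
            LoadError::Tls(msg) => write!(f, "TLS error: {}", msg),
            LoadError::Protocol(msg) => write!(f, "protocol error: {}", msg),
            LoadError::HttpStatus(code) => write!(f, "HTTP status {}", code),
        }
    }
}

impl std::error::Error for LoadError {}

/// Treat 4xx and 5xx status codes as errors; accept everything else.
pub fn check_status(status: u16) -> Result<(), LoadError> {
    if (400..600).contains(&status) {
        Err(LoadError::HttpStatus(status))
    } else {
        Ok(())
    }
}
```

Keeping the status code in its own variant lets callers distinguish "the server answered with an error" from "no usable answer arrived", which matters when deciding whether a retry makes sense.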

Dependencies

Depends on: Encoding sniffing (for charset detection), net crate (already complete)