# html5rw

Pure OCaml HTML5 parser compiled to JavaScript and WebAssembly via js_of_ocaml.

**Note: This package is browser-only.** It uses DOM APIs and browser events for initialization and cannot be used in Node.js.

This is a fully compliant HTML5 parser implementing the [WHATWG HTML5 specification](https://html.spec.whatwg.org/multipage/parsing.html), passing the html5lib-tests conformance suite. It is based on transpiling into OCaml.

## Installation

```bash
npm install html5rw-jsoo
```

## Usage (Browser Only)

### JavaScript Version

```html
```

### WebAssembly Version

```html
```

### Web Worker (Background Validation)

For non-blocking HTML validation in a separate thread:

```javascript
const worker = new Worker('node_modules/html5rw/htmlrw-worker.js');
worker.onmessage = (e) => {
  console.log('Validation result:', e.data);
};
worker.postMessage({ html: 'Hello' });
```

WASM version:

```javascript
const worker = new Worker('node_modules/html5rw/htmlrw-worker.wasm.js');
```

## Files Included

| File | Description |
|------|-------------|
| `htmlrw.js` | Main library (JavaScript) |
| `htmlrw.wasm.js` | Main library (WebAssembly loader) |
| `htmlrw-worker.js` | Web Worker (JavaScript) |
| `htmlrw-worker.wasm.js` | Web Worker (WebAssembly loader) |
| `htmlrw-tests.js` | Browser test runner (JavaScript) |
| `htmlrw-tests.wasm.js` | Browser test runner (WebAssembly loader) |
| `htmlrw_js_main.bc.wasm.assets/` | WASM modules for main library |
| `htmlrw_js_worker.bc.wasm.assets/` | WASM modules for web worker |
| `htmlrw_js_tests_main.bc.wasm.assets/` | WASM modules for test runner |

## Features

- Full HTML5 parsing per WHATWG specification
- Encoding detection and conversion
- Error recovery (like browsers)
- CSS selector queries
- DOM manipulation
- HTML serialization

## Source Code

The OCaml source code is available on the `main` branch: https://tangled.org/anil.recoil.org/ocaml-html5rw

## License

MIT
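
## Example: Promise Wrapper for the Worker

The Web Worker usage above relies on a bare `onmessage` callback. A minimal sketch of how that round trip might be wrapped in a Promise is shown here. The helper name `validateInWorker` is hypothetical and not part of this package, and the sketch assumes the worker answers each `{ html }` request with exactly one message. The shape of the result object is not documented here, so it is passed through untouched.

```javascript
// Hypothetical helper (not part of html5rw): wraps one request/response
// round trip with a worker-like object in a Promise.
function validateInWorker(worker, html) {
  return new Promise((resolve, reject) => {
    // Assumption: the worker replies with exactly one message per request.
    worker.onmessage = (e) => resolve(e.data);
    // Standard Worker error event: fires when the worker script fails to
    // load or throws an uncaught exception.
    worker.onerror = (err) => reject(err);
    // Same message shape as in the Web Worker section: { html: '...' }.
    worker.postMessage({ html });
  });
}

// Browser usage, with the worker path taken from the Web Worker section:
// const worker = new Worker('node_modules/html5rw/htmlrw-worker.js');
// validateInWorker(worker, '<p>Hello')
//   .then((result) => console.log('Validation result:', result))
//   .catch((err) => console.error('Worker failed:', err));
```

Because the handlers are reassigned on every call, this sketch only suits one request in flight at a time per worker. Concurrent requests would need some form of request ID, which the message format described above is not known to support.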