we (web engine): Experimental web browser project to understand the limits of Claude

JS Built-in: RegExp engine #100

Status: open, opened by pierrelf.com

Implement the RegExp built-in with a regular expression engine.

Scope

Build a regex engine from scratch (no external crates) and wire it into the RegExp built-in.

RegExp Parser

  • Parse regex patterns into an internal representation (NFA or direct bytecode)
  • Character classes: [abc], [a-z], [^abc], predefined (\d, \w, \s, .)
  • Quantifiers: *, +, ?, {n}, {n,}, {n,m}, each greedy by default or lazy with a trailing ?
  • Anchors: ^, $, \b, \B
  • Groups: capturing (), non-capturing (?:), named (?<name>)
  • Alternation: |
  • Backreferences: \1, \k<name>
  • Escape sequences: \n, \t, \uXXXX, \xHH
  • Lookahead: (?=), (?!) (lookbehind can be deferred)
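
As an illustration of the quantifier item above, here is a minimal sketch of parsing a quantifier suffix using only the standard library. The `Quantifier` struct and the `parse_quantifier` and `parse_count` functions are hypothetical names, not part of the project. The sketch assumes the caller has already parsed the atom and is positioned at the character that follows it, and it does not check that `min <= max`.

```rust
/// Hypothetical quantifier representation; the names are illustrative only.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Quantifier {
    pub min: u32,
    /// `None` means unbounded, as in `*`, `+` and `{n,}`.
    pub max: Option<u32>,
    pub greedy: bool,
}

/// Tries to parse a quantifier at the start of `input`.
/// Returns the quantifier and the number of bytes consumed, or `None`
/// when `input` does not start with a well-formed quantifier.
pub fn parse_quantifier(input: &str) -> Option<(Quantifier, usize)> {
    let bytes = input.as_bytes();
    let (min, max, mut len) = match *bytes.first()? {
        b'*' => (0, None, 1),
        b'+' => (1, None, 1),
        b'?' => (0, Some(1), 1),
        b'{' => {
            let close = input.find('}')?;
            let body = &input[1..close];
            let (min, max) = match body.split_once(',') {
                // {n}
                None => {
                    let n = parse_count(body)?;
                    (n, Some(n))
                }
                // {n,}
                Some((lo, "")) => (parse_count(lo)?, None),
                // {n,m}
                Some((lo, hi)) => (parse_count(lo)?, Some(parse_count(hi)?)),
            };
            (min, max, close + 1)
        }
        _ => return None,
    };
    // A trailing `?` switches the quantifier from greedy to lazy.
    let greedy = bytes.get(len) != Some(&b'?');
    if !greedy {
        len += 1;
    }
    Some((Quantifier { min, max, greedy }, len))
}

/// Accepts only non-empty runs of ASCII digits (so "+5" is rejected).
fn parse_count(s: &str) -> Option<u32> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}
```

For example, `parse_quantifier("{2,5}?")` yields a lazy quantifier with `min` 2 and `max` 5 and reports 6 bytes consumed. Returning `None` rather than an error leaves it to the caller to decide whether a malformed `{` is a syntax error or a literal brace.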

Regex Engine

  • NFA-based or backtracking matcher (note that backreferences cannot be matched by a pure NFA simulation and need backtracking)
  • Support greedy and lazy quantifiers
  • Capture group extraction
  • Global and sticky matching (track lastIndex)
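
One possible shape for the backtracking option is a continuation-passing matcher over a small AST. The sketch below is an assumption about how this could look, not the project's design: `Node`, `Repeat`, `Caps`, `Cont`, `m`, `seq`, `iterate`, `rep` and `search` are all hypothetical names. It indexes by `char` rather than UTF-16 code units, covers only literals, `.`, concatenation, alternation, capture groups and quantifiers, and does not reset captures between loop iterations as JavaScript does.

```rust
/// Capture slots as (start, end) char indices; slot 0 is the whole match.
pub type Caps = Vec<Option<(usize, usize)>>;
/// Continuation: "the rest of the pattern", called with the current position.
type Cont<'a> = &'a mut dyn FnMut(usize, &mut Caps) -> bool;

pub struct Repeat {
    pub node: Box<Node>,
    pub min: u32,
    pub max: Option<u32>,
    pub greedy: bool,
}

pub enum Node {
    Char(char),
    Any,
    Concat(Vec<Node>),
    Alt(Box<Node>, Box<Node>),
    Group(usize, Box<Node>),
    Repeat(Repeat),
}

fn m(node: &Node, s: &[char], i: usize, caps: &mut Caps, k: Cont<'_>) -> bool {
    match node {
        Node::Char(c) => i < s.len() && s[i] == *c && k(i + 1, caps),
        // `.` without the s flag does not match line terminators.
        Node::Any => {
            i < s.len()
                && !matches!(s[i], '\n' | '\r' | '\u{2028}' | '\u{2029}')
                && k(i + 1, caps)
        }
        Node::Concat(nodes) => seq(nodes, s, i, caps, k),
        // Try the left branch first; fall back to the right on failure.
        Node::Alt(a, b) => m(a, s, i, caps, k) || m(b, s, i, caps, k),
        Node::Group(idx, inner) => {
            let (idx, saved) = (*idx, caps[*idx]);
            let ok = m(inner, s, i, caps, &mut |j: usize, caps: &mut Caps| {
                caps[idx] = Some((i, j));
                k(j, caps)
            });
            if !ok {
                caps[idx] = saved; // undo on backtrack
            }
            ok
        }
        Node::Repeat(r) => rep(r, 0, s, i, caps, k),
    }
}

fn seq(nodes: &[Node], s: &[char], i: usize, caps: &mut Caps, k: Cont<'_>) -> bool {
    match nodes.split_first() {
        None => k(i, caps),
        Some((first, rest)) => m(first, s, i, caps, &mut |j: usize, caps: &mut Caps| {
            seq(rest, s, j, caps, k)
        }),
    }
}

/// Matches one more iteration, then continues the loop. Past `min`, an
/// iteration that consumes nothing is rejected so `(a*)*` terminates.
fn iterate(r: &Repeat, count: u32, s: &[char], i: usize, caps: &mut Caps, k: Cont<'_>) -> bool {
    m(&r.node, s, i, caps, &mut |j: usize, caps: &mut Caps| {
        (j > i || count < r.min) && rep(r, count + 1, s, j, caps, k)
    })
}

fn rep(r: &Repeat, count: u32, s: &[char], i: usize, caps: &mut Caps, k: Cont<'_>) -> bool {
    if count < r.min {
        return iterate(r, count, s, i, caps, k);
    }
    let can_more = r.max.map_or(true, |mx| count < mx);
    if r.greedy {
        // Greedy: prefer another iteration, then try stopping here.
        (can_more && iterate(r, count, s, i, caps, k)) || k(i, caps)
    } else {
        // Lazy: prefer stopping here, then try another iteration.
        k(i, caps) || (can_more && iterate(r, count, s, i, caps, k))
    }
}

/// Finds the leftmost match; `groups` is the number of capture groups.
pub fn search(node: &Node, groups: usize, s: &[char]) -> Option<Caps> {
    for start in 0..=s.len() {
        let mut caps: Caps = vec![None; groups + 1];
        let mut end = None;
        if m(node, s, start, &mut caps, &mut |j: usize, _: &mut Caps| {
            end = Some(j);
            true
        }) {
            caps[0] = end.map(|e| (start, e));
            return Some(caps);
        }
    }
    None
}
```

The only difference between greedy and lazy matching in this sketch is the order in which `rep` tries "one more iteration" versus "continue with the rest of the pattern". Using `dyn FnMut` for the continuation keeps the recursion from generating an unbounded number of closure types.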

RegExp Built-in

  • RegExp(pattern, flags) constructor
  • Flags: g (global), i (ignoreCase), m (multiline), s (dotAll), u (unicode), y (sticky)
  • RegExp.prototype.test(string) — returns boolean
  • RegExp.prototype.exec(string) — returns match array or null
  • RegExp.prototype.toString() — "/pattern/flags"
  • Properties: source, flags, global, ignoreCase, multiline, dotAll, unicode, sticky, lastIndex
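
The flag handling above could be sketched as follows. `Flags`, `Flags::parse`, `to_flags_string` and `to_source_string` are hypothetical names chosen for illustration. The sketch covers only the six flags listed in this issue, reports errors as plain strings where a real engine would throw a SyntaxError, and emits flags in the order g, i, m, s, u, y, which is the relative order the `flags` getter uses for these flags in ECMAScript.

```rust
/// Hypothetical flag set for the six flags named in this issue.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Flags {
    pub global: bool,
    pub ignore_case: bool,
    pub multiline: bool,
    pub dot_all: bool,
    pub unicode: bool,
    pub sticky: bool,
}

impl Flags {
    /// Parses a flags string, rejecting unknown and duplicate flags.
    pub fn parse(src: &str) -> Result<Flags, String> {
        let mut f = Flags::default();
        for c in src.chars() {
            let slot = match c {
                'g' => &mut f.global,
                'i' => &mut f.ignore_case,
                'm' => &mut f.multiline,
                's' => &mut f.dot_all,
                'u' => &mut f.unicode,
                'y' => &mut f.sticky,
                _ => return Err(format!("invalid regular expression flag '{}'", c)),
            };
            if *slot {
                return Err(format!("duplicate regular expression flag '{}'", c));
            }
            *slot = true;
        }
        Ok(f)
    }

    /// Renders the flags in canonical order, independent of input order.
    pub fn to_flags_string(self) -> String {
        [
            ('g', self.global),
            ('i', self.ignore_case),
            ('m', self.multiline),
            ('s', self.dot_all),
            ('u', self.unicode),
            ('y', self.sticky),
        ]
        .iter()
        .filter(|(_, on)| *on)
        .map(|(c, _)| *c)
        .collect()
    }
}

/// Builds the "/pattern/flags" form used by toString().
pub fn to_source_string(source: &str, flags: Flags) -> String {
    format!("/{}/{}", source, flags.to_flags_string())
}
```

With this sketch, `Flags::parse("yig")` succeeds and renders as `giy`, while `Flags::parse("gg")` and `Flags::parse("x")` both return an error.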

String RegExp Integration

  • String.prototype.match(regexp)
  • String.prototype.matchAll(regexp)
  • String.prototype.replace(regexp, replacement)
  • String.prototype.search(regexp)
  • String.prototype.split(regexp)
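
The global variants of these methods share one loop: search from lastIndex, record the match, and advance, stepping one position past an empty match so the loop cannot stall. The sketch below shows that loop with the engine abstracted away as a `find` callback. `match_ranges` and `replace_all` are hypothetical names, the callback signature is an assumption, positions are `char` indices rather than UTF-16 code units, and the replacement is inserted literally with no `$1`-style substitution.

```rust
/// Collects all non-overlapping matches. `find(s, from)` is assumed to
/// return the leftmost match starting at or after `from`, as (start, end).
pub fn match_ranges<F>(s: &[char], find: F) -> Vec<(usize, usize)>
where
    F: Fn(&[char], usize) -> Option<(usize, usize)>,
{
    let mut out = Vec::new();
    let mut last_index = 0;
    while last_index <= s.len() {
        match find(s, last_index) {
            Some((start, end)) => {
                out.push((start, end));
                // An empty match would otherwise be found again forever.
                last_index = if end == start { end + 1 } else { end };
            }
            None => break,
        }
    }
    out
}

/// Replaces every match with `replacement`, copied in literally.
pub fn replace_all<F>(input: &str, find: F, replacement: &str) -> String
where
    F: Fn(&[char], usize) -> Option<(usize, usize)>,
{
    let s: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut copied = 0; // everything before this index is already in `out`
    for (start, end) in match_ranges(&s, find) {
        out.extend(&s[copied..start]);
        out.push_str(replacement);
        copied = end;
    }
    out.extend(&s[copied..]);
    out
}
```

With a callback that always reports an empty match at the requested position, `replace_all("ab", ..., "-")` produces `-a-b-`, which is what `"ab".replace(/(?:)/g, "-")` returns in JavaScript.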

Acceptance Criteria

  • Regex parser handles all pattern syntax listed under RegExp Parser
  • Engine matches character classes, quantifiers, anchors correctly
  • Capture groups return correct matches
  • Flags (g, i, m, s) work correctly
  • RegExp.prototype.exec and test work
  • String methods (match, replace, search, split) work with RegExp
  • Global matching advances lastIndex correctly
  • Unit tests with diverse regex patterns

Phase 10 — JavaScript Engine (issue 11 of 15). Depends on: JS primitive built-ins.

AT URI
at://did:plc:meotu43t6usg4qdwzenk4s2t/sh.tangled.repo.issue/3mhn3o5n5fm2l