···1010 <br />
1111</div>
12121313-Leveraging the power of sticky regexes and Babel code generation, `reghex` allows
1313+Leveraging the power of sticky regexes and JS code generation, `reghex` allows
1414you to code parsers quickly, by surrounding regular expressions with a regex-like
1515[DSL](https://en.wikipedia.org/wiki/Domain-specific_language).
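
Since sticky regexes are central to this approach, a brief illustration may help. The sketch below uses only the standard JavaScript `y` flag and `lastIndex`; the `matchAt` helper is a hypothetical name for illustration and isn't part of `reghex`.

```js
// Hypothetical helper (not part of reghex): try to match `regex` at exactly
// position `index` of `input`, without scanning ahead.
function matchAt(regex, input, index) {
  // The sticky flag (`y`) anchors the match at `lastIndex`.
  const sticky = new RegExp(regex.source, 'y');
  sticky.lastIndex = index;
  const result = sticky.exec(input);
  // On success, `sticky.lastIndex` has advanced past the matched text.
  return result ? { text: result[0], end: sticky.lastIndex } : null;
}

const word = matchAt(/\w+/, 'hello world', 0); // matches "hello", ends at 5
const gap = matchAt(/\w+/, 'hello world', 5); // null: position 5 is a space
const next = matchAt(/\w+/, 'hello world', 6); // matches "world", ends at 11
```

Because a failed sticky match never scans forward, a parser built this way can try one pattern after another at the same position and only advance when something actually matches.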
1616···3030npm install --save reghex
3131```
32323333-##### 2. Add the plugin to your Babel configuration (`.babelrc`, `babel.config.js`, or `package.json:babel`)
3333+##### 2. Add the plugin to your Babel configuration _(optional)_
3434+3535+In your `.babelrc`, `babel.config.js`, or `package.json:babel` add:
34363537```json
3638{
···4143Alternatively, you can set up [`babel-plugin-macros`](https://github.com/kentcdodds/babel-plugin-macros) and
4244import `reghex` from `"reghex/macro"` instead.
43454646+This step is **optional**. `reghex` can also generate its optimised JS code during runtime.
4747+This will only incur a tiny parsing cost on initialisation, but due to the JIT of modern
4848+JS engines there won't be any difference in performance between pre-compiled and runtime-compiled
4949+versions otherwise.
5050+5151+Since the `reghex` runtime is rather small, for larger grammars it may even make sense not
5252+to precompile the matchers at all. For this case you may pass the `{ "codegen": false }`
5353+option to the Babel plugin, which will minify the `reghex` matcher templates without
5454+precompiling them.
5555+4456##### 3. Have fun writing parsers!
45574658```js
4747-import match, { parse } from 'reghex';
5959+import { match, parse } from 'reghex';
48604961const name = match('name')`
5062 ${/\w+/}
···99111100112## Authoring Guide
101113102102-You can write "matchers" by importing the default import from `reghex` and
114114+You can write "matchers" by importing the `match` import from `reghex` and
103115using it to write a matcher expression.
104116105117```js
106106-import match from 'reghex';
118118+import { match } from 'reghex';
107119108120const name = match('name')`
109121 ${/\w+/}
110122`;
111123```
112124113113-As can be seen above, the `match` function, which is what we've called the
114114-default import, is called with a "node name" and is then called as a tagged
115115-template. This template is our **parsing definition**.
125125+As can be seen above, the `match` function is called with a "node name" and
126126+is then called as a tagged template. This template is our **parsing definition**.
116127117128`reghex` functions only with its Babel plugin, which will detect `match('name')`
118129and replace the entire tag with a parsing function, which may then look like
···161172Let's extend our original example:
162173163174```js
164164-import match from 'reghex';
175175+import { match } from 'reghex';
165176166177const name = match('name')`
167178 ${/\w+/}
···193204*/
194205```
195206207207+Furthermore, interpolations don't have to just be RegHex matchers. They can
208208+also be functions returning matchers or completely custom matching functions.
209209+This is useful when your DSL becomes _self-referential_, i.e. when matchers
210210+start referencing each other, forming a loop. To fix this we can create a
211211+function that returns our root matcher:
212212+213213+```js
214214+import { match } from 'reghex';
215215+216216+const value = match('value')`
217217+ (${/\w+/} | ${() => root})+
218218+`;
219219+220220+const root = match('root')`
221221+ ${/root/}+ ${value}
222222+`;
223223+```
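
To see why wrapping `root` in a function helps, the sketch below reproduces the situation in plain JavaScript, without `reghex`. The `describe` helper and the object shapes are hypothetical, chosen only to show that a function defers the lookup until both `const` bindings exist.

```js
// Referencing `root` directly here would throw a ReferenceError, because
// `const root` has not been initialised yet. A function defers the lookup.
const value = { name: 'value', child: () => root };

const root = { name: 'root', child: () => value };

// Hypothetical helper: resolve the lazy reference only when it is needed.
function describe(node) {
  return `${node.name} -> ${node.child().name}`;
}

const a = describe(value); // "value -> root"
const b = describe(root); // "root -> value"
```

The same reasoning applies to `${() => root}` in the grammar: the arrow function is only called once matching starts, by which point `root` has been defined.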
224224+196225### Regex-like DSL
197226198227We've seen in the previous examples that matchers are authored using tagged
···208237in the parsed string. This is just one feature of the regex-like DSL. The
209238available operators are the following:
210239211211-| Operator | Example | Description |
212212-| -------- | ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
213213-| `?` | `${/1/}?` | An **optional** may be used to make an interpolation optional. This means that the interpolation may or may not match. |
214214-| `*` | `${/1/}*` | A **star** can be used to match an arbitrary amount of interpolation or none at all. This means that the interpolation may repeat itself or may not be matched at all. |
215215-| `+` | `${/1/}+` | A **plus** is used like `*` and must match one or more times. When the matcher doesn't match, that's considered a failing case, since the match isn't optional. |
216216-| `\|` | `${/1/} \| ${/2/}` | An **alternation** can be used to match either one thing or another, falling back when the first interpolation fails. |
217217-| `()` | `(${/1/} ${/2/})+` | A **group** can be used to apply one of the other operators to an entire group of interpolations. |
240240+| Operator | Example | Description |
241241+| -------- | ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
242242+| `?` | `${/1/}?` | An **optional** may be used to make an interpolation optional. This means that the interpolation may or may not match. |
243243+| `*` | `${/1/}*` | A **star** can be used to match an arbitrary amount of interpolation or none at all. This means that the interpolation may repeat itself or may not be matched at all. |
244244+| `+` | `${/1/}+` | A **plus** is used like `*` and must match one or more times. When the matcher doesn't match, that's considered a failing case, since the match isn't optional. |
245245+| `\|` | `${/1/} \| ${/2/}` | An **alternation** can be used to match either one thing or another, falling back when the first interpolation fails. |
246246+| `()` | `(${/1/} ${/2/})+` | A **group** can be used to apply one of the other operators to an entire group of interpolations. |
218247| `(?: )` | `(?: ${/1/})` | A **non-capturing group** is like a regular group, but the interpolations matched inside it don't appear in the parser's output. |
219219-| `(?= )` | `(?= ${/1/})` | A **positive lookahead** checks whether interpolations match, and if so continues the matcher without changing the input. If it matches, it's essentially ignored. |
248248+| `(?= )` | `(?= ${/1/})` | A **positive lookahead** checks whether interpolations match, and if so continues the matcher without changing the input. If it matches, it's essentially ignored. |
220249| `(?! )` | `(?! ${/1/})` | A **negative lookahead** checks whether interpolations _don't_ match, and if so continues the matcher without changing the input. If the interpolations do match the matcher is aborted. |
251251+A couple of operators also support "shorthands" that allow you to write
252252+lookaheads or non-capturing groups a little quicker.
253253+254254+| Shorthand | Example | Description |
255255+| --------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
256256+| `:` | `:${/1/}` | A **non-capturing group** is like a regular group, but the interpolations matched inside it don't appear in the parser's output. |
257257+| `=` | `=${/1/}` | A **positive lookahead** checks whether interpolations match, and if so continues the matcher without changing the input. If it matches, it's essentially ignored. |
258258+| `!` | `!${/1/}` | A **negative lookahead** checks whether interpolations _don't_ match, and if so continues the matcher without changing the input. If the interpolations do match the matcher is aborted. |
221259222260We can combine and compose these operators to create more complex matchers.
223261For instance, we can extend the original example to only allow a specific set
···316354317355We've now entirely changed the output of the parser for this matcher. Given that each
318356matcher can change its output, we're free to change the parser's output entirely.
319319-By **returning a falsy value** in this matcher, we can also change the matcher to not have
320320-matched, which would cause other matchers to treat it like a mismatch!
357357+By returning `null` or `undefined` in this matcher, we can also change the matcher
358358+to not have matched, which would cause other matchers to treat it like a mismatch!
321359322360```js
323323-import match, { parse } from 'reghex';
361361+import { match, parse } from 'reghex';
324362325363const name = match('name')((x) => {
326364 return x[0] !== 'tim' ? x : undefined;
···345383tag(['test'], 'node_name');
346384// ["test", .tag = "node_name"]
347385```
386386+387387+### Tagged Template Parsing
388388+389389+Any grammar in RegHex can also be used to parse a tagged template literal.
390390+A tagged template literal consists of a list of literals alternating with
391391+a list of "interpolations".
393393+In RegHex we can add an `interpolation` matcher to our grammars to allow them
394394+to parse interpolations in a template literal.
395395+396396+```js
397397+import { match, parse, interpolation } from 'reghex';
398398+399399+const anyNumber = interpolation((x) => typeof x === 'number');
400400+401401+const num = match('num')`
402402+ ${/[+-]?/} ${anyNumber}
403403+`;
404404+405405+parse(num)`+${42}`;
406406+// ["+", 42, .tag = "num"]
407407+```
408408+409409+This grammar now allows us to match arbitrary values if they're input into the
410410+parser. We can now call our grammar using a tagged template literal itself
411411+to parse this.
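
The alternating structure described above can be observed with a plain tag function. This is standard JavaScript rather than `reghex` internals; `inspect` is a hypothetical name used only for this sketch.

```js
// A plain tag function: JavaScript passes the literal parts as the first
// argument and each interpolated value as a further argument.
function inspect(literals, ...interpolations) {
  return { literals: [...literals], interpolations };
}

const parts = inspect`+${42}`;
// parts.literals is ["+", ""] and parts.interpolations is [42]
```

There is always one more literal than there are interpolations, which is why the trailing literal is an empty string here.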
348412349413**That's it! May the RegExp be ever in your favor.**