Markdown parser fork with extended syntax for personal use.
//! Character escapes occur in the [string][] and [text][] content types.
//!
//! ## Grammar
//!
//! Character escapes form with the following BNF
//! (<small>see [construct][crate::construct] for character groups</small>):
//!
//! ```bnf
//! character_escape ::= '\\' ascii_punctuation
//! ```
//!
//! Like much of markdown, there are no “invalid” character escapes: a lone
//! backslash, or a backslash followed by anything other than an ASCII
//! punctuation character, is just a literal backslash.
//!
//! To escape other characters, use a [character reference][character_reference]
//! instead (as in, `&amp;`, `&#123;`, or say `&#9;`).
//!
//! It is also possible to escape a line ending in text with a similar
//! construct: a [hard break (escape)][hard_break_escape] is a backslash followed
//! by a line ending (that is part of the construct instead of ending it).
//!
//! ## Recommendation
//!
//! If possible, use a character escape.
//! Otherwise, use a character reference.
//!
//! ## Tokens
//!
//! * [`CharacterEscape`][Name::CharacterEscape]
//! * [`CharacterEscapeMarker`][Name::CharacterEscapeMarker]
//! * [`CharacterEscapeValue`][Name::CharacterEscapeValue]
//!
//! ## References
//!
//! * [`character-escape.js` in `micromark`](https://github.com/micromark/micromark/blob/main/packages/micromark-core-commonmark/dev/lib/character-escape.js)
//! * [*§ 2.4 Backslash escapes* in `CommonMark`](https://spec.commonmark.org/0.31/#backslash-escapes)
//!
//! [string]: crate::construct::string
//! [text]: crate::construct::text
//! [character_reference]: crate::construct::character_reference
//! [hard_break_escape]: crate::construct::hard_break_escape

use crate::event::Name;
use crate::state::{Name as StateName, State};
use crate::tokenizer::Tokenizer;

/// Start of character escape.
///
/// ```markdown
/// > | a\*b
///      ^
/// ```
pub fn start(tokenizer: &mut Tokenizer) -> State {
    if tokenizer.parse_state.options.constructs.character_escape && tokenizer.current == Some(b'\\')
    {
        tokenizer.enter(Name::CharacterEscape);
        tokenizer.enter(Name::CharacterEscapeMarker);
        tokenizer.consume();
        tokenizer.exit(Name::CharacterEscapeMarker);
        State::Next(StateName::CharacterEscapeInside)
    } else {
        State::Nok
    }
}

/// After `\`, at punctuation.
///
/// ```markdown
/// > | a\*b
///       ^
/// ```
pub fn inside(tokenizer: &mut Tokenizer) -> State {
    match tokenizer.current {
        // ASCII punctuation.
        Some(b'!'..=b'/' | b':'..=b'@' | b'['..=b'`' | b'{'..=b'~') => {
            tokenizer.enter(Name::CharacterEscapeValue);
            tokenizer.consume();
            tokenizer.exit(Name::CharacterEscapeValue);
            tokenizer.exit(Name::CharacterEscape);
            State::Ok
        }
        _ => State::Nok,
    }
}
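
// Illustrative sketch only, not part of the tokenizer's state machine: the
// grammar `character_escape ::= '\\' ascii_punctuation` restated as a
// standalone check over a byte slice. The name `character_escape_len` and its
// signature are hypothetical and exist only for this example. It assumes the
// construct is enabled and relies on `u8::is_ascii_punctuation`, which covers
// the same four byte ranges that `inside` matches.
//
// For `a\*b` it reports a two-byte escape at index 1; for `a\b` or a trailing
// backslash it reports `None`, so the backslash stays literal.
#[allow(dead_code)]
fn character_escape_len(bytes: &[u8], index: usize) -> Option<usize> {
    // `get(index..)` yields `None` when `index` is past the end of the input.
    match bytes.get(index..)? {
        // Marker (`\`) followed by a value (ASCII punctuation): two bytes.
        [b'\\', value, ..] if value.is_ascii_punctuation() => Some(2),
        // Anything else is not a character escape.
        _ => None,
    }
}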