Markdown parser fork with extended syntax for personal use.
//! Constructs found in markdown.
//!
//! Constructs are grouped by content type.
//! The content type allowed in a given place determines which constructs are
//! allowed there.
//!
//! ## Content type
//!
//! The following content types are found in markdown:
//!
//! * [document][]
//! * [flow][]
//! * [string][]
//! * [text][]
//!
//! Each content type also has a *rest* construct: what remains after all
//! other constructs are parsed.
//! In document, that is [flow][].
//! In flow, that is [content][].
//! In string and text, that is [data][partial_data].
//!
//! ## Construct
//!
//! There are several *things* found when parsing markdown, such as, say, a
//! thematic break.
//! These things are called constructs here.
//!
//! Sometimes, there are several constructs that result in an equivalent thing.
//! For example, [code (fenced)][raw_flow] and
//! [code (indented)][code_indented] are considered different constructs.
//! Sometimes, constructs on their own don’t result in anything.
//! For example, a `*` is parsed as an attention sequence, but later, when no
//! matching sequence is found, it’s turned back into plain data.
//!
//! The following constructs are found in markdown (`CommonMark`):
//!
//! * [attention][] (strong, emphasis, extension: GFM strikethrough)
//! * [autolink][]
//! * [blank line][blank_line]
//! * [block quote][block_quote]
//! * [character escape][character_escape]
//! * [character reference][character_reference]
//! * [code (indented)][code_indented]
//! * [content][]
//! * [definition][]
//! * [hard break (escape)][hard_break_escape]
//! * [heading (atx)][heading_atx]
//! * [heading (setext)][heading_setext]
//! * [html (flow)][html_flow]
//! * [html (text)][html_text]
//! * [label end][label_end]
//! * [label start (image)][label_start_image]
//! * [label start (link)][label_start_link]
//! * [list item][list_item]
//! * [paragraph][]
//! * [raw (flow)][raw_flow] (code (fenced), extensions: math (flow))
//! * [raw (text)][raw_text] (code (text), extensions: math (text))
//! * [thematic break][thematic_break]
//!
//! > 👉 **Note**: for performance reasons, hard break (trailing) is formed by
//! > [whitespace][partial_whitespace].
//!
//! The following constructs are extensions found in markdown:
//!
//! * [frontmatter][]
//! * [gfm autolink literal][gfm_autolink_literal]
//! * [gfm footnote definition][gfm_footnote_definition]
//! * [gfm label start footnote][gfm_label_start_footnote]
//! * [gfm table][gfm_table]
//! * [gfm task list item check][gfm_task_list_item_check]
//! * [mdx esm][mdx_esm]
//! * [mdx expression (flow)][mdx_expression_flow]
//! * [mdx expression (text)][mdx_expression_text]
//! * [mdx jsx (flow)][mdx_jsx_flow]
//! * [mdx jsx (text)][mdx_jsx_text]
//!
//! There are also several small subroutines typically used in different places:
//!
//! * [bom][partial_bom]
//! * [data][partial_data]
//! * [destination][partial_destination]
//! * [label][partial_label]
//! * [mdx expression][partial_mdx_expression]
//! * [mdx jsx][partial_mdx_jsx]
//! * [non lazy continuation][partial_non_lazy_continuation]
//! * [space or tab][partial_space_or_tab]
//! * [space or tab, eol][partial_space_or_tab_eol]
//! * [title][partial_title]
//! * [whitespace][partial_whitespace]
//!
//! ## Grammar
//!
//! Each construct maintained here is explained with a BNF diagram.
//!
//! Such diagrams are considered to be *non-normative*.
//! That is to say, they form illustrative, imperfect, but useful, examples.
//! The code, in Rust, is considered to be normative.
//!
//! The actual syntax of markdown can be described in Backus–Naur form (BNF) as:
//!
//! ```bnf
//! markdown = .*
//! ```
//!
//! No, that’s [not a typo][bnf]: markdown has no syntax errors; anything
//! thrown at it renders *something*.
//!
//! These diagrams contain references to character groups as defined by Rust
//! on, for example, [char][], but also often on [u8][], which is what
//! `micromark-rs` typically works on.
//! So, for example, `ascii_punctuation` refers to
//! [`u8::is_ascii_punctuation`][u8::is_ascii_punctuation].
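//!
//! As a minimal sketch (not code from this crate), the ASCII groups named in
//! the diagrams can be mapped onto the standard `u8` methods.
//! The helper `ascii_group` is hypothetical and exists only for illustration:
//!
//! ```rust
//! /// Name the first ASCII group, as used in the diagrams, that `byte`
//! /// belongs to, or `"other"` when none of the checked groups match.
//! fn ascii_group(byte: u8) -> &'static str {
//!     if byte.is_ascii_digit() {
//!         "ascii_digit"
//!     } else if byte.is_ascii_alphabetic() {
//!         "ascii_alphabetic"
//!     } else if byte.is_ascii_punctuation() {
//!         "ascii_punctuation"
//!     } else if byte.is_ascii_control() {
//!         "ascii_control"
//!     } else {
//!         // For example a space (0x20) or any non-ASCII byte.
//!         "other"
//!     }
//! }
//! ```
//!
//! With this helper, `ascii_group(b'*')` yields `"ascii_punctuation"` and
//! `ascii_group(b'\t')` yields `"ascii_control"`.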
//!
//! For clarity, the productions used throughout are:
//!
//! ```bnf
//! ; Rust / ASCII groups:
//! ; 'a'..='z'
//! ascii_lowercase ::= 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm' | 'n' | 'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
//! ; 'A'..='Z'
//! ascii_uppercase ::= 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z'
//! ; 'A'..='Z', 'a'..='z'
//! ascii_alphabetic ::= ascii_lowercase | ascii_uppercase
//! ; '0'..='9'
//! ascii_digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
//! ; '0'..='9', 'A'..='F', 'a'..='f'
//! ascii_hexdigit ::= ascii_digit | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
//! ; '0'..='9', 'A'..='Z', 'a'..='z'
//! ascii_alphanumeric ::= ascii_digit | ascii_alphabetic
//! ; '!'..='/', ':'..='@', '['..='`', '{'..='~'
//! ascii_punctuation ::= '!' | '"' | '#' | '$' | '%' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | '-' | '.' | '/' | ':' | ';' | '<' | '=' | '>' | '?' | '@' | '[' | '\' | ']' | '^' | '_' | '`' | '{' | '|' | '}' | '~'
//! ; 0x00..=0x1F, 0x7F
//! ascii_control ::= 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0A | 0x0B | 0x0C | 0x0D | 0x0E | 0x0F | 0x10 | 0x11 | 0x12 | 0x13 | 0x14 | 0x15 | 0x16 | 0x17 | 0x18 | 0x19 | 0x1A | 0x1B | 0x1C | 0x1D | 0x1E | 0x1F | 0x7F
//!
//! ; Markdown groups:
//! ; Any byte (u8)
//! byte ::= 0x00..=0xFF
//! space_or_tab ::= '\t' | ' '
//! eol ::= '\n' | '\r' | '\r\n'
//! line ::= byte - eol
//! text ::= line - space_or_tab
//! space_or_tab_eol ::= 1*space_or_tab | *space_or_tab eol *space_or_tab
//!
//! ; Unicode groups:
//! unicode_whitespace ::= ? ; See `char::is_whitespace`.
//! unicode_punctuation ::= ? ; See `src/unicode.rs`.
//! ```
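//!
//! To make the `eol` production concrete, here is a small sketch that assumes
//! a byte slice as input.
//! The function `eol_len` is hypothetical and not part of this crate:
//!
//! ```rust
//! /// Return the byte length of the `eol` at the start of `bytes`, if any.
//! /// `\r\n` is matched first so that it counts as one line ending, not two.
//! fn eol_len(bytes: &[u8]) -> Option<usize> {
//!     match bytes {
//!         [b'\r', b'\n', ..] => Some(2),
//!         [b'\n', ..] | [b'\r', ..] => Some(1),
//!         _ => None,
//!     }
//! }
//! ```
//!
//! Here `eol_len(b"\r\nabc")` yields `Some(2)`, while `eol_len(b"abc")`
//! yields `None`.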
//!
//! [bnf]: http://trevorjim.com/a-specification-for-markdown/

pub mod attention;
pub mod autolink;
pub mod blank_line;
pub mod block_quote;
pub mod character_escape;
pub mod character_reference;
pub mod code_indented;
pub mod content;
pub mod definition;
pub mod document;
pub mod flow;
pub mod frontmatter;
pub mod gfm_autolink_literal;
pub mod gfm_footnote_definition;
pub mod gfm_label_start_footnote;
pub mod gfm_table;
pub mod gfm_task_list_item_check;
pub mod hard_break_escape;
pub mod heading_atx;
pub mod heading_setext;
pub mod html_flow;
pub mod html_text;
pub mod label_end;
pub mod label_start_image;
pub mod label_start_link;
pub mod list_item;
pub mod mdx_esm;
pub mod mdx_expression_flow;
pub mod mdx_expression_text;
pub mod mdx_jsx_flow;
pub mod mdx_jsx_text;
pub mod paragraph;
pub mod partial_bom;
pub mod partial_data;
pub mod partial_destination;
pub mod partial_label;
pub mod partial_mdx_expression;
pub mod partial_mdx_jsx;
pub mod partial_non_lazy_continuation;
pub mod partial_space_or_tab;
pub mod partial_space_or_tab_eol;
pub mod partial_title;
pub mod partial_whitespace;
pub mod raw_flow;
pub mod raw_text;
pub mod string;
pub mod text;
pub mod thematic_break;
pub mod wikilink;
pub mod wikilink_label;