write literate haskell programs in typst
cdn.oppi.li/typst-unlit.pdf
#set document(title: [Typst-Unlit])
#set par(justify: true)
#show raw.where(lang: "haskell"): set align(center)
#show raw.where(lang: "haskell-top"): set align(center)
#show title: set align(center)
#show <subtitle>: set align(center)

#title()

Write literate Haskell programs in Typst <subtitle>

_#link("https://tangled.org/@oppi.li/typst-unlit")[tangled.org/\@oppi.li/typst-unlit]_ <subtitle>

*Serves: 1 #h(20pt) Prep Time: 10min #h(20pt) Compile Time: 10ms* <subtitle>

A literate program is one where comments are first-class citizens, and code is explicitly demarcated, as opposed to a regular program, where comments are explicitly marked, and code is a first-class entity.

GHC supports literate programming out of the box, by using a preprocessor to extract code from documents. This preprocessor is known as _unlit_ #footnote[https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit]. GHC also supports _custom_ preprocessors, which can be passed in via the `-pgmL` flag. This very document you are reading is one such preprocessor program that allows extracting Haskell from Typst code (although it has been rendered to HTML, PDF or markdown depending on where you are reading it)#footnote[This document needs itself to compile itself! This is why a bootstrap program is included.]!

This recipe not only gives you a fish (the typst-unlit preprocessor), but also teaches you how to fish (write your own preprocessors).

= Ingredients

#table(
  columns: (1fr, 1fr),
  gutter: 3pt,
  stroke: none,
  table.cell(inset: 10pt)[
    To write your own preprocessor:
    - GHC: the Glorious Haskell Compiler
    - Typst: to generate PDFs
    - And that's it! No stacking, shaking or caballing here.
  ],
  table.cell(inset: 10pt)[
    To compile this very document:
    - The bootstrap program
    - GHC: to produce an executable program
    - Typst: to produce a readable PDF
  ],
)

*Pro Tip:* If you're missing any ingredients, your local nixpkgs should stock them!

= Instructions

The idea behind the unlit program is super simple: iterate over the lines in the supplied input file and replace lines that aren't Haskell with an empty line! To detect lines that are Haskell, we look for the #raw("\u{0060}\u{0060}\u{0060}haskell") directive and stop at the end of the code fence. Simple enough! Annoyingly, Haskell requires that imports be declared at the top of the file. This results in literate Haskell programs always starting with a giant block of imports:

#set quote(block: true)
#quote(attribution: [Every literate programmer])[
```
  -- So first we need to get some boilerplate and imports out of the way.
```
]

Oh gee, if only we had a tool to put the important stuff first. Our preprocessor will remedy this wart, with the `haskell-top` directive to move blocks to the top. With that out of the way, let's move on to the program itself!

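
Before plating the full program, here is the core idea from above as a minimal sketch. It is not the preprocessor built in this recipe: `blankProse`, `fence` and `example` are made-up names, there is no `haskell-top` support, and fences must be exactly three backticks. The block is tagged `hs` rather than `haskell` so that typst-unlit leaves it out of the compiled program.

```hs
import Data.List (mapAccumL)

-- Hypothetical helper: a run of n backticks, so no literal fence appears here.
fence :: Int -> String
fence n = replicate n '`'

-- Simplified pass (an assumption-laden sketch, not typst-unlit itself).
blankProse :: [String] -> [String]
blankProse = snd . mapAccumL step False
  where
    -- The accumulator records whether we are inside a code fence.
    step :: Bool -> String -> (Bool, String)
    step inCode line
      | not inCode && line == fence 3 ++ "haskell" = (True, "")
      | inCode && line == fence 3                  = (False, "")
      | inCode                                     = (True, line)
      | otherwise                                  = (False, "")

example :: [String]
example = blankProse ["= Title", fence 3 ++ "haskell", "x = 1", fence 3, "prose"]
-- example == ["", "", "x = 1", "", ""]
```

Note how `example` has exactly as many lines as its input: prose is blanked rather than dropped, so the surviving code keeps its line numbers.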
#pagebreak()

== Step 1: The main course

I prefer starting with `main` but you do you. Any program that is passed to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input lhaskell source code
- `<outfile>`: the output Haskell source code

Invoke the runes to handle CLI arguments:

```haskell
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my imports _after_ the main function!):

```haskell-top
import System.Environment (getArgs)
import System.Exit (die)
```
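
If you would rather taste-test the argument handling on its own, the same pattern match can be written as a pure function. This is a sketch under assumptions, not part of typst-unlit: `parseArgs` is a hypothetical name, and the block is tagged `hs` so that the preprocessor skips it.

```hs
-- Hypothetical helper: the four-argument contract as a pure function,
-- so it can be checked without running any IO.
parseArgs :: [String] -> Either String (FilePath, FilePath)
parseArgs ["-h", _label, infile, outfile] = Right (infile, outfile)
parseArgs _ = Left "Usage: typst-unlit -h <label> <source> <destination>"

-- In main, this could be wired up as:
--   getArgs >>= either die (uncurry process) . parseArgs
```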

Now, we move on to defining `process`:

== Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments, and write to the output file. `removeComments`, however, is a pure function:

```haskell
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

== Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those lines that are not Haskell. To do so, we must track some state as we will be jumping in and out of code fences:

```haskell
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher functions. Here is one for the #raw("\u{0060}\u{0060}\u{0060}haskell") pattern:

```haskell
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match #raw("\u{0060}\u{0060}\u{0060}\u{0060}haskell"), and this is intentional. If your text already contains 3 backticks inside it, you will need 4 backticks in the code fence and so on.

We do the same exercise for `haskell-top`:

```haskell
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```haskell
isCodeEnd = withTag null
```
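
To get a feel for what the matchers accept, here are a few probes. This is only a sketch, with the definitions repeated so that it runs on its own; `fence`, `openers` and `closers` are names invented for the illustration, and the block is tagged `hs` so that typst-unlit skips it. Two things are worth noticing: an indented fence, or one with trailing whitespace, is not recognized, and `isCodeEnd` accepts any run of three or more backticks, no matter how many opened the block.

```hs
-- Matchers repeated from above so that this block runs on its own.
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell, isCodeEnd :: String -> Bool
isHaskell = withTag (== "haskell")
isCodeEnd = withTag null

-- Hypothetical helper: a run of n backticks, so no literal fence appears here.
fence :: Int -> String
fence n = replicate n '`'

openers :: [Bool]
openers = map isHaskell
  [ fence 3 ++ "haskell"          -- plain opening fence
  , fence 4 ++ "haskell"          -- longer fence
  , fence 2 ++ "haskell"          -- too short
  , "  " ++ fence 3 ++ "haskell"  -- indented
  , fence 3 ++ "haskell "         -- trailing space
  ]
-- openers == [True, True, False, False, False]

closers :: [Bool]
closers = map isCodeEnd [fence 3, fence 5, fence 3 ++ " "]
-- closers == [True, True, False]
```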

`removeComments` itself is just a filter that takes a list of lines and removes comments from those lines:

```haskell
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a list of input lines, and two more empty lists that are used to store the lines of code that go at the top (using the `haskell-top` directive), and the ones that go below, using the `haskell` directive:

```haskell
go :: State -> [String] -> [String] -> [String] -> [String]
```

When there are no input lines left, we just combine the `top` and `bottom` stacks of lines to form the file (both were built back to front, hence the `reverse`):

```haskell
go _ [] top bot = reverse top ++ reverse bot
```

Next, whenever we are `OutsideCode` and the current line contains a directive, we must update the state to enter a code block:

```haskell
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a triple-tick should exit the code block, and any other line encountered in the block is to be included in the final file, but below the imports:

```haskell
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive, lines encountered here go into the `top` stack:

```haskell
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```
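
To watch the two stacks at work, here is a small worked example. It is a sketch rather than part of the program: the definitions are repeated in condensed form so that the block runs on its own, `fence`, `input` and `output` are names made up for the illustration, and the block is tagged `hs` so that typst-unlit skips it. The output has as many lines as the input, but `main` moves from line 3 to line 4: each hoisted line pushes the code written before it down by one, so only code after the last `haskell-top` block keeps its original line numbers.

```hs
data State = OutsideCode | InHaskell | InHaskellTop

withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []

go :: State -> [String] -> [String] -> [String] -> [String]
go _ [] top bot = reverse top ++ reverse bot
go OutsideCode (x : rest) top bot
  | withTag (== "haskell-top") x = go InHaskellTop rest top ("" : bot)
  | withTag (== "haskell") x     = go InHaskell rest top ("" : bot)
  | otherwise                    = go OutsideCode rest top ("" : bot)
go InHaskell (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise      = go InHaskell rest top (x : bot)
go InHaskellTop (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise      = go InHaskellTop rest (x : top) bot

-- Hypothetical helper: a run of n backticks, so no literal fence appears here.
fence :: Int -> String
fence n = replicate n '`'

input :: [String]
input =
  [ "Prose."                  -- line 1
  , fence 3 ++ "haskell"      -- line 2
  , "main = print (ord 'a')"  -- line 3
  , fence 3                   -- line 4
  , fence 3 ++ "haskell-top"  -- line 5
  , "import Data.Char (ord)"  -- line 6
  , fence 3                   -- line 7
  ]

output :: [String]
output = removeComments input
-- output == ["import Data.Char (ord)", "", "", "main = print (ord 'a')", "", "", ""]
```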

And that's it! Gently tap the baking pan against the table and let your code settle. Once it is set, you can compile the preprocessor like so:

```bash
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate Haskell files!

#pagebreak()

= Serving

To test our preprocessor, first, write a literate Haskell file containing your Typst code:

````typst
  = Quicksort in Haskell
  The first thing to know about Haskell's syntax is that parentheses
  are used for grouping, and not for function application.

  ```haskell
  quicksort :: Ord a => [a] -> [a]
  quicksort [] = []
  quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
      lesser = filter (< p) xs
      greater = filter (>= p) xs
  ```

  The parentheses indicate the grouping of operands on the
  right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can compile it with both `ghc` ...

```bash
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

... and `typst`:

```bash
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and rendered beautifully with `typst` simultaneously.

=== Notes

This entire document is just a bit of ceremony around writing preprocessors. The Haskell code in this file can be summarized in this shell script:

```bash
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on typst and jq
# this script does clobber the line numbers, so users beware

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

This document mentions the word "Haskell" 60 times.