write literate haskell programs in typst cdn.oppi.li/typst-unlit.pdf
#set document(title: [Typst-Unlit])
#set par(justify: true)
#show raw.where(lang: "haskell"): set align(center)
#show raw.where(lang: "haskell-top"): set align(center)
#show title: set align(center)
#show <subtitle>: set align(center)

#title()

Write literate Haskell programs in Typst <subtitle>

_#link("https://tangled.org/@oppi.li/typst-unlit")[tangled.org/\@oppi.li/typst-unlit]_ <subtitle>

*Serves: 1 #h(20pt) Prep Time: 10min #h(20pt) Compile Time: 10ms* <subtitle>

A literate program is one where comments are first-class citizens, and code is explicitly demarcated, as opposed to a regular program, where comments are explicitly marked, and code is a first-class entity.

GHC supports literate programming out of the box, by using a preprocessor to extract code from documents. This preprocessor is known as _unlit_ #footnote[https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit]. GHC also supports _custom_ preprocessors, which can be passed in via the `-pgmL` flag. This very document you are reading is one such preprocessor program that allows extracting Haskell from Typst code (although it has been rendered to HTML, PDF or markdown depending on where you are reading it)#footnote[This document needs itself to compile itself! This is why a bootstrap program is included.]!

This recipe not only gives you a fish (the typst-unlit preprocessor), but also teaches you how to fish (write your own preprocessors).

= Ingredients

#table(
  columns: (1fr, 1fr),
  gutter: 3pt,
  stroke: none,
  table.cell(inset: 10pt)[
    To write your own preprocessor:
    - GHC: the Glorious Haskell Compiler
    - Typst: to generate PDFs
    - And that's it! No stacking, shaking or caballing here.
  ],
  table.cell(inset: 10pt)[
    To compile this very document:
    - The bootstrap program
    - GHC: to produce an executable program
    - Typst: to produce a readable PDF
  ],
)

*Pro Tip:* If you're missing any ingredients, your local nixpkgs should stock them!

= Instructions

The idea behind the unlit program is super simple: iterate over the lines in the supplied input file and replace lines that aren't Haskell with an empty line! To detect lines that are Haskell, we look for the #raw("\u{0060}\u{0060}\u{0060}haskell") directive and stop at the end of the code fence. Simple enough! Annoyingly, Haskell requires that imports be declared at the top of the file. This results in literate Haskell programs always starting with a giant block of imports:

#set quote(block: true)
#quote(attribution: [Every literate programmer])[
```
  -- So first we need to get some boilerplate and imports out of the way.
```
]

Oh gee, if only we had a tool to put the important stuff first. Our preprocessor will remedy this wart, with the `haskell-top` directive to move blocks to the top. With that out of the way, let's move on to the program itself!

#pagebreak()

== Step 1: The maincourse

I prefer starting with `main` but you do you.
Any program that is passed to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input literate Haskell source code
- `<outfile>`: the output Haskell source code

Invoke the runes to handle CLI arguments:

```haskell
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my imports _after_ the main function!):

```haskell-top
import System.Environment (getArgs)
import System.Exit (die)
```

Now we move on to defining `process`:

== Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments, and write to the output file; `removeComments`, however, is a pure function:

```haskell
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

== Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those lines that are not Haskell. To do so, we must track some state as we will be jumping in and out of code fences:

```haskell
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher functions. Here is one for the #raw("\u{0060}\u{0060}\u{0060}haskell") pattern:

```haskell
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match #raw("\u{0060}\u{0060}\u{0060}\u{0060}haskell"), and this is intentional.
If your text already contains 3 backticks inside it, you will need 4 backticks in the code fence and so on.

We do the same exercise for `haskell-top`:

```haskell
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```haskell
isCodeEnd = withTag null
```

`removeComments` itself is just a filter that takes a list of lines and removes comments from those lines:

```haskell
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a list of input lines, and two more empty lists that are used to store the lines of code that go at the top (using the `haskell-top` directive), and the ones that go below, using the `haskell` directive:

```haskell
go :: State -> [String] -> [String] -> [String] -> [String]
```

When the input file is empty, we just combine the `top` and `bottom` stacks of lines to form the file:

```haskell
go _ [] top bot = reverse top ++ reverse bot
```

Next, whenever we are `OutsideCode` and the current line contains a directive, we must update the state to enter a code block:

```haskell
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a triple-tick should exit the code block, and any other line encountered in the block is to be included in the final file, but below the imports:

```haskell
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive, lines encountered here go into the `top` stack:

```haskell
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```

And that's it! Gently tap the baking pan against the table and let your code settle. Once it is set, you can compile the preprocessor like so:

```bash
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate Haskell files!

#pagebreak()

= Serving

To test our preprocessor, first, write a literate Haskell file containing your Typst code:

````typst
  = Quicksort in Haskell
  The first thing to know about Haskell's syntax is that parentheses
  are used for grouping, and not for function application.

  ```haskell
  quicksort :: Ord a => [a] -> [a]
  quicksort [] = []
  quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
      lesser = filter (< p) xs
      greater = filter (>= p) xs
  ```

  The parentheses indicate the grouping of operands on the
  right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can compile it with both `ghc` ...

```bash
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/  :? for help
[1 of 2] Compiling Main             ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

... and `typst`:

```bash
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and rendered beautifully with `typst` simultaneously.
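
One detail worth spelling out: the matchers from Step 3 only fire when the backticks start at column 0 and the tag matches exactly, which is why the indented quicksort snippet above is not pulled into this document's own program. The sketch below restates `withTag`, `isHaskell` and `isCodeEnd` so it runs on its own; the `fence` helper and this `main` are made up for illustration, and the block is fenced as `hs` rather than `haskell` so that typst-unlit skips it.

```hs
module Main where

-- Restated from Step 3 so this sketch compiles standalone.
withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

isHaskell, isCodeEnd :: String -> Bool
isHaskell = withTag (== "haskell")
isCodeEnd = withTag null

-- Hypothetical helper: n backticks, built at runtime so that no literal
-- fence appears inside this block.
fence :: Int -> String
fence n = replicate n '`'

main :: IO ()
main = mapM_ report
  [ fence 3 ++ "haskell"          -- isHaskell: True
  , fence 4 ++ "haskell"          -- isHaskell: True (longer fences match too)
  , "  " ++ fence 3 ++ "haskell"  -- isHaskell: False (indented fence)
  , fence 3 ++ "haskell "         -- isHaskell: False (trailing space in the tag)
  , fence 2 ++ "haskell"          -- isHaskell: False (only two backticks)
  , fence 3                       -- isCodeEnd: True (bare closing fence)
  ]
  where
    -- Show each candidate line next to (isHaskell, isCodeEnd).
    report l = putStrLn (show l ++ " -> " ++ show (isHaskell l, isCodeEnd l))
```

The trailing-space case is the one most likely to bite in practice: the tag is compared with `==`, so stray whitespace after `haskell` leaves the block unextracted.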

=== Notes

This entire document is just a bit of ceremony around writing preprocessors; the Haskell code in this file can be summarized in this shell script:

```bash
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on typst and jq
# this script does clobber the line numbers, so users beware

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

This document mentions the word "Haskell" 60 times.
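
The shell script's warning about line numbers is easier to appreciate next to what the Haskell version does: it emits a blank line for every line it drops, so the output has as many lines as the input. Here is a small sketch of that behaviour, restating the Step 3 definitions so it runs on its own. The `sample` input, the `fence` helper and this `main` are invented for illustration, and the block is fenced as `hs` so that typst-unlit does not extract it.

```hs
module Main where

-- Restated from Step 3 so this sketch compiles standalone.
data State = OutsideCode | InHaskell | InHaskellTop

withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []

go :: State -> [String] -> [String] -> [String] -> [String]
go _ [] top bot = reverse top ++ reverse bot
go OutsideCode (x : rest) top bot
  | withTag (== "haskell-top") x = go InHaskellTop rest top ("" : bot)
  | withTag (== "haskell") x     = go InHaskell rest top ("" : bot)
  | otherwise                    = go OutsideCode rest top ("" : bot)
go InHaskell (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise      = go InHaskell rest top (x : bot)
go InHaskellTop (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise      = go InHaskellTop rest (x : top) bot

-- Hypothetical helper: three backticks, built at runtime so that no
-- literal fence appears inside this block.
fence :: String
fence = replicate 3 '`'

-- Hypothetical eight-line input document.
sample :: [String]
sample =
  [ "= Demo"
  , fence ++ "haskell"
  , "main = print (sort [3, 1, 2])"
  , fence
  , "Imports can come later:"
  , fence ++ "haskell-top"
  , "import Data.List (sort)"
  , fence
  ]

main :: IO ()
main = do
  let out = removeComments sample
  -- Print every output line with its number; dropped lines show up as "".
  mapM_ (\(n, l) -> putStrLn (show n ++ ": " ++ show l)) (zip [1 :: Int ..] out)
  -- The padding keeps the line count unchanged.
  print (length out == length sample)
```

For this input the hoisted import lands on line 1 and `main` moves from line 3 to line 4. In general, code that sits before a `haskell-top` block shifts down by the number of hoisted lines, while code after the last `haskell-top` block keeps its original line number.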