---
title: Article Overview
sidebar_label: Overview
description: How the article parser saves content for offline reading.
sidebar_position: 1
---

# Article Overview

The `noteleaf article` command turns any supported URL into two files on disk:

- A clean Markdown document (great for terminal reading).
- A rendered HTML copy (handy for rich export or sharing).

Both files live inside your configured `articles_dir` (defaults to `<data_dir>/articles`). The SQLite database stores the metadata and file paths so you can query, list, and delete articles without worrying about directories.

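The default described above can be pictured with a small Go sketch. The function name `resolveArticlesDir` and its parameters are hypothetical, chosen for illustration; this is not noteleaf's actual configuration code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveArticlesDir returns the configured articles_dir, or
// <data_dir>/articles when none is set.
// Hypothetical helper: the name and signature are assumptions.
func resolveArticlesDir(articlesDir, dataDir string) string {
	if articlesDir != "" {
		return articlesDir
	}
	return filepath.Join(dataDir, "articles")
}

func main() {
	// No articles_dir configured: fall back to the data directory.
	fmt.Println(resolveArticlesDir("", "data"))
	// An explicit articles_dir wins over the default.
	fmt.Println(resolveArticlesDir("/srv/reading", "data"))
}
```
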
## How Parsing Works

1. **Domain rules first**: Each supported site has a small XPath rule file (`internal/articles/rules/*.txt`), and the parser checks for a matching rule before anything else.
2. **Heuristic fallback**: When no rule exists, the parser falls back to a readability-style heuristic extractor that scores DOM nodes, removes nav bars, and preserves headings and links.
3. **Metadata extraction**: The parser also looks for OpenGraph/JSON-LD tags to capture author names and publish dates.

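The rule-then-fallback order in the first two steps can be sketched in Go as follows. The `rules` map, its single entry, and the `chooseStrategy` function are hypothetical stand-ins; the sketch only shows the decision, not the extraction itself, and stripping a leading `www.` is an assumption about how domains might be matched.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// rules maps a domain to a content XPath. The entry is a made-up example;
// the real rules live in internal/articles/rules/*.txt.
var rules = map[string]string{
	"example.com": "//article",
}

// chooseStrategy reports whether a URL would be handled by a domain rule
// or by the heuristic fallback. Hypothetical helper for illustration.
func chooseStrategy(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// Assumption: hosts are compared case-insensitively without "www.".
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	if host == "" {
		return "", fmt.Errorf("no host in %q", rawURL)
	}
	if _, ok := rules[host]; ok {
		return "rule", nil
	}
	return "heuristic", nil
}

func main() {
	for _, u := range []string{"https://www.example.com/post", "https://unknown.test/a"} {
		strategy, err := chooseStrategy(u)
		fmt.Println(u, strategy, err)
	}
}
```
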
You can see the currently loaded rule set by running:

```sh
noteleaf article --help
```

The help output prints the supported domains and the storage directory that is currently in use.

## Saved Metadata

Every article record contains:

- URL and canonical title
- Author (if present in metadata)
- Publication date (stored as plain text, e.g., `2024-01-02`)
- Markdown file path
- HTML file path
- Created/modified timestamps

These fields make it easy to build reading logs, cite sources in notes, or reference articles from tasks.

## Commands at a Glance

| Command | Purpose |
|----------------------------------|---------|
| `noteleaf article add <url>` | Parse, save, and index a URL |
| `noteleaf article list [query]` | Show saved items; narrow with `--author` or `--limit` |
| `noteleaf article view <id>` | Inspect metadata and a short preview |
| `noteleaf article read <id>` | Render the Markdown nicely in your terminal |
| `noteleaf article remove <id>` | Delete the DB entry and the files |

The CLI automatically prevents duplicate imports by checking the URL before parsing.
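
The check-before-parse order can be sketched in Go with an in-memory set standing in for the SQLite lookup. The `library` type, `errDuplicate`, and the exact-string URL comparison are all assumptions made for illustration; the real CLI may normalize URLs or report duplicates differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicate is a hypothetical error for an already-saved URL.
var errDuplicate = errors.New("article already saved")

// library is an in-memory stand-in for the SQLite article table.
type library struct {
	seen map[string]bool
}

// add rejects a known URL before any parsing work is done,
// and records the URL only after parsing succeeds.
func (l *library) add(rawURL string, parse func(string) error) error {
	if l.seen[rawURL] {
		return errDuplicate
	}
	if err := parse(rawURL); err != nil {
		return err
	}
	l.seen[rawURL] = true
	return nil
}

func main() {
	lib := &library{seen: map[string]bool{}}
	parse := func(u string) error {
		fmt.Println("parsing", u)
		return nil
	}
	fmt.Println(lib.add("https://example.com/a", parse))
	fmt.Println(lib.add("https://example.com/a", parse))
}
```

Checking first means a repeated `add` never fetches or parses the page a second time.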