---
title: Article Management
sidebar_label: Management
description: Save URLs, inspect metadata, and read articles without leaving the CLI.
sidebar_position: 2
---

# Article Management

## Save Articles from URLs

```sh
noteleaf article add https://example.com/long-form-piece
```

What happens:

1. The CLI checks the database to ensure the URL was not imported already.
2. It fetches the page with a reader-friendly User-Agent (`curl/8.4.0`) and English `Accept-Language` headers to avoid blocked responses.
3. Parsed content is written to Markdown and HTML files under `articles_dir`.
4. A database record is inserted with all metadata and file paths.

If parsing fails (unsupported domain, network issue, etc.) nothing is written to disk, so partial entries never appear in your archive.

## Parsing and Extraction

The parser uses a two-layer strategy:

1. **Domain-specific rules** check the XPath selectors defined in `internal/articles/rules`. These rules strip unwanted elements (cookie banners, nav bars), capture the main body, and record author/date fields accurately.
2. **Heuristic fallback** scores every DOM node, penalizes high link-density sections, and picks the most “article-like” block. It also pulls metadata from JSON-LD `Article` objects when available.

During saving, the Markdown file gets a generated header:

```markdown
# Article Title

**Author:** Jane Smith

**Date:** 2024-01-02

**Source:** https://example.com/long-form-piece

**Saved:** 2024-02-05 10:45:12
```

Everything after the `---` separator is the cleaned article content.

## Reading in the Terminal

There are two ways to inspect what you saved:

- `noteleaf article view <id>` shows metadata, verifies whether the files still exist, and prints the first ~20 lines as a preview.
- `noteleaf article read <id>` renders the full Markdown using [Charm’s Glamour](https://github.com/charmbracelet/glamour), giving you syntax highlighting, proper headings, and wrapped paragraphs directly in the terminal.

If you prefer your editor, open the Markdown path printed by `view`. Both Markdown and HTML copies belong to you, so feel free to annotate or reformat them.

## Article Metadata Reference

Use `noteleaf article list` to see titles and authors:

```sh
noteleaf article list               # newest first
noteleaf article list "sqlite"      # full-text filter on titles
noteleaf article list --author "Kim" # author filter
noteleaf article list -l 5          # top 5 results
```

Each entry includes the created timestamp. The `view` command provides the raw paths so you can script around them, for example:

```sh
md=$(noteleaf article view 12 | rg 'Markdown:' | awk '{print $3}')
$EDITOR "$md"
```

All metadata lives in the SQLite `articles` table, making it easy to run your own reports with `sqlite3` if needed.
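
The `view`-based scripting snippet above relies on `rg` and on the path being a fixed whitespace-separated field. If you would rather not depend on ripgrep, or your `articles_dir` contains spaces, a small wrapper can do the same job with `awk` alone. This is a minimal sketch, not something noteleaf ships: `article_md_path` is a hypothetical helper name, and it assumes only what the snippet above already assumes, namely that `view` prints a line containing `Markdown:` followed by the file path.

```sh
# Hypothetical helper (not part of noteleaf): read `noteleaf article view`
# output on stdin and print whatever follows "Markdown:" on the first
# matching line. Assumes the path is the remainder of that line.
article_md_path() {
  awk '/Markdown:/ { sub(/^.*Markdown:[ \t]*/, ""); print; exit }'
}

# Usage (requires noteleaf and a saved article with id 12):
#   md=$(noteleaf article view 12 | article_md_path)
#   "$EDITOR" "$md"
```

Because the helper strips the label instead of picking a numbered field, it keeps working if the label is indented or prefixed, and it prints nothing when no `Markdown:` line is present, so you can test for an empty result before launching the editor.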