---
title: Article Overview
sidebar_label: Overview
description: How the article parser saves content for offline reading.
sidebar_position: 1
---

# Article Overview

The `noteleaf article` command turns any supported URL into two files on disk:

- A clean Markdown document (great for terminal reading).
- A rendered HTML copy (handy for rich export or sharing).

Both files live inside your configured `articles_dir` (defaults to `<data_dir>/articles`). The SQLite database stores the metadata and file paths so you can query, list, and delete articles without worrying about directories.

## How Parsing Works

1. **Domain rules first**: Each supported site has a small XPath rule file (`internal/articles/rules/*.txt`).
2. **Heuristic fallback**: When no rule exists, the parser falls back to the readability-style heuristic extractor that scores DOM nodes, removes nav bars, and preserves headings/links.
3. **Metadata extraction**: The parser also looks for OpenGraph/JSON-LD tags to capture author names and publish dates.

You can see the currently loaded rule set by running:

```sh
noteleaf article --help
```

The help output prints the supported domains and the storage directory that is currently in use.

## Saved Metadata

Every article record contains:

- URL and canonical title
- Author (if present in metadata)
- Publication date (stored as plain text, e.g., `2024-01-02`)
- Markdown file path
- HTML file path
- Created/modified timestamps

These fields make it easy to build reading logs, cite sources in notes, or reference articles from tasks.
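
To make the saved metadata concrete, here is a minimal Go sketch of what such a record could look like and how it might be turned into a citation line for a note. The `Article` type, its field names, and the `Citation` helper are all hypothetical illustrations, not noteleaf's actual model.

```go
package main

import (
	"fmt"
	"time"
)

// Article mirrors the metadata fields listed above.
// The type and field names are assumptions made for this sketch;
// noteleaf's real model may be shaped differently.
type Article struct {
	ID           int64
	URL          string
	Title        string
	Author       string // empty when the page exposes no author metadata
	Date         string // publication date kept as plain text, e.g. "2024-01-02"
	MarkdownPath string
	HTMLPath     string
	Created      time.Time
	Modified     time.Time
}

// Citation formats a record as a one-line reference for use in a note.
// Missing author or date values are replaced with neutral placeholders.
func (a Article) Citation() string {
	author := a.Author
	if author == "" {
		author = "Unknown author"
	}
	date := a.Date
	if date == "" {
		date = "n.d."
	}
	return fmt.Sprintf("%s (%s). %s. %s", author, date, a.Title, a.URL)
}

func main() {
	now := time.Now()
	a := Article{
		ID:           1,
		URL:          "https://example.com/post",
		Title:        "An Example Post",
		Date:         "2024-01-02",
		MarkdownPath: "articles/an-example-post.md",
		HTMLPath:     "articles/an-example-post.html",
		Created:      now,
		Modified:     now,
	}
	fmt.Println(a.Citation())
}
```

Keeping the publication date as a string, as the list above describes, avoids guessing at a time zone or precision the source page never stated.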

## Commands at a Glance

| Command                          | Purpose |
|----------------------------------|---------|
| `noteleaf article add <url>`     | Parse, save, and index a URL |
| `noteleaf article list [query]`  | Show saved items; filter with `--author` or `--limit` |
| `noteleaf article view <id>`     | Inspect metadata + a short preview |
| `noteleaf article read <id>`     | Render the Markdown nicely in your terminal |
| `noteleaf article remove <id>`   | Delete the DB entry and the files |

The CLI automatically prevents duplicate imports by checking the URL before parsing.