# localcode

Fully offline AI coding environment for macOS Apple Silicon. Uses **Ollama** to serve local models with **10 terminal coding agents** — no API keys, no cloud, no costs.

## Quick Start

```bash
npm run build
npm run dev -- setup # installs Ollama, pulls models, sets up configs
localcode # launch with defaults
```

## Usage

```bash
localcode # launch with last-used TUI + model
localcode goose # launch with Goose
localcode claude # launch with Claude Code
localcode gpt-oss # launch with GPT-OSS model
localcode goose gpt-oss # launch with both overrides

localcode models # list available models
localcode tuis # list available TUIs
localcode status # show config + Ollama health

localcode start # start Ollama + pull models
localcode stop # stop Ollama
localcode bench # benchmark the active chat model
localcode pipe "add type hints" # pipe stdin through the model
```

TUI and model names are auto-detected — just type the id directly. Last-used choices are saved for next time.
33
## TUIs

TUIs with `ollama launch` support are installed and configured automatically by Ollama.

| TUI | Launch | Method |
|-----|--------|--------|
| Claude Code | `localcode claude` | ollama launch |
| Codex CLI | `localcode codex` | ollama launch |
| OpenCode | `localcode opencode` | ollama launch |
| Pi | `localcode pi` | ollama launch |
| Cline | `localcode cline` | ollama launch |
| Droid | `localcode droid` | ollama launch |
| OpenClaw | `localcode openclaw` | ollama launch |
| Aider | `localcode aider` | direct (env vars) |
| Goose | `localcode goose` | direct (env vars) |
| gptme | `localcode gptme` | direct (env vars) |

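As a minimal sketch of what a "direct (env vars)" launch likely amounts to — assuming the launcher uses Ollama's OpenAI-compatible `/v1` endpoint (the exact variable names can differ per TUI):

```shell
# Sketch (assumption): a "direct (env vars)" launch points an
# OpenAI-compatible TUI at Ollama's built-in /v1 endpoint.
export OPENAI_API_BASE="http://localhost:11434/v1"  # Ollama's OpenAI-compatible API
export OPENAI_API_KEY="ollama"                      # placeholder; Ollama ignores the key
# The TUI is then launched directly, e.g.:
#   aider --model openai/qwen3-coder
```

The `ollama launch` TUIs skip this step because Ollama writes their configuration for them.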
## Models

| Model | ID | Size | Tool Calling | Notes |
|-------|----|------|--------------|-------|
| Qwen3 Coder 30B-A3B | `qwen3-coder` | 19 GB | Yes | Best coding benchmarks (SWE-bench 69.6) |
| GLM-4.7 Flash 30B-A3B | `glm-flash` | 19 GB | Yes | Strong coding, 198K context |
| GPT-OSS 20B | `gpt-oss` | 14 GB | Yes | Lightest with tool support, good for 32 GB machines |
| Qwen 2.5 Coder 32B | `qwen-32b-chat` | 20 GB | No | Dense model, no structured tool calls |
| Qwen 2.5 Coder 14B | `qwen-14b-chat` | 9 GB | No | Mid-size chat model |
| Qwen 2.5 Coder 7B | `qwen-7b-chat` | 5 GB | No | Smallest chat model |
| Qwen 2.5 Coder 1.5B | `qwen-1.5b-autocomplete` | 1 GB | No | Autocomplete only |

Models without tool calling will emit raw tool-call text in agents that rely on structured tool use (Goose, Pi, etc.). Use `gpt-oss`, `qwen3-coder`, or `glm-flash` with those agents.
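To check what a pulled model advertises, recent Ollama builds list a "Capabilities" section (including "tools" for tool-capable models) in `ollama show` — a sketch, since the exact output format varies by Ollama version:

```shell
# Sketch (assumption): `ollama show` prints a Capabilities section
# listing "tools" for models that support structured tool calling.
model="gpt-oss"  # any model id from the table above
if command -v ollama >/dev/null 2>&1; then
  caps=$(ollama show "$model" 2>/dev/null | grep -iA3 'capabilities' || true)
else
  caps="ollama not installed"
fi
echo "${caps:-no capabilities listed}"
```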
64
## Hardware Requirements

- **Mac with Apple Silicon** (M1/M2/M3/M4)
- **32 GB RAM** recommended
- Disk space depends on the models pulled

## Configuration

Everything is stored in `~/.config/localcode/config.json` — just the active model, autocomplete model, and TUI ids. Ollama manages model storage itself.
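A sketch of what `config.json` might contain — the field names here are illustrative assumptions, not verified against the source:

```json
{
  "model": "qwen3-coder",
  "autocompleteModel": "qwen-1.5b-autocomplete",
  "tui": "goose"
}
```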
74
## Uninstall

```bash
brew uninstall ollama # remove Ollama
rm -rf ~/.ollama # remove all pulled models
rm ~/.local/bin/localcode # remove CLI wrapper
rm -rf ~/.config/localcode # remove config
```
81```