# localcode
Fully offline AI coding environment for macOS Apple Silicon. Uses Ollama to serve local models with 10 terminal coding agents — no API keys, no cloud, no costs.
## Quick Start

```shell
npm run build
npm run dev -- setup   # installs Ollama, pulls models, sets up configs
localcode              # launch with defaults
```
## Usage

```shell
localcode                        # launch with last-used TUI + model
localcode goose                  # launch with Goose
localcode claude                 # launch with Claude Code
localcode gpt-oss                # launch with GPT-OSS model
localcode goose gpt-oss          # launch with both overrides
localcode models                 # list available models
localcode tuis                   # list available TUIs
localcode status                 # show config + Ollama health
localcode start                  # start Ollama + pull models
localcode stop                   # stop Ollama
localcode bench                  # benchmark the active chat model
localcode pipe "add type hints"  # pipe stdin through the model
```
TUI and model names are auto-detected — just type the id directly. Last-used choices are saved for next time.
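Since `localcode pipe` reads stdin and writes the model's response to stdout, it composes with ordinary shell pipelines. A sketch of non-interactive use (the file names are illustrative, not part of localcode):

```shell
# Rewrite a source file through the active chat model.
# utils.py / utils.typed.py are example paths.
cat utils.py | localcode pipe "add type hints" > utils.typed.py
```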
## TUIs

TUIs with `ollama launch` support are installed and configured automatically by Ollama.

| TUI | Launch | Method |
|---|---|---|
| Claude Code | `localcode claude` | `ollama launch` |
| Codex CLI | `localcode codex` | `ollama launch` |
| OpenCode | `localcode opencode` | `ollama launch` |
| Pi | `localcode pi` | `ollama launch` |
| Cline | `localcode cline` | `ollama launch` |
| Droid | `localcode droid` | `ollama launch` |
| OpenClaw | `localcode openclaw` | `ollama launch` |
| Aider | `localcode aider` | direct (env vars) |
| Goose | `localcode goose` | direct (env vars) |
| gptme | `localcode gptme` | direct (env vars) |
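For the "direct (env vars)" TUIs, launching amounts to pointing the agent's OpenAI-compatible client at the local Ollama server, which serves an OpenAI-compatible API at `http://localhost:11434/v1`. A hand-rolled equivalent might look like the following — the exact variable names differ per TUI, so treat these as illustrative:

```shell
# Point an OpenAI-compatible client at the local Ollama server.
# The API key is a dummy value; Ollama does not check it.
export OPENAI_API_BASE="http://localhost:11434/v1"
export OPENAI_API_KEY="ollama"
```

Each TUI documents its own variable names; consult its docs before wiring this up manually.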
## Models

| Model | ID | Size | Tool Calling | Notes |
|---|---|---|---|---|
| Qwen3 Coder 30B-A3B | `qwen3-coder` | 19 GB | Yes | Best coding benchmarks (SWE-bench 69.6) |
| GLM-4.7 Flash 30B-A3B | `glm-flash` | 19 GB | Yes | Strong coding, 198K context |
| GPT-OSS 20B | `gpt-oss` | 14 GB | Yes | Lightest with tool support, good for 32 GB machines |
| Qwen 2.5 Coder 32B | `qwen-32b-chat` | 20 GB | No | Dense model, no structured tool calls |
| Qwen 2.5 Coder 14B | `qwen-14b-chat` | 9 GB | No | Smaller chat model |
| Qwen 2.5 Coder 7B | `qwen-7b-chat` | 5 GB | No | Smallest chat model |
| Qwen 2.5 Coder 1.5B | `qwen-1.5b-autocomplete` | 1 GB | No | Autocomplete only |
Models without tool calling will output raw tool-call text in agents that rely on structured tool use (Goose, Pi, etc.). Use `gpt-oss`, `qwen3-coder`, or `glm-flash` for those agents.
## Hardware Requirements
- Mac with Apple Silicon (M1/M2/M3/M4)
- 32 GB RAM recommended
- Disk space depends on models pulled
## Configuration

All configuration lives in `~/.config/localcode/config.json` — just the active chat model, autocomplete model, and TUI ids. Ollama manages model storage.
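The file is small enough to edit by hand. A representative shape — the key names here are hypothetical, since only the stored values (chat model, autocomplete model, TUI) are specified, not the exact schema:

```json
{
  "model": "qwen3-coder",
  "autocompleteModel": "qwen-1.5b-autocomplete",
  "tui": "goose"
}
```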
## Uninstall

```shell
brew uninstall ollama        # remove Ollama + all pulled models
rm ~/.local/bin/localcode    # remove CLI wrapper
rm -rf ~/.config/localcode   # remove config
```