# localcode

Fully offline AI coding environment for macOS Apple Silicon. Uses **Ollama** to serve local models with **10 terminal coding agents** — no API keys, no cloud, no costs.

## Quick Start

```bash
npm run build
npm run dev -- setup   # installs Ollama, pulls models, sets up configs
localcode              # launch with defaults
```

## Usage

```bash
localcode                        # launch with last-used TUI + model
localcode goose                  # launch with Goose
localcode claude                 # launch with Claude Code
localcode gpt-oss                # launch with GPT-OSS model
localcode goose gpt-oss          # launch with both overrides

localcode models                 # list available models
localcode tuis                   # list available TUIs
localcode status                 # show config + Ollama health

localcode start                  # start Ollama + pull models
localcode stop                   # stop Ollama
localcode bench                  # benchmark the active chat model
localcode pipe "add type hints"  # pipe stdin through the model
```

TUI and model names are auto-detected — just type the id directly. Last-used choices are saved for next time.

## TUIs

TUIs with `ollama launch` support are installed and configured automatically by Ollama.
| TUI | Launch | Method |
|-----|--------|--------|
| Claude Code | `localcode claude` | ollama launch |
| Codex CLI | `localcode codex` | ollama launch |
| OpenCode | `localcode opencode` | ollama launch |
| Pi | `localcode pi` | ollama launch |
| Cline | `localcode cline` | ollama launch |
| Droid | `localcode droid` | ollama launch |
| OpenClaw | `localcode openclaw` | ollama launch |
| Aider | `localcode aider` | direct (env vars) |
| Goose | `localcode goose` | direct (env vars) |
| gptme | `localcode gptme` | direct (env vars) |

## Models

| Model | ID | Size | Tool Calling | Notes |
|-------|----|------|--------------|-------|
| Qwen3 Coder 30B-A3B | `qwen3-coder` | 19 GB | Yes | Best coding benchmarks (SWE-bench 69.6) |
| GLM-4.7 Flash 30B-A3B | `glm-flash` | 19 GB | Yes | Strong coding, 198K context |
| GPT-OSS 20B | `gpt-oss` | 14 GB | Yes | Lightest with tool support, good for 32 GB machines |
| Qwen 2.5 Coder 32B | `qwen-32b-chat` | 20 GB | No | Dense model, no structured tool calls |
| Qwen 2.5 Coder 14B | `qwen-14b-chat` | 9 GB | No | Mid-size chat model |
| Qwen 2.5 Coder 7B | `qwen-7b-chat` | 5 GB | No | Lightest chat model |
| Qwen 2.5 Coder 1.5B | `qwen-1.5b-autocomplete` | 1 GB | No | Autocomplete only |

Models without tool calling will output raw tool-call text in agents that rely on structured tool use (Goose, Pi, etc.). Use `gpt-oss`, `qwen3-coder`, or `glm-flash` with those agents.

## Hardware Requirements

- **Mac with Apple Silicon** (M1/M2/M3/M4)
- **32 GB RAM** recommended
- Disk space depends on the models pulled

## Configuration

All configuration lives in `~/.config/localcode/config.json` — just the active model, autocomplete model, and TUI ids. Ollama manages model storage.

## Uninstall

```bash
brew uninstall ollama        # remove Ollama
rm -rf ~/.ollama             # remove all pulled models
rm ~/.local/bin/localcode    # remove CLI wrapper
rm -rf ~/.config/localcode   # remove config
```
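
As an illustration of how small the config file is, a sketch of what `~/.config/localcode/config.json` might hold — the field names here are assumptions based on the description above, not the actual schema:

```json
{
  "tui": "goose",
  "model": "qwen3-coder",
  "autocompleteModel": "qwen-1.5b-autocomplete"
}
```

Since only ids are stored, deleting the file simply resets localcode to its defaults on the next launch.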