A script for configuring, using, switching, and comparing local offline coding models

Replace llama.cpp backend with Ollama

A single Ollama server on port 11434 replaces the two llama-server
processes, the tool-call rewriting proxy, manual GGUF downloads, and the
bash server launcher scripts. Models are managed via `ollama pull` with
tags like `qwen2.5-coder:32b`. This also eliminates the
grammar-constrained decoding crashes.
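The client side of this change reduces to hitting Ollama's OpenAI-compatible endpoint. A minimal sketch (the tag and prompt below are illustrative, not part of this diff):

```typescript
// Sketch: building a request for Ollama's OpenAI-compatible API.
// Any pulled tag works; "qwen2.5-coder:32b" is just an example.
const OLLAMA_URL = "http://127.0.0.1:11434";

function chatRequest(tag: string, user: string): { url: string; body: string } {
  return {
    url: `${OLLAMA_URL}/v1/chat/completions`,
    body: JSON.stringify({
      model: tag, // an Ollama tag, e.g. "qwen2.5-coder:32b"
      messages: [{ role: "user", content: user }],
      stream: false,
    }),
  };
}

// To send it: await fetch(req.url, { method: "POST",
//   headers: { "Content-Type": "application/json" }, body: req.body });
const req = chatRequest("qwen2.5-coder:32b", "Write hello world in C");
```

Because the API is OpenAI-shaped, the same request works for any TUI that speaks `OPENAI_API_BASE`.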

Total: +281 -413
CLAUDE.md (+33 -28)

````diff
 
 ## Project Overview
 
-`localcode` — a single CLI for managing a fully offline local AI coding environment on macOS Apple Silicon. Uses llama.cpp to serve Qwen 2.5 Coder models via OpenAI-compatible APIs, with switchable terminal coding agents (Aider, OpenCode, Pi).
+`localcode` — a single CLI for managing a fully offline local AI coding environment on macOS Apple Silicon. Uses Ollama to serve Qwen 2.5 Coder models via OpenAI-compatible APIs, with switchable terminal coding agents (Aider, OpenCode, Pi).
 
 ## Commands
 
···
 ```
 localcode                        Launch active TUI in current directory
 localcode status                 Show current config + server health
-localcode start                  Start chat + autocomplete servers
-localcode stop                   Stop all servers
-localcode models                 List available models
-localcode models set-chat <id>   Switch chat model
-localcode models set-auto <id>   Switch autocomplete model
-localcode tuis                   List available TUIs
-localcode tuis set <id>          Switch active TUI
+localcode start                  Start Ollama + pull models
+localcode stop                   Stop Ollama
+localcode model                  List available models
+localcode set model <id>         Switch the chat model
+localcode set autocomplete <id>  Switch the autocomplete model
+localcode tui                    List available TUIs
+localcode set tui <id>           Switch the active TUI
 localcode bench                  Benchmark running chat model
 localcode bench history          Show past benchmark results
 localcode pipe "prompt"          Pipe stdin through the model
···
 ```
 src/
   main.ts — CLI dispatcher (switch on process.argv[2])
-  config.ts — Path/port constants
+  config.ts — Ollama URL/port constants, TUI config paths
   log.ts — log/warn/err with ANSI colors
   util.ts — Shell exec helpers, file writers
   runtime-config.ts — Read/write ~/.config/localcode/config.json
   registry/
-    models.ts — ModelDef interface + MODELS array
+    models.ts — ModelDef interface + MODELS array (Ollama tags)
     tuis.ts — TuiDef interface + TUIS array
   commands/
-    run.ts — Default action: ensure server, init git, exec TUI
-    status.ts — Show config + server health
-    server.ts — Start/stop llama.cpp servers
+    run.ts — Default action: ensure Ollama, init git, exec TUI
+    status.ts — Show config + Ollama health
+    server.ts — Start/stop Ollama, pull models
     setup.ts — Full install pipeline
-    models.ts — List/switch models, auto-download + regen scripts
-    tuis.ts — List/switch TUIs, auto-install + regen scripts
-    bench.ts — Benchmark against running llama.cpp server
+    models.ts — List/switch models, auto-pull + regen configs
+    tuis.ts — List/switch TUIs, auto-install + regen configs
+    bench.ts — Benchmark against running Ollama
     pipe.ts — Pipe stdin through the model
-    steps/ — Individual setup phases (preflight, homebrew, llama, etc.)
+    steps/ — Individual setup phases (preflight, homebrew, ollama, etc.)
   templates/
-    scripts.ts — Bash server launcher templates (parameterized by ModelDef)
+    scripts.ts — localcode wrapper script
     aider.ts — Aider config template
     opencode.ts — OpenCode config template
-    pi.ts — Pi models.json template
+    pi.ts — Pi models.json + settings.json templates
 ```
 
 ### Key patterns
 
+**Ollama backend**: Single Ollama server on port 11434 serves all models. Models identified by Ollama tags (e.g., `qwen2.5-coder:32b`). No separate chat/autocomplete server processes — Ollama loads/unloads models on demand.
+
 **Runtime config** (`~/.config/localcode/config.json`): Stores active chatModel, autocompleteModel, and tui IDs. Read by `runtime-config.ts` with defaults fallback.
 
-**Registries**: `registry/models.ts` and `registry/tuis.ts` define available options as typed arrays. Add new models/TUIs by appending to these arrays.
+**Registries**: `registry/models.ts` and `registry/tuis.ts` define available options as typed arrays. Add new models/TUIs by appending to these arrays. Models have `ollamaTag` field for the Ollama model identifier.
 
-**Script regeneration**: When models or TUI are switched, launcher scripts in `~/.local/bin/` and all TUI configs are automatically regenerated.
-
-**Template escaping**: Bash templates in `src/templates/scripts.ts` use `const D = "$"` to emit literal `$` without triggering TS interpolation.
+**Config regeneration**: When models or TUI are switched, TUI configs are automatically regenerated.
 
-**Generated scripts**: Only 3 bash scripts are generated in `~/.local/bin/`: `localcode` (thin wrapper calling `node dist/main.js`), `llama-chat-server`, `llama-complete-server`. All other functionality lives in TypeScript commands.
+**Generated scripts**: Only 1 bash script is generated in `~/.local/bin/`: `localcode` (thin wrapper calling `node dist/main.js`). All other functionality lives in TypeScript commands.
 
-**Benchmark**: Hits `/v1/chat/completions` with 3 hardcoded prompts, measures wall-clock time + token counts. Results saved to `~/.config/localcode/benchmarks.json`.
+**Benchmark**: Hits Ollama's `/v1/chat/completions` with 3 hardcoded prompts, measures wall-clock time + token counts. Results saved to `~/.config/localcode/benchmarks.json`.
 
 ## Key paths on the user's system
 
-- `~/.local/bin/` — `localcode` wrapper + server launcher scripts
-- `~/.local/share/llama-models/` — Downloaded GGUF model files
+- `~/.local/bin/localcode` — CLI wrapper script
 - `~/.config/localcode/config.json` — Active model/TUI selection
 - `~/.config/localcode/benchmarks.json` — Benchmark history
 - `~/.aider/` — Aider config
 - `~/.config/opencode/opencode.json` — OpenCode config
 - `~/.pi/agent/models.json` — Pi config
-- Chat server port **8080**, autocomplete port **8081**
+- `~/.pi/agent/settings.json` — Pi settings (packages)
+- Ollama port **11434**
 
 ## Important: after changing TypeScript
 
 The `localcode` wrapper in `~/.local/bin/` calls `node dist/main.js`. After modifying TypeScript source, run `npm run build` to recompile, or the wrapper will run stale code.
+
+## Dead files to clean up
+
+- `src/commands/proxy.ts` — Was the llama.cpp tool-call rewriting proxy, now unused (Ollama handles tool calling natively)
+- `templates/qwen-tool-call.jinja` — Was the Qwen tool-use Jinja template for llama.cpp, now unused
````
src/commands/bench.ts (+14 -16)

```diff
 import { readFileSync, writeFileSync, mkdirSync } from "node:fs";
-import { join, dirname } from "node:path";
+import { dirname, join } from "node:path";
 import { homedir } from "node:os";
 import { performance } from "node:perf_hooks";
-import { CHAT_PORT } from "../config.js";
+import { OLLAMA_URL, OLLAMA_PORT } from "../config.js";
 import { getActiveChatModel } from "../runtime-config.js";
-import { log, warn, err } from "../log.js";
+import { log, err } from "../log.js";
 import type { ModelDef } from "../registry/models.js";
 
 const BOLD = "\x1b[1m";
···
   avgTokPerSec: number;
 }
 
-async function checkHealth(port: number): Promise<boolean> {
+async function checkHealth(): Promise<boolean> {
   try {
-    const res = await fetch(`http://127.0.0.1:${port}/health`);
+    const res = await fetch(`${OLLAMA_URL}/api/tags`);
     return res.ok;
   } catch {
     return false;
···
 }
 
 async function runPrompt(
-  port: number,
+  model: ModelDef,
   prompt: BenchPrompt,
 ): Promise<PromptResult> {
   const body = JSON.stringify({
-    model: "qwen",
+    model: model.ollamaTag,
     messages: [
       { role: "system", content: prompt.system },
       { role: "user", content: prompt.user },
···
 
   const start = performance.now();
   const res = await fetch(
-    `http://127.0.0.1:${port}/v1/chat/completions`,
+    `${OLLAMA_URL}/v1/chat/completions`,
     {
       method: "POST",
       headers: { "Content-Type": "application/json" },
···
 
 function printResults(model: ModelDef, results: PromptResult[]): void {
   console.log("");
-  console.log(`${BOLD}Model:${RESET} ${model.name} (${model.file})`);
-  console.log(`${BOLD}Port:${RESET} ${CHAT_PORT}`);
+  console.log(`${BOLD}Model:${RESET} ${model.name} (${model.ollamaTag})`);
+  console.log(`${BOLD}Port:${RESET} ${OLLAMA_PORT}`);
   console.log("");
 
   // Table header
···
     return;
   }
 
-  const healthy = await checkHealth(CHAT_PORT);
+  const healthy = await checkHealth();
   if (!healthy) {
-    err(
-      `Chat server not running on port ${CHAT_PORT}.\nStart it with: llama-start`,
-    );
+    err("Ollama not running. Start it with: localcode start");
   }
 
   const model = getActiveChatModel();
-  log(`Benchmarking ${model.name} on port ${CHAT_PORT}...`);
+  log(`Benchmarking ${model.name} (${model.ollamaTag})...`);
   console.log(`${DIM}Running ${PROMPTS.length} prompts (this may take a minute)...${RESET}`);
 
   const results: PromptResult[] = [];
   for (const prompt of PROMPTS) {
     process.stdout.write(`  ${prompt.label}...`);
     try {
-      const result = await runPrompt(CHAT_PORT, prompt);
+      const result = await runPrompt(model, prompt);
       results.push(result);
       console.log(` ${result.tokensPerSec.toFixed(1)} tok/s`);
     } catch (e) {
```
src/commands/models.ts (+23 -29)

```diff
-import { existsSync } from "node:fs";
-import { join } from "node:path";
+import { execSync } from "node:child_process";
 import { MODELS, getChatModels, getAutocompleteModels, getModelById } from "../registry/models.js";
 import { loadConfig, saveConfig, getActiveChatModel, getActiveAutocompleteModel } from "../runtime-config.js";
 import { createLauncherScripts } from "../steps/scripts.js";
 import { writeTuiConfig } from "../steps/aider-config.js";
-import { MODELS_DIR } from "../config.js";
 import { log, err } from "../log.js";
 import { runPassthrough } from "../util.js";
-import { mkdirSync } from "node:fs";
 
 const BOLD = "\x1b[1m";
 const GREEN = "\x1b[0;32m";
 const DIM = "\x1b[2m";
 const RESET = "\x1b[0m";
 
-function isDownloaded(file: string): boolean {
-  return existsSync(join(MODELS_DIR, file));
+function isPulled(ollamaTag: string): boolean {
+  try {
+    const output = execSync("ollama list", { encoding: "utf-8", stdio: ["pipe", "pipe", "pipe"] });
+    return output.includes(ollamaTag);
+  } catch {
+    return false;
+  }
 }
 
 export function listModels(): void {
···
   console.log(`\n${BOLD}Chat models:${RESET}`);
   for (const m of getChatModels()) {
     const active = m.id === activeChatId ? ` ${GREEN}<- active${RESET}` : "";
-    const downloaded = isDownloaded(m.file) ? "" : ` ${DIM}(not downloaded)${RESET}`;
+    const pulled = isPulled(m.ollamaTag) ? "" : ` ${DIM}(not pulled)${RESET}`;
     console.log(
-      `  ${BOLD}${m.id}${RESET}  ${m.name}  ~${m.sizeGB}GB  min ${m.minRAMGB}GB RAM${active}${downloaded}`,
+      `  ${BOLD}${m.id}${RESET}  ${m.name}  ${DIM}${m.ollamaTag}${RESET}${active}${pulled}`,
     );
   }
 
   console.log(`\n${BOLD}Autocomplete models:${RESET}`);
   for (const m of getAutocompleteModels()) {
     const active = m.id === activeAutoId ? ` ${GREEN}<- active${RESET}` : "";
-    const downloaded = isDownloaded(m.file) ? "" : ` ${DIM}(not downloaded)${RESET}`;
+    const pulled = isPulled(m.ollamaTag) ? "" : ` ${DIM}(not pulled)${RESET}`;
     console.log(
-      `  ${BOLD}${m.id}${RESET}  ${m.name}  ~${m.sizeGB}GB  min ${m.minRAMGB}GB RAM${active}${downloaded}`,
+      `  ${BOLD}${m.id}${RESET}  ${m.name}  ${DIM}${m.ollamaTag}${RESET}${active}${pulled}`,
     );
   }
   console.log("");
···
   saveConfig(config);
   log(`Chat model set to ${model.name}`);
 
-  // Download if needed
-  const dest = join(MODELS_DIR, model.file);
-  if (!existsSync(dest)) {
-    mkdirSync(MODELS_DIR, { recursive: true });
-    log(`Downloading ${model.name} (${model.sizeGB}GB)...`);
-    runPassthrough(`curl -L --progress-bar -o "${dest}" "${model.url}"`);
-    log(`Downloaded: ${model.file}`);
+  // Pull if needed
+  if (!isPulled(model.ollamaTag)) {
+    log(`Pulling ${model.name} (${model.ollamaTag})...`);
+    runPassthrough(`ollama pull ${model.ollamaTag}`);
   }
 
-  // Regenerate scripts and configs
+  // Regenerate configs
   await createLauncherScripts();
   await writeTuiConfig();
-  log("Launcher scripts and configs regenerated.");
-  log("Run llama-stop && llama-start to use the new model.");
+  log("Configs regenerated.");
 }
 
 export async function setAutocompleteModel(id: string): Promise<void> {
···
   saveConfig(config);
   log(`Autocomplete model set to ${model.name}`);
 
-  // Download if needed
-  const dest = join(MODELS_DIR, model.file);
-  if (!existsSync(dest)) {
-    mkdirSync(MODELS_DIR, { recursive: true });
-    log(`Downloading ${model.name} (${model.sizeGB}GB)...`);
-    runPassthrough(`curl -L --progress-bar -o "${dest}" "${model.url}"`);
-    log(`Downloaded: ${model.file}`);
+  // Pull if needed
+  if (!isPulled(model.ollamaTag)) {
+    log(`Pulling ${model.name} (${model.ollamaTag})...`);
+    runPassthrough(`ollama pull ${model.ollamaTag}`);
   }
 
   await createLauncherScripts();
   await writeTuiConfig();
-  log("Launcher scripts and configs regenerated.");
-  log("Run llama-stop && llama-start to use the new model.");
+  log("Configs regenerated.");
 }
```
src/commands/pipe.ts (+7 -4)

```diff
-import { CHAT_PORT } from "../config.js";
+import { OLLAMA_URL } from "../config.js";
+import { getActiveChatModel } from "../runtime-config.js";
 import { err } from "../log.js";
 
 export async function runPipe(prompt: string): Promise<void> {
+  const model = getActiveChatModel();
+
   // Read stdin
   const chunks: Buffer[] = [];
   for await (const chunk of process.stdin) {
···
   const input = Buffer.concat(chunks).toString("utf-8");
 
   const body = JSON.stringify({
-    model: "qwen",
+    model: model.ollamaTag,
     messages: [
       {
         role: "system",
···
   let res: Response;
   try {
     res = await fetch(
-      `http://127.0.0.1:${CHAT_PORT}/v1/chat/completions`,
+      `${OLLAMA_URL}/v1/chat/completions`,
       {
         method: "POST",
         headers: { "Content-Type": "application/json" },
···
       },
     );
   } catch {
-    err(`Chat server not running on port ${CHAT_PORT}.\nStart it with: localcode start`);
+    err("Ollama not running. Start it with: localcode start");
   }
 
   if (!res!.ok) {
```
src/commands/proxy.ts (+2)

```diff
+// This file is unused — Ollama handles tool calling natively.
+// Safe to delete.
```
src/commands/run.ts (+23 -66)

```diff
 import { spawn, execSync } from "node:child_process";
 import { existsSync } from "node:fs";
-import { join } from "node:path";
-import { CHAT_PORT, LAUNCH_DIR, AIDER_ENV_FILE } from "../config.js";
-import { getActiveTui } from "../runtime-config.js";
-import { log, warn } from "../log.js";
-
-async function waitForHealth(
-  port: number,
-  timeoutSec: number,
-): Promise<boolean> {
-  for (let i = 0; i < timeoutSec; i++) {
-    try {
-      const res = await fetch(`http://127.0.0.1:${port}/health`);
-      if (res.ok) return true;
-    } catch {
-      // not ready yet
-    }
-    await new Promise((r) => setTimeout(r, 2000));
-    process.stdout.write(".");
-  }
-  return false;
-}
-
-async function ensureServer(): Promise<void> {
-  try {
-    const res = await fetch(`http://127.0.0.1:${CHAT_PORT}/health`);
-    if (res.ok) return;
-  } catch {
-    // not running
-  }
-
-  const serverScript = join(LAUNCH_DIR, "llama-chat-server");
-  if (!existsSync(serverScript)) {
-    warn("Server script not found. Run: localcode setup");
-    process.exit(1);
-  }
-
-  log("Starting llama.cpp chat server...");
-  const child = spawn(serverScript, [], {
-    stdio: "ignore",
-    detached: true,
-  });
-  child.unref();
-
-  process.stdout.write("Waiting for model to load");
-  const ready = await waitForHealth(CHAT_PORT, 120);
-  if (ready) {
-    console.log(" ready!");
-  } else {
-    console.log(" timed out.");
-    warn("Server may still be loading. Check /tmp/llama-chat.log");
-  }
-}
+import { OLLAMA_PORT } from "../config.js";
+import { getActiveTui, getActiveChatModel } from "../runtime-config.js";
+import { log } from "../log.js";
+import { ensureOllama } from "./server.js";
 
 function ensureGit(): void {
   if (!existsSync(".git")) {
···
 }
 
 export async function runDefault(args: string[]): Promise<void> {
-  await ensureServer();
+  await ensureOllama();
   ensureGit();
 
   const tui = getActiveTui();
+  const chatModel = getActiveChatModel();
+  let tuiArgs: string[];
 
-  // Set up env for aider
-  if (tui.id === "aider" && existsSync(AIDER_ENV_FILE)) {
-    const { readFileSync } = await import("node:fs");
-    const env = readFileSync(AIDER_ENV_FILE, "utf-8");
-    for (const line of env.split("\n")) {
-      if (line && !line.startsWith("#")) {
-        const eq = line.indexOf("=");
-        if (eq > 0) {
-          process.env[line.slice(0, eq)] = line.slice(eq + 1);
-        }
-      }
-    }
+  if (tui.id === "aider") {
+    process.env.OPENAI_API_KEY = "sk-not-needed";
+    process.env.OPENAI_API_BASE = `http://127.0.0.1:${OLLAMA_PORT}/v1`;
+    tuiArgs = [
+      "--model", `openai/${chatModel.id}`,
+      "--no-show-model-warnings",
+      "--no-check-update",
+      ...args,
+    ];
+  } else {
+    tuiArgs = args;
+  }
+
+  if (tui.resumeArgs) {
+    tuiArgs.push(...tui.resumeArgs);
   }
 
   log(`Launching ${tui.name}...`);
-  const child = spawn(tui.checkCmd, args, { stdio: "inherit" });
+  const child = spawn(tui.checkCmd, tuiArgs, { stdio: "inherit" });
   child.on("exit", (code) => process.exit(code ?? 0));
 }
```
src/commands/server.ts (+54 -63)

```diff
 import { spawn, execSync } from "node:child_process";
-import { existsSync } from "node:fs";
-import { join } from "node:path";
-import { CHAT_PORT, AUTOCOMPLETE_PORT, LAUNCH_DIR } from "../config.js";
+import { OLLAMA_URL } from "../config.js";
 import {
   getActiveChatModel,
   getActiveAutocompleteModel,
 } from "../runtime-config.js";
 import { log, warn, err } from "../log.js";
+import { runPassthrough } from "../util.js";
 
-async function waitForHealth(port: number, label: string): Promise<boolean> {
-  process.stdout.write(`  Waiting for ${label}`);
-  for (let i = 0; i < 60; i++) {
-    try {
-      const res = await fetch(`http://127.0.0.1:${port}/health`);
-      if (res.ok) {
-        console.log(" ready!");
-        return true;
-      }
-    } catch {
-      // not ready
+async function ollamaHealthy(): Promise<boolean> {
+  try {
+    const res = await fetch(`${OLLAMA_URL}/api/tags`);
+    return res.ok;
+  } catch {
+    return false;
+  }
+}
+
+async function waitForOllama(timeoutSec: number): Promise<boolean> {
+  process.stdout.write("  Waiting for Ollama");
+  for (let i = 0; i < timeoutSec; i++) {
+    if (await ollamaHealthy()) {
+      console.log(" ready!");
+      return true;
     }
     process.stdout.write(".");
-    await new Promise((r) => setTimeout(r, 2000));
+    await new Promise((r) => setTimeout(r, 1000));
   }
   console.log(" timed out.");
   return false;
 }
 
-export async function startServers(): Promise<void> {
-  const chatModel = getActiveChatModel();
-  const autoModel = getActiveAutocompleteModel();
+export async function ensureOllama(): Promise<void> {
+  if (await ollamaHealthy()) return;
 
-  const chatScript = join(LAUNCH_DIR, "llama-chat-server");
-  const autoScript = join(LAUNCH_DIR, "llama-complete-server");
+  log("Starting Ollama...");
+  const child = spawn("ollama", ["serve"], {
+    stdio: "ignore",
+    detached: true,
+    env: { ...process.env, OLLAMA_HOST: "127.0.0.1" },
+  });
+  child.unref();
 
-  if (!existsSync(chatScript)) {
-    err("Server scripts not found. Run: localcode setup");
+  const ready = await waitForOllama(15);
+  if (!ready) {
+    err("Ollama failed to start. Run: ollama serve");
   }
-
-  log(`Starting servers...`);
+}
 
-  // Chat server
-  let chatAlready = false;
+function isModelPulled(ollamaTag: string): boolean {
   try {
-    const res = await fetch(`http://127.0.0.1:${CHAT_PORT}/health`);
-    chatAlready = res.ok;
+    const output = execSync("ollama list", { encoding: "utf-8", stdio: ["pipe", "pipe", "pipe"] });
+    return output.includes(ollamaTag);
   } catch {
-    // not running
+    return false;
   }
+}
 
-  if (chatAlready) {
-    log(`Chat server already running on :${CHAT_PORT}`);
-  } else {
-    log(`Chat: ${chatModel.name} on :${CHAT_PORT}`);
-    const c = spawn(chatScript, [], {
-      stdio: ["ignore", "ignore", "ignore"],
-      detached: true,
-    });
-    c.unref();
-    await waitForHealth(CHAT_PORT, "chat model");
+async function pullIfNeeded(ollamaTag: string, label: string): Promise<void> {
+  if (isModelPulled(ollamaTag)) {
+    log(`${label} already pulled: ${ollamaTag}`);
+    return;
   }
+  log(`Pulling ${label}: ${ollamaTag} (this may take a while)...`);
+  runPassthrough(`ollama pull ${ollamaTag}`);
+}
 
-  // Autocomplete server
-  let autoAlready = false;
-  try {
-    const res = await fetch(`http://127.0.0.1:${AUTOCOMPLETE_PORT}/health`);
-    autoAlready = res.ok;
-  } catch {
-    // not running
-  }
+export async function startServers(): Promise<void> {
+  await ensureOllama();
 
-  if (autoAlready) {
-    log(`Autocomplete server already running on :${AUTOCOMPLETE_PORT}`);
-  } else {
-    log(`Autocomplete: ${autoModel.name} on :${AUTOCOMPLETE_PORT}`);
-    const c = spawn(autoScript, [], {
-      stdio: ["ignore", "ignore", "ignore"],
-      detached: true,
-    });
-    c.unref();
-    await waitForHealth(AUTOCOMPLETE_PORT, "autocomplete model");
-  }
+  const chatModel = getActiveChatModel();
+  const autoModel = getActiveAutocompleteModel();
+
+  await pullIfNeeded(chatModel.ollamaTag, `chat model (${chatModel.name})`);
+  await pullIfNeeded(autoModel.ollamaTag, `autocomplete model (${autoModel.name})`);
 
-  console.log("");
-  log("Servers running. Logs: /tmp/llama-chat.log, /tmp/llama-complete.log");
+  log("Ollama is running. Models are pulled and ready.");
 }
 
 export function stopServers(): void {
   try {
-    execSync('pkill -f "llama-server"', { stdio: "ignore" });
-    log("Servers stopped.");
+    execSync("pkill -f ollama", { stdio: "ignore" });
+    log("Ollama stopped.");
   } catch {
-    warn("No servers running.");
+    warn("Ollama not running.");
   }
 }
```
src/commands/setup.ts (+12 -27)

```diff
 import { checkPreflight } from "../steps/preflight.js";
 import { installHomebrew } from "../steps/homebrew.js";
-import { installLlama } from "../steps/llama.js";
+import { installOllama } from "../steps/llama.js";
 import { downloadModels } from "../steps/models.js";
 import { installTools } from "../steps/tools.js";
 import { createLauncherScripts } from "../steps/scripts.js";
 import { writeTuiConfig } from "../steps/aider-config.js";
 import { addToPath } from "../steps/shell-path.js";
-import { MODELS_DIR, AIDER_CONFIG_FILE, AIDER_CONFIG_DIR } from "../config.js";
+import { OLLAMA_PORT } from "../config.js";
 import {
   getActiveChatModel,
   getActiveAutocompleteModel,
···
     `${GREEN}${BOLD}═══════════════════════════════════════════════════${RESET}`,
   );
   console.log("");
-  console.log(`  ${BOLD}Models downloaded to:${RESET} ${MODELS_DIR}`);
+  console.log(`  ${BOLD}Backend:${RESET} Ollama on port ${OLLAMA_PORT}`);
   console.log(
-    `    Chat:          ${chatModel.name} (${chatModel.file}, ~${chatModel.sizeGB}GB)`,
+    `    Chat:          ${chatModel.name} (${chatModel.ollamaTag})`,
   );
   console.log(
-    `    Autocomplete:  ${autocompleteModel.name} (${autocompleteModel.file}, ~${autocompleteModel.sizeGB}GB)`,
+    `    Autocomplete:  ${autocompleteModel.name} (${autocompleteModel.ollamaTag})`,
   );
   console.log("");
   console.log(`  ${BOLD}Active TUI:${RESET} ${tui.name}`);
   console.log("");
-  console.log(`  ${BOLD}Commands available${RESET} (restart your shell first):`);
-  console.log("");
-  console.log(`    ${BOLD}llama-start${RESET}          Start both llama.cpp servers`);
-  console.log(`    ${BOLD}llama-stop${RESET}           Stop all llama.cpp servers`);
-  console.log("");
-  console.log(
-    `    ${BOLD}ai-code${RESET} [dir]       Full coding agent (auto-starts server)`,
-  );
-  console.log(
-    `    ${BOLD}ai-ask${RESET} "question"   Quick coding Q&A, no file edits`,
-  );
-  console.log(
-    `    ${BOLD}ai-pipe${RESET} "prompt"    Pipe code through the model`,
-  );
-  console.log("");
-  console.log(`  ${BOLD}Config:${RESET} ${AIDER_CONFIG_FILE}`);
-  console.log(`  ${BOLD}API env:${RESET} ${AIDER_CONFIG_DIR}/.env`);
-  console.log(
-    `  ${BOLD}Server logs:${RESET} /tmp/llama-chat.log, /tmp/llama-complete.log`,
-  );
+  console.log(`  ${BOLD}Usage:${RESET}`);
+  console.log(`    ${BOLD}localcode${RESET}            Launch ${tui.name} in current directory`);
+  console.log(`    ${BOLD}localcode start${RESET}      Start Ollama + pull models`);
+  console.log(`    ${BOLD}localcode stop${RESET}       Stop Ollama`);
+  console.log(`    ${BOLD}localcode status${RESET}     Show config and server health`);
   console.log("");
   console.log(
     `  Run ${BOLD}source ${shellRC}${RESET} or open a new terminal to get started.`,
···
 
 export async function runSetup(): Promise<void> {
   console.log(
-    `\n${BOLD}Local AI Coding Environment Installer (llama.cpp)${RESET}\n`,
+    `\n${BOLD}Local AI Coding Environment Installer (Ollama)${RESET}\n`,
   );
 
   checkPreflight();
   installHomebrew();
-  installLlama();
+  installOllama();
   downloadModels();
   installTools();
   await createLauncherScripts();
```
src/commands/status.ts (+7 -10)

```diff
-import { CHAT_PORT, AUTOCOMPLETE_PORT, LAUNCH_DIR } from "../config.js";
+import { OLLAMA_URL, OLLAMA_PORT } from "../config.js";
 import {
   getActiveChatModel,
   getActiveAutocompleteModel,
···
 const DIM = "\x1b[2m";
 const RESET = "\x1b[0m";
 
-async function checkHealth(port: number): Promise<boolean> {
+async function checkOllama(): Promise<boolean> {
   try {
-    const res = await fetch(`http://127.0.0.1:${port}/health`);
+    const res = await fetch(`${OLLAMA_URL}/api/tags`);
     return res.ok;
   } catch {
     return false;
···
   const autoModel = getActiveAutocompleteModel();
   const tui = getActiveTui();
 
-  const chatOk = await checkHealth(CHAT_PORT);
-  const autoOk = await checkHealth(AUTOCOMPLETE_PORT);
+  const ollamaOk = await checkOllama();
 
   const on = `${GREEN}running${RESET}`;
   const off = `${RED}stopped${RESET}`;
···
   console.log(`
 ${BOLD}localcode${RESET} — current configuration
 
-  ${BOLD}Chat model:${RESET}          ${chatModel.name} ${DIM}(${chatModel.id})${RESET}
-  ${BOLD}Autocomplete model:${RESET}  ${autoModel.name} ${DIM}(${autoModel.id})${RESET}
+  ${BOLD}Chat model:${RESET}          ${chatModel.name} ${DIM}(${chatModel.ollamaTag})${RESET}
+  ${BOLD}Autocomplete model:${RESET}  ${autoModel.name} ${DIM}(${autoModel.ollamaTag})${RESET}
   ${BOLD}Active TUI:${RESET}          ${tui.name} ${DIM}(${tui.id})${RESET}
 
-  ${BOLD}Chat server:${RESET}   :${CHAT_PORT} ${chatOk ? on : off}
-  ${BOLD}Autocomplete:${RESET}  :${AUTOCOMPLETE_PORT} ${autoOk ? on : off}
-  ${BOLD}Scripts:${RESET}       ${LAUNCH_DIR}
+  ${BOLD}Ollama:${RESET}  :${OLLAMA_PORT} ${ollamaOk ? on : off}
 `);
 }
```
src/config.ts (+4 -3)

```diff
 import { homedir } from "node:os";
 import { join } from "node:path";
 
-export const MODELS_DIR = join(homedir(), ".local/share/llama-models");
-export const CHAT_PORT = 8080;
-export const AUTOCOMPLETE_PORT = 8081;
+export const OLLAMA_HOST = "127.0.0.1";
+export const OLLAMA_PORT = 11434;
+export const OLLAMA_URL = `http://${OLLAMA_HOST}:${OLLAMA_PORT}`;
 export const LAUNCH_DIR = join(homedir(), ".local/bin");
 export const AIDER_CONFIG_DIR = join(homedir(), ".aider");
 export const AIDER_CONFIG_FILE = join(AIDER_CONFIG_DIR, "aider.conf.yml");
···
 export const OPENCODE_CONFIG_FILE = join(OPENCODE_CONFIG_DIR, "opencode.json");
 export const PI_CONFIG_DIR = join(homedir(), ".pi", "agent");
 export const PI_MODELS_FILE = join(PI_CONFIG_DIR, "models.json");
+export const PI_SETTINGS_FILE = join(PI_CONFIG_DIR, "settings.json");
```
src/main.ts (+32 -37)

```diff
 
 function printUsage(): void {
   console.log(`
-${BOLD}localcode${RESET} — local AI coding environment
+${BOLD}localcode${RESET} — local AI coding environment (Ollama)
 
 ${BOLD}Usage:${RESET}
   localcode ${DIM}[flags...]${RESET}          Launch active TUI in current directory
   localcode status                 Show current config and server health
 
 ${BOLD}Server:${RESET}
-  localcode start                  Start chat + autocomplete servers
-  localcode stop                   Stop all servers
+  localcode start                  Start Ollama + pull models
+  localcode stop                   Stop Ollama
 
-${BOLD}Models:${RESET}
-  localcode models                 List available models
-  localcode models set-chat <id>   Switch the chat model
-  localcode models set-auto <id>   Switch the autocomplete model
-
-${BOLD}TUIs:${RESET}
-  localcode tuis                   List available TUIs
-  localcode tuis set <id>          Switch the active TUI
+${BOLD}Config:${RESET}
+  localcode model                  List available models
+  localcode set model <id>         Switch the chat model
+  localcode set autocomplete <id>  Switch the autocomplete model
+  localcode tui                    List available TUIs
+  localcode set tui <id>           Switch the active TUI
 
 ${BOLD}Benchmark:${RESET}
   localcode bench                  Benchmark the running chat model
···
 
 ${BOLD}Other:${RESET}
   localcode pipe "prompt"          Pipe stdin through the model
-  localcode setup                  Full install (models, tools, scripts)
+  localcode setup                  Full install (Ollama, models, tools)
 `);
 }
···
       stopServers();
       break;
 
-    case "models": {
-      const sub = process.argv[3];
-      if (!sub) {
-        listModels();
-      } else if (sub === "set-chat") {
-        const id = process.argv[4];
+    case "model":
+    case "models":
+      listModels();
+      break;
+
+    case "tui":
+    case "tuis":
+      listTuis();
+      break;
+
+    case "set": {
+      const what = process.argv[3];
+      const id = process.argv[4];
+
+      if (what === "model" || what === "chat") {
         if (!id) {
-          console.error("Usage: localcode models set-chat <model-id>");
+          console.error("Usage: localcode set model <model-id>");
           process.exit(1);
         }
         await setChatModel(id);
-      } else if (sub === "set-auto" || sub === "set-autocomplete") {
-        const id = process.argv[4];
+      } else if (what === "autocomplete" || what === "auto") {
         if (!id) {
-          console.error("Usage: localcode models set-auto <model-id>");
+          console.error("Usage: localcode set autocomplete <model-id>");
           process.exit(1);
         }
         await setAutocompleteModel(id);
-      } else {
-        console.error(`Unknown: localcode models ${sub}`);
-        process.exit(1);
-      }
-      break;
-    }
-
-    case "tuis": {
-      const sub = process.argv[3];
-      if (!sub) {
-        listTuis();
-      } else if (sub === "set") {
-        const id = process.argv[4];
+      } else if (what === "tui") {
         if (!id) {
-          console.error("Usage: localcode tuis set <tui-id>");
+          console.error("Usage: localcode set tui <tui-id>");
           process.exit(1);
         }
         await setTui(id);
       } else {
-        console.error(`Unknown: localcode tuis ${sub}`);
+        console.error(`Unknown: localcode set ${what ?? ""}`);
+        console.error("Usage: localcode set model|autocomplete|tui <id>");
         process.exit(1);
       }
       break;
```
+11 -16
src/registry/models.ts
··· 2 2 id: string; 3 3 name: string; 4 4 role: "chat" | "autocomplete"; 5 - file: string; 6 - url: string; 7 - sizeGB: number; 5 + ollamaTag: string; 8 6 ctxSize: number; 9 - minRAMGB: number; 10 7 } 11 8 12 9 export const MODELS: ModelDef[] = [ ··· 14 11 id: "qwen-32b-chat", 15 12 name: "Qwen 2.5 Coder 32B", 16 13 role: "chat", 17 - file: "qwen2.5-coder-32b-instruct-q4_k_m.gguf", 18 - url: "https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct-GGUF/resolve/main/qwen2.5-coder-32b-instruct-q4_k_m.gguf", 19 - sizeGB: 20, 14 + ollamaTag: "qwen2.5-coder:32b", 20 15 ctxSize: 16384, 21 - minRAMGB: 32, 22 16 }, 23 17 { 24 18 id: "qwen-14b-chat", 25 19 name: "Qwen 2.5 Coder 14B", 26 20 role: "chat", 27 - file: "qwen2.5-coder-14b-instruct-q4_k_m.gguf", 28 - url: "https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct-GGUF/resolve/main/qwen2.5-coder-14b-instruct-q4_k_m.gguf", 29 - sizeGB: 9, 21 + ollamaTag: "qwen2.5-coder:14b", 22 + ctxSize: 16384, 23 + }, 24 + { 25 + id: "qwen-7b-chat", 26 + name: "Qwen 2.5 Coder 7B", 27 + role: "chat", 28 + ollamaTag: "qwen2.5-coder:7b", 30 29 ctxSize: 16384, 31 - minRAMGB: 16, 32 30 }, 33 31 { 34 32 id: "qwen-1.5b-autocomplete", 35 33 name: "Qwen 2.5 Coder 1.5B", 36 34 role: "autocomplete", 37 - file: "qwen2.5-coder-1.5b-instruct-q4_k_m.gguf", 38 - url: "https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/resolve/main/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf", 39 - sizeGB: 1.2, 35 + ollamaTag: "qwen2.5-coder:1.5b", 40 36 ctxSize: 4096, 41 - minRAMGB: 8, 42 37 }, 43 38 ]; 44 39
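With `file`, `url`, `sizeGB`, and `minRAMGB` gone, everything downstream only needs the id → tag mapping. A minimal sketch of that lookup, using the `ModelDef` shape introduced in this hunk (the `tagFor` helper itself is illustrative, not in the PR):

```typescript
// Sketch: resolving a registry id to its Ollama tag. The ModelDef shape
// matches this PR; the `tagFor` helper itself is illustrative only.
interface ModelDef {
  id: string;
  name: string;
  role: "chat" | "autocomplete";
  ollamaTag: string;
  ctxSize: number;
}

const MODELS: ModelDef[] = [
  { id: "qwen-7b-chat", name: "Qwen 2.5 Coder 7B", role: "chat", ollamaTag: "qwen2.5-coder:7b", ctxSize: 16384 },
  { id: "qwen-1.5b-autocomplete", name: "Qwen 2.5 Coder 1.5B", role: "autocomplete", ollamaTag: "qwen2.5-coder:1.5b", ctxSize: 4096 },
];

function tagFor(id: string): string {
  const model = MODELS.find((m) => m.id === id);
  if (!model) throw new Error(`Unknown model id: ${id}`);
  return model.ollamaTag;
}

console.log(tagFor("qwen-7b-chat"));
```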
+3
src/registry/tuis.ts
··· 4 4 installCmd: string; 5 5 checkCmd: string; 6 6 launchArgs: string; 7 + /** Extra args to pass when resuming a previous session. */ 8 + resumeArgs?: string[]; 7 9 } 8 10 9 11 export const TUIS: TuiDef[] = [ ··· 27 29 installCmd: "npm install -g @mariozechner/pi-coding-agent", 28 30 checkCmd: "pi", 29 31 launchArgs: '"$@"', 32 + resumeArgs: ["--continue"], 30 33 }, 31 34 ]; 32 35
+3 -1
src/steps/aider-config.ts
··· 5 5 AIDER_ENV_FILE, 6 6 OPENCODE_CONFIG_FILE, 7 7 PI_MODELS_FILE, 8 + PI_SETTINGS_FILE, 8 9 } from "../config.js"; 9 10 import { getActiveChatModel, getActiveTui } from "../runtime-config.js"; 10 11 import { aiderConfig, aiderEnv } from "../templates/aider.js"; 11 12 import { opencodeConfig } from "../templates/opencode.js"; 12 - import { piModelsConfig } from "../templates/pi.js"; 13 + import { piModelsConfig, piSettingsConfig } from "../templates/pi.js"; 13 14 14 15 export async function writeTuiConfig(): Promise<void> { 15 16 const chatModel = getActiveChatModel(); ··· 30 31 PI_MODELS_FILE, 31 32 piModelsConfig(chatModel.id, chatModel.name), 32 33 ); 34 + await writeConfig(PI_SETTINGS_FILE, piSettingsConfig()); 33 35 log(`Pi config written to ${PI_MODELS_FILE}`); 34 36 35 37 log(`Active TUI: ${tui.name}`);
+7 -19
src/steps/llama.ts
··· 1 - import { log, warn } from "../log.js"; 2 - import { commandExists, run, runPassthrough } from "../util.js"; 1 + import { log } from "../log.js"; 2 + import { commandExists, runPassthrough } from "../util.js"; 3 3 4 - export function installLlama(): void { 5 - if (commandExists("llama-server")) { 6 - log("llama.cpp already installed."); 4 + export function installOllama(): void { 5 + if (commandExists("ollama")) { 6 + log("Ollama already installed."); 7 7 } else { 8 - log("Installing llama.cpp via Homebrew..."); 9 - runPassthrough("brew install llama.cpp"); 10 - } 11 - 12 - // Verify Metal support 13 - try { 14 - const help = run("llama-server --help 2>&1", { silent: true }); 15 - if (/metal/i.test(help)) { 16 - log("Metal (GPU) acceleration available."); 17 - } else { 18 - warn("Metal flag not detected — model will run on CPU only."); 19 - } 20 - } catch { 21 - warn("Could not verify Metal support."); 8 + log("Installing Ollama via Homebrew..."); 9 + runPassthrough("brew install ollama"); 22 10 } 23 11 }
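A Homebrew install alone does not leave a server running, so callers still need a liveness check before pulling models. A hedged sketch against Ollama's default port — the PR's actual start/health logic lives elsewhere and is not shown in this hunk:

```typescript
// Sketch: liveness check against Ollama's default port before pulling models.
// Illustrative only — the PR's own start/health logic is not shown in this hunk.
async function ollamaIsUp(port = 11434): Promise<boolean> {
  try {
    // /api/version is a cheap endpoint any running Ollama server answers.
    const res = await fetch(`http://127.0.0.1:${port}/api/version`);
    return res.ok;
  } catch {
    return false;
  }
}
```

On Node 18+ the global `fetch` suffices; passing `AbortSignal.timeout(…)` would harden it against a hung socket.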
+17 -13
src/steps/models.ts
··· 1 - import { existsSync, mkdirSync } from "node:fs"; 2 - import { join } from "node:path"; 1 + import { execSync } from "node:child_process"; 3 2 import { log } from "../log.js"; 4 3 import { runPassthrough } from "../util.js"; 5 - import { MODELS_DIR } from "../config.js"; 6 4 import { getActiveChatModel, getActiveAutocompleteModel } from "../runtime-config.js"; 7 5 import type { ModelDef } from "../registry/models.js"; 8 6 9 - function downloadModel(model: ModelDef): void { 10 - const dest = join(MODELS_DIR, model.file); 11 - if (existsSync(dest)) { 12 - log(`Model already downloaded: ${model.file}`); 7 + function isPulled(ollamaTag: string): boolean { 8 + try { 9 + const output = execSync("ollama list", { encoding: "utf-8", stdio: ["pipe", "pipe", "pipe"] }); 10 + return output.includes(ollamaTag); 11 + } catch { 12 + return false; 13 + } 14 + } 15 + 16 + function pullModel(model: ModelDef): void { 17 + if (isPulled(model.ollamaTag)) { 18 + log(`Model already pulled: ${model.ollamaTag}`); 13 19 return; 14 20 } 15 21 16 - log(`Downloading ${model.name} (${model.sizeGB}GB, this may take a while)...`); 17 - runPassthrough(`curl -L --progress-bar -o "${dest}" "${model.url}"`); 18 - log(`Downloaded: ${model.file}`); 22 + log(`Pulling ${model.name} (${model.ollamaTag})...`); 23 + runPassthrough(`ollama pull ${model.ollamaTag}`); 19 24 } 20 25 21 26 export function downloadModels(): void { 22 - mkdirSync(MODELS_DIR, { recursive: true }); 23 - downloadModel(getActiveChatModel()); 24 - downloadModel(getActiveAutocompleteModel()); 27 + pullModel(getActiveChatModel()); 28 + pullModel(getActiveAutocompleteModel()); 25 29 }
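`isPulled` matches the tag as a substring of the whole `ollama list` output, which is fine for the registry above but could false-positive if one tag is ever a prefix of another. A stricter alternative is to parse the NAME column; a sketch, assuming the usual NAME/ID/SIZE/MODIFIED table layout:

```typescript
// Sketch: extracting tags from `ollama list` output by taking the first
// column of each row and skipping the header. The sample below assumes the
// usual NAME/ID/SIZE/MODIFIED table format.
function parseOllamaTags(listOutput: string): string[] {
  return listOutput
    .split("\n")
    .slice(1) // skip the header row
    .map((line) => line.trim().split(/\s+/)[0])
    .filter((tag) => tag.length > 0);
}

const sample = [
  "NAME                  ID            SIZE    MODIFIED",
  "qwen2.5-coder:7b      abc123def456  4.7 GB  2 days ago",
  "qwen2.5-coder:1.5b    789abc012def  986 MB  2 days ago",
  "",
].join("\n");

console.log(parseOllamaTags(sample));
```

An exact-match check would then be `parseOllamaTags(output).includes(model.ollamaTag)`.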
+3 -22
src/steps/scripts.ts
··· 2 2 import { log } from "../log.js"; 3 3 import { writeExecutable } from "../util.js"; 4 4 import { LAUNCH_DIR } from "../config.js"; 5 - import { 6 - getActiveChatModel, 7 - getActiveAutocompleteModel, 8 - } from "../runtime-config.js"; 9 - import { 10 - llamaChatServer, 11 - llamaCompleteServer, 12 - localcodeWrapper, 13 - } from "../templates/scripts.js"; 5 + import { localcodeWrapper } from "../templates/scripts.js"; 14 6 15 7 export async function createLauncherScripts(): Promise<void> { 16 - const chatModel = getActiveChatModel(); 17 - const autocompleteModel = getActiveAutocompleteModel(); 18 - 19 8 // Resolve the project directory (where dist/main.js lives) 20 9 const projectDir = join(import.meta.dirname, "../.."); 21 10 22 - const scripts: [string, string][] = [ 23 - ["llama-chat-server", llamaChatServer(chatModel)], 24 - ["llama-complete-server", llamaCompleteServer(autocompleteModel)], 25 - ["localcode", localcodeWrapper(projectDir)], 26 - ]; 27 - 28 - for (const [name, content] of scripts) { 29 - await writeExecutable(join(LAUNCH_DIR, name), content); 30 - } 31 - log(`Launcher scripts written to ${LAUNCH_DIR}`); 11 + await writeExecutable(join(LAUNCH_DIR, "localcode"), localcodeWrapper(projectDir)); 12 + log(`Launcher script written to ${LAUNCH_DIR}`); 32 13 }
+6 -5
src/templates/aider.ts
··· 1 + import { OLLAMA_PORT } from "../config.js"; 2 + 1 3 export function aiderConfig(modelName: string): string { 2 4 return `# ============================================================================= 3 - # Aider Configuration — ${modelName} via llama.cpp 5 + # Aider Configuration — ${modelName} via Ollama 4 6 # ============================================================================= 5 7 6 - # Point Aider at llama.cpp's OpenAI-compatible endpoint 7 - # The model name can be anything — llama.cpp ignores it and uses the loaded model 8 + # Point Aider at Ollama's OpenAI-compatible endpoint 8 9 model: openai/${modelName} 9 10 10 11 # Architect mode for better code planning ··· 32 33 } 33 34 34 35 export function aiderEnv(): string { 35 - return `# llama.cpp serves an OpenAI-compatible API — no real key needed 36 + return `# Ollama serves an OpenAI-compatible API — no real key needed 36 37 OPENAI_API_KEY=sk-not-needed 37 - OPENAI_API_BASE=http://127.0.0.1:8080/v1 38 + OPENAI_API_BASE=http://127.0.0.1:${OLLAMA_PORT}/v1 38 39 `; 39 40 }
+5 -7
src/templates/opencode.ts
··· 1 - import { CHAT_PORT } from "../config.js"; 1 + import { OLLAMA_PORT } from "../config.js"; 2 2 3 3 export function opencodeConfig(modelId: string, modelName: string): string { 4 4 return JSON.stringify( 5 5 { 6 - $schema: "https://opencode.ai/config.json", 7 - model: `llama-cpp/${modelId}`, 6 + model: `ollama/${modelId}`, 8 7 provider: { 9 - "llama-cpp": { 8 + ollama: { 10 9 npm: "@ai-sdk/openai-compatible", 11 - name: "llama.cpp (local)", 10 + name: "Ollama (local)", 12 11 options: { 13 - baseURL: `http://127.0.0.1:${CHAT_PORT}/v1`, 12 + baseURL: `http://127.0.0.1:${OLLAMA_PORT}/v1`, 14 13 apiKey: "not-needed", 15 14 }, 16 15 models: { 17 16 [modelId]: { 18 17 name: modelName, 19 - tools: true, 20 18 }, 21 19 }, 22 20 },
+13 -3
src/templates/pi.ts
··· 1 - import { CHAT_PORT } from "../config.js"; 1 + import { OLLAMA_PORT } from "../config.js"; 2 2 3 3 export function piModelsConfig(modelId: string, modelName: string): string { 4 4 return JSON.stringify( 5 5 { 6 6 providers: { 7 - "llama-cpp": { 8 - baseUrl: `http://127.0.0.1:${CHAT_PORT}/v1`, 7 + ollama: { 8 + baseUrl: `http://127.0.0.1:${OLLAMA_PORT}/v1`, 9 9 api: "openai-completions", 10 10 apiKey: "not-needed", 11 11 models: [ ··· 21 21 2, 22 22 ) + "\n"; 23 23 } 24 + 25 + export function piSettingsConfig(): string { 26 + return JSON.stringify( 27 + { 28 + packages: ["@gordonb/pi-memory-blocks"], 29 + }, 30 + null, 31 + 2, 32 + ) + "\n"; 33 + }
+2 -44
src/templates/scripts.ts
··· 1 - import { MODELS_DIR, CHAT_PORT, AUTOCOMPLETE_PORT } from "../config.js"; 2 - import type { ModelDef } from "../registry/models.js"; 3 - 4 - // Use this to emit a literal $ in template literals without triggering interpolation. 5 - const D = "$"; 6 - 7 - export function llamaChatServer(model: ModelDef): string { 8 - return `#!/usr/bin/env bash 9 - # Start llama.cpp server with ${model.name} for chat 10 - # Exposed as OpenAI-compatible API on port ${CHAT_PORT} 11 - 12 - MODEL="${MODELS_DIR}/${model.file}" 13 - 14 - exec llama-server \\ 15 - --model "${D}MODEL" \\ 16 - --port ${CHAT_PORT} \\ 17 - --host 127.0.0.1 \\ 18 - --ctx-size ${model.ctxSize} \\ 19 - --n-gpu-layers 99 \\ 20 - --threads ${D}(sysctl -n hw.perflevel0.logicalcpu 2>/dev/null || echo 4) \\ 21 - --mlock \\ 22 - "${D}@" 23 - `; 24 - } 25 - 26 - export function llamaCompleteServer(model: ModelDef): string { 27 - return `#!/usr/bin/env bash 28 - # Start llama.cpp server with ${model.name} for autocomplete 29 - # Exposed as OpenAI-compatible API on port ${AUTOCOMPLETE_PORT} 30 - 31 - MODEL="${MODELS_DIR}/${model.file}" 32 - 33 - exec llama-server \\ 34 - --model "${D}MODEL" \\ 35 - --port ${AUTOCOMPLETE_PORT} \\ 36 - --host 127.0.0.1 \\ 37 - --ctx-size ${model.ctxSize} \\ 38 - --n-gpu-layers 99 \\ 39 - --threads ${D}(sysctl -n hw.perflevel0.logicalcpu 2>/dev/null || echo 4) \\ 40 - --mlock \\ 41 - "${D}@" 42 - `; 43 - } 44 - 45 1 export function localcodeWrapper(projectDir: string): string { 2 + // Use this to emit a literal $ in template literals without triggering interpolation. 3 + const D = "$"; 46 4 return `#!/usr/bin/env bash 47 5 # localcode — local AI coding environment manager 48 6 exec node "${projectDir}/dist/main.js" "${D}@"
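The surviving `localcodeWrapper` keeps the `D = "$"` trick for emitting a literal `$` inside a template literal. In isolation:

```typescript
// The one-character constant interpolates to a literal `$`, so the emitted
// bash script contains `"$@"` rather than a TypeScript substitution.
const D = "$";

function wrapper(projectDir: string): string {
  return `#!/usr/bin/env bash
# localcode — local AI coding environment manager
exec node "${projectDir}/dist/main.js" "${D}@"
`;
}

console.log(wrapper("/opt/localcode"));
```

Strictly, only the `${` sequence triggers interpolation in a template literal, so `"$@"` could be written directly; the constant mainly signals intent and guards against shell constructs that do start with `${`.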