Replace llama.cpp backend with Ollama
A single Ollama server on port 11434 replaces the two llama-server
processes, the tool-call rewriting proxy, the manual GGUF downloads,
and the bash launcher scripts. Models are now managed with ollama pull,
using tags like qwen2.5-coder:32b. This also eliminates the crashes in
llama.cpp's grammar-constrained (GBNF) decoding, which is no longer in
the request path.
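
For reference, a minimal smoke test of the new setup. The model tag is
the one above; the endpoint and payload shape follow Ollama's
documented CLI and /api/chat interface, and the prompt text is
illustrative:

    # Pull the model once; Ollama downloads and stores the weights
    # itself, so no manual GGUF handling is needed.
    ollama pull qwen2.5-coder:32b

    # Non-streaming chat completion against the single server.
    curl -s http://localhost:11434/api/chat -d '{
      "model": "qwen2.5-coder:32b",
      "stream": false,
      "messages": [{"role": "user", "content": "Say hello."}]
    }'

Ollama also accepts OpenAI-style tool definitions directly on
/api/chat, which is what makes the rewriting proxy unnecessary. A
sketch with a hypothetical get_weather tool (not part of this repo):

    curl -s http://localhost:11434/api/chat -d '{
      "model": "qwen2.5-coder:32b",
      "stream": false,
      "messages": [{"role": "user", "content": "Weather in Berlin?"}],
      "tools": [{
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Hypothetical example tool",
          "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
          }
        }
      }]
    }'
    # Any tool invocation comes back as structured JSON in
    # message.tool_calls.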