Ollama Setup Guide: From Zero to Running AI in 10 Minutes
The no-bullshit guide to running your first local LLM. Install Ollama, pick a model, and start chatting — with zero cloud dependencies.
What is Ollama?
Ollama is the easiest way to run AI models on your own computer. One command to install, one command to run any model. No Python environment, no Docker, no config files.
It wraps llama.cpp (a fast, widely used open-source inference engine) with a simple CLI and an OpenAI-compatible API. Many regulars on r/LocalLLaMA use Ollama as their daily driver.
Install (30 seconds)
macOS/Linux:
curl -fsSL https://ollama.com/install.sh | sh
Windows: Download from ollama.com/download
That's it. Ollama auto-detects your GPU (NVIDIA, AMD, Apple Silicon) and configures everything.
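To confirm the install worked, run `ollama --version` in a terminal. If you'd rather script that check, here's a small Python sketch (an illustration, not part of Ollama itself):

```python
import shutil
import subprocess

def ollama_installed() -> bool:
    """Check that the ollama binary is on PATH and responds to --version."""
    if shutil.which("ollama") is None:
        return False
    result = subprocess.run(["ollama", "--version"], capture_output=True, text=True)
    return result.returncode == 0
```
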
Pick Your First Model
Based on your VRAM:
| VRAM | Best starter model | Command |
|---|---|---|
| 4-6 GB | Phi-4 Mini 3.8B | ollama run phi4-mini |
| 8 GB | Qwen3 8B | ollama run qwen3:8b |
| 12 GB | Qwen3 14B | ollama run qwen3:14b |
| 16 GB | Mistral Small 24B (Q4) | ollama run mistral-small |
| 24 GB | Qwen 2.5 Coder 32B | ollama run qwen2.5-coder:32b |
Don't know your VRAM? Use FitMyLLM — it auto-detects your GPU and recommends the best model.
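The table above roughly follows a back-of-envelope rule you can apply yourself. As a ballpark assumption on my part (not an official Ollama formula): a 4-bit quantized model takes about 0.6 GB of VRAM per billion parameters, plus 1-2 GB of overhead for context and runtime.

```python
def fits_in_vram(params_billions: float, vram_gb: float,
                 gb_per_billion: float = 0.6, overhead_gb: float = 1.5) -> bool:
    """Rough check: does a Q4-quantized model of this size fit in VRAM?

    The 0.6 GB/B and 1.5 GB overhead figures are ballpark assumptions;
    actual usage varies with context length and quantization.
    """
    return params_billions * gb_per_billion + overhead_gb <= vram_gb

# fits_in_vram(8, 8)   -> an 8B model fits in 8 GB
# fits_in_vram(32, 8)  -> a 32B model does not
```
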
Add a Web UI (5 minutes)
Ollama runs in the terminal by default. For a ChatGPT-like interface:
Open WebUI (the most popular option):
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
Then open http://localhost:3000. You now have a fully private ChatGPT running on your machine.
No Docker? Try LM Studio — a desktop app with built-in GUI. Or Jan — another great GUI option.
Essential Commands
- `ollama list` - see downloaded models
- `ollama run qwen3:8b` - chat with a model
- `ollama pull llama3.3:70b` - download without running
- `ollama rm model-name` - delete a model
- `ollama ps` - see what's running (and whether it's using GPU)
- `ollama run model --verbose` - show speed stats
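If you want to script against the CLI, the output of `ollama list` is a simple header-plus-rows table, so a few lines of Python can wrap it. A minimal sketch (the column layout reflects current Ollama builds and could change):

```python
import subprocess

def parse_model_names(list_output: str) -> list[str]:
    """Pull model names (first column) out of `ollama list` output."""
    lines = list_output.strip().splitlines()
    # First line is the header row (NAME, ID, SIZE, MODIFIED); skip it.
    return [line.split()[0] for line in lines[1:] if line.strip()]

def installed_models() -> list[str]:
    """Return the names of locally downloaded models."""
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return parse_model_names(result.stdout)
```
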
API access: Ollama also serves an OpenAI-compatible API on localhost:11434. Any tool that works with the OpenAI API works with Ollama: just point the base URL at http://localhost:11434/v1.
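As a quick sketch, here's how to call that endpoint from Python's standard library alone. The model name and prompt are placeholders, and the server must be running (`ollama serve`) with the model already pulled:

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for the local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(model: str, prompt: str) -> str:
    """Send one chat turn to the local Ollama server and return the reply."""
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]

# Usage, once the server is up:
# print(chat("qwen3:8b", "Explain quantization in one sentence."))
```
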
References & Further Reading
- Ollama (2026). Ollama Official Documentation
- SitePoint (2026). Ollama Setup Guide 2026
- Open WebUI (2026). Open WebUI
Find the best model for your hardware
Use FitMyLLM to get personalized recommendations based on your GPU, use case, and speed requirements.