Deep Dive · 8 min read · 2026-03-02

How to Build Your Own Private ChatGPT for Free

Step-by-step: Ollama + Open WebUI + a good model = your own ChatGPT that never phones home. About 15 minutes of setup.

What You're Building

By the end of this guide, you'll have a ChatGPT-like web interface running on your own computer. It will:

  • Look and feel like ChatGPT (conversation history, markdown, code highlighting)
  • Run 100% locally — after the initial downloads, no internet needed and no data sent anywhere
  • Cost $0/month to operate
  • Support multiple models (switch between chat, coding, reasoning)
  • Have an OpenAI-compatible API for tools and integrations

Step 1: Install Ollama (2 minutes)

Mac/Linux: curl -fsSL https://ollama.com/install.sh | sh

Windows: Download from ollama.com/download

Verify: ollama --version

Step 2: Download a Model (3 minutes)

Choose based on your VRAM (run nvidia-smi on an NVIDIA GPU, or check About This Mac for unified memory on Apple Silicon):

  • 8 GB: ollama pull qwen3:8b
  • 12-16 GB: ollama pull qwen3:14b
  • 24 GB: ollama pull qwen2.5-coder:32b

Not sure? Use FitMyLLM to find the best model for your exact hardware.
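The VRAM tiers above boil down to a tiny picker function. A sketch (thresholds and tags mirror the list; the below-8 GB fallback is an assumption — the smallest tag will still run, just slower with CPU offload):

```python
# Map available VRAM (GB) to the model tag suggested in the tiers above.
def pick_model(vram_gb: float) -> str:
    if vram_gb >= 24:
        return "qwen2.5-coder:32b"
    if vram_gb >= 12:
        return "qwen3:14b"
    return "qwen3:8b"  # below 8 GB this still runs, but slower (CPU offload)

print(pick_model(16))  # qwen3:14b
```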

Step 3: Install Open WebUI (5 minutes)

With Docker (recommended):

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main

Without Docker:

pip install open-webui && open-webui serve (requires Python 3.11)

Open http://localhost:3000 (the pip install serves on http://localhost:8080 by default) — you now have ChatGPT running locally.
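Before opening the UI, you can confirm Ollama's local API is answering. A stdlib-only sketch using Ollama's `/api/tags` endpoint, which lists the models you pulled in Step 2 (the base URL is Ollama's default port):

```python
# Health check: list locally downloaded models via Ollama's HTTP API.
import json
import urllib.error
import urllib.request

def local_models(base: str = "http://localhost:11434") -> list[str]:
    """Names of models Ollama has downloaded, or [] if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base}/api/tags", timeout=3) as resp:
            return [m["name"] for m in json.load(resp).get("models", [])]
    except (OSError, json.JSONDecodeError):
        return []
```

An empty list with Ollama running usually just means no model has been pulled yet.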

Step 4: What You Can Do Now

  • Chat privately — medical questions, legal advice, personal conversations. Nothing leaves your machine (Open WebUI stores chat history locally).
  • Code assistance — paste your codebase, ask for reviews, generate tests. No code leaves your machine.
  • Document analysis — upload PDFs and ask questions about them (RAG built into Open WebUI).
  • API access — any tool that works with the OpenAI API works with Ollama. Just point its base URL at http://localhost:11434/v1 (OPENAI_BASE_URL in newer SDKs, OPENAI_API_BASE in older ones)
  • Multiple models — switch between chat, coding, and reasoning models in the UI. Download as many as your storage allows.
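As a sketch of the API point above: Ollama serves an OpenAI-compatible endpoint under /v1, so a plain HTTP POST works without any SDK. The model tag assumes you pulled qwen3:8b in Step 2; no API key is needed for local requests, though some clients insist on a non-empty placeholder:

```python
# Call Ollama's OpenAI-compatible chat endpoint with only the stdlib.
import json
import urllib.request

BASE = "http://localhost:11434/v1"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at local Ollama."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(model: str, prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(chat_request(model, prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# chat("qwen3:8b", "Say hi in five words.")  # needs Ollama running locally
```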

Total cost: $0. Total setup time: 10-15 minutes. Total data sent to the cloud: zero bytes.

