Trying to call import ollama in your Python script and getting ModuleNotFoundError: No module named ‘ollama’? The fix is one command, but make sure you install the right package and have the Ollama runtime running too. This 2026 guide covers both pieces in 3 minutes.

Two parts needed: (1) install the Python client with pip install ollama, (2) run the Ollama server locally so the client has something to connect to. The Python package alone does not include the model runtime.
Step 1: Install the Python client
pip install ollama
# Or with poetry
poetry add ollama
# Or with uv (faster)
uv pip install ollama
Step 2: Install + start the Ollama runtime
The Python client talks to a local Ollama server (default port 11434). Install Ollama if not yet on your machine:
- macOS / Linux:
curl -fsSL https://ollama.com/install.sh | sh - Windows: download installer from
ollama.com/download - Start the server:
ollama serve(or it auto-starts after install) - Pull a model:
ollama pull llama3.1
Step 3: Verify the install with a quick test
import ollama
response = ollama.chat(model='llama3.1', messages=[
{'role': 'user', 'content': 'Why is the sky blue?'}
])
print(response['message']['content'])
Why this error happens
| Cause | Fix |
|---|---|
| Package not installed | pip install ollama |
| Wrong virtualenv active | Activate correct venv then reinstall |
| Installed with pip but using pip3 (or vice versa) | Use python -m pip install ollama |
| Jupyter notebook using different kernel | !pip install ollama inside the notebook |
Bonus: async usage
from ollama import AsyncClient
import asyncio
async def chat():
client = AsyncClient()
response = await client.chat(
model='llama3.1',
messages=[{'role': 'user', 'content': 'Hello!'}]
)
print(response['message']['content'])
asyncio.run(chat())
Frequently Asked Questions
Is ollama free to install and use?
Yes. Both the Ollama runtime and the Python client are open-source (MIT license). You can run models like Llama 3, Mistral, Phi, Gemma locally with no API costs. Hardware permitting (8GB+ RAM for small models, 16GB+ for 7B), everything runs offline.
Why does ollama.chat() fail even after pip install ollama?
The Python client connects to a local server. If the Ollama runtime isn’t running (or isn’t installed), you’ll get connection errors. Run “ollama serve” in a terminal or check the menu bar (macOS) / system tray (Windows) for the Ollama process.
Can I use ollama with LangChain?
Yes. Install both: pip install ollama langchain-ollama. Then use ChatOllama from langchain_ollama. This works for RAG, agents, and chains using your local Llama or Mistral model instead of OpenAI.
What models does Ollama support in 2026?
Llama 3.1, Llama 3.2 (multimodal), Mistral Nemo, Mistral Small, Phi-3.5, Gemma 2, Qwen 2.5, DeepSeek-Coder, CodeLlama, plus dozens of community variants. Pull any with: ollama pull modelname.
Can Ollama run in Docker or a server?
Yes. Official image: docker run -d -p 11434:11434 ollama/ollama. Then set OLLAMA_HOST=http://your-server:11434 in your Python script. Great for shared dev environments or capstone deployment.
