# Changing the AI Model

## Overview

The system uses Ollama for AI-powered features (summarization, clustering, neutral summaries). You can easily change the model by updating the `.env` file.
## Current Configuration

**Default Model:** `phi3:latest`

The model is configured in `backend/.env`:

```env
OLLAMA_MODEL=phi3:latest
```
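To confirm which model is currently configured, a quick check from the project root:

```bash
# Print the configured model
grep '^OLLAMA_MODEL' backend/.env
```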
## ✅ How to Change the Model

### Important Note

**✅ The model IS automatically checked and downloaded on startup.**

The `ollama-setup` service runs on every `docker-compose up` and:

- Checks if the model specified in `.env` exists
- Downloads it if missing
- Skips the download if it is already present
This means you can simply:

- Change `OLLAMA_MODEL` in `.env`
- Run `docker-compose up -d`
- Wait for the download (if needed)
- Done!
### Step 1: Update `.env` File

Edit `backend/.env` and change the `OLLAMA_MODEL` value:

```env
# Example: Change to a different model
OLLAMA_MODEL=llama3:latest

# Or use a specific version
OLLAMA_MODEL=mistral:7b

# Or use a custom model
OLLAMA_MODEL=your-custom-model:latest
```
### Step 2: Restart Services (Model Auto-Downloads)

**Option A: Simple restart (Recommended)**

```bash
# Restart all services
docker-compose up -d

# Watch the model check/download
docker-compose logs -f ollama-setup
```
The `ollama-setup` service will:

- Check if the new model exists
- Download it if missing (2-10 minutes)
- Skip the download if it is already present
**Option B: Manual pull (if you want control)**

```bash
# Pull the model manually first
./pull-ollama-model.sh

# Then restart
docker-compose restart crawler backend
```
**Option C: Full restart**

```bash
docker-compose down
docker-compose up -d
```

**Note:** Model download takes 2-10 minutes depending on model size.
## Supported Models

### Recommended Models

| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| `phi3:latest` | 2.3GB | ⚡⚡⚡ | ⭐⭐⭐ | Default - fast, good quality |
| `llama3:8b` | 4.7GB | ⚡⚡ | ⭐⭐⭐⭐ | Better quality, slower |
| `mistral:7b` | 4.1GB | ⚡⚡ | ⭐⭐⭐⭐ | Balanced performance |
| `gemma:7b` | 5.0GB | ⚡⚡ | ⭐⭐⭐⭐ | Google's model |
### Lightweight Models (Faster)

| Model | Size | Speed | Quality |
|---|---|---|---|
| `phi3:mini` | 2.3GB | ⚡⚡⚡ | ⭐⭐⭐ |
| `tinyllama:latest` | 637MB | ⚡⚡⚡⚡ | ⭐⭐ |
| `qwen:0.5b` | 397MB | ⚡⚡⚡⚡ | ⭐⭐ |
### High-Quality Models (Slower)

| Model | Size | Speed | Quality |
|---|---|---|---|
| `llama3:70b` | 40GB | ⚡ | ⭐⭐⭐⭐⭐ |
| `mixtral:8x7b` | 26GB | ⚡ | ⭐⭐⭐⭐⭐ |
Full list: https://ollama.ai/library
## Manual Model Management

### Pull Model Manually

```bash
# Pull a specific model
docker-compose exec ollama ollama pull llama3:latest

# Pull multiple models
docker-compose exec ollama ollama pull mistral:7b
docker-compose exec ollama ollama pull phi3:latest
```
### List Available Models

```bash
docker-compose exec ollama ollama list
```
### Remove Unused Models

```bash
# Remove a specific model (its disk space is freed immediately)
docker-compose exec ollama ollama rm phi3:latest

# Free up space by removing any other models you no longer use
docker-compose exec ollama ollama rm tinyllama:latest
```
## Testing the New Model

### Test via API

```bash
curl http://localhost:5001/api/ollama/test
```
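You can also query Ollama directly and bypass the backend. This assumes the Ollama container publishes its default port 11434 to the host; the prompt is only an illustration:

```bash
# Direct request to Ollama's generate endpoint
curl -s http://localhost:11434/api/generate \
  -d '{"model": "phi3:latest", "prompt": "Say hello in one sentence.", "stream": false}' \
  | python3 -m json.tool
```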
### Test Summarization

```bash
docker-compose exec crawler python << 'EOF'
from ollama_client import OllamaClient
from config import Config

client = OllamaClient(
    base_url=Config.OLLAMA_BASE_URL,
    model=Config.OLLAMA_MODEL,
    enabled=True
)

result = client.summarize_article(
    "This is a test article about Munich news. The city council made important decisions today.",
    max_words=50
)

print(f"Model: {Config.OLLAMA_MODEL}")
print(f"Success: {result['success']}")
print(f"Summary: {result['summary']}")
print(f"Duration: {result['duration']:.2f}s")
EOF
```
### Test Clustering

```bash
docker-compose exec crawler python tests/crawler/test_clustering_real.py
```
## Performance Comparison

### Summarization Speed (per article)
| Model | CPU | GPU (NVIDIA) |
|---|---|---|
| phi3:latest | ~15s | ~3s |
| llama3:8b | ~25s | ~5s |
| mistral:7b | ~20s | ~4s |
| llama3:70b | ~120s | ~15s |
### Memory Requirements
| Model | RAM | VRAM (GPU) |
|---|---|---|
| phi3:latest | 4GB | 2GB |
| llama3:8b | 8GB | 4GB |
| mistral:7b | 8GB | 4GB |
| llama3:70b | 48GB | 40GB |
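Before switching to a larger model, check what the host actually has available (the `nvidia-smi` call applies only if an NVIDIA GPU is present):

```bash
# Available system memory
free -h

# GPU memory, if an NVIDIA GPU is present
nvidia-smi --query-gpu=memory.total,memory.used --format=csv
```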
## Troubleshooting

### Model Not Found

```bash
# Check if model exists
docker-compose exec ollama ollama list

# Pull the model manually
docker-compose exec ollama ollama pull your-model:latest
```
### Out of Memory

If you get OOM errors:

- Use a smaller model (e.g., `phi3:mini`)
- Enable GPU acceleration (see GPU_SETUP.md)
- Increase the Docker memory limit
### Slow Performance

- **Use GPU acceleration** - 5-10x faster
- **Use a smaller model** - `phi3:latest` is fastest
- **Increase the timeout** in `.env`: `OLLAMA_TIMEOUT=300` (see the snippet below)
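One way to apply the timeout change from the shell, assuming `backend/.env` already contains an `OLLAMA_TIMEOUT` line (on macOS, use `sed -i ''`):

```bash
# Raise the Ollama request timeout and restart the services that use it
sed -i 's/^OLLAMA_TIMEOUT=.*/OLLAMA_TIMEOUT=300/' backend/.env
docker-compose restart crawler backend
```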
### Model Download Fails

```bash
# Check Ollama logs
docker-compose logs ollama

# Restart Ollama
docker-compose restart ollama

# Try manual pull
docker-compose exec ollama ollama pull phi3:latest
```
## Custom Models

### Using Your Own Model

- Create/fine-tune your model using Ollama (a minimal Modelfile sketch follows below)
- Import it: `docker-compose exec ollama ollama create my-model -f Modelfile`
- Update `.env`: `OLLAMA_MODEL=my-model:latest`
- Restart services
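As a minimal sketch, a Modelfile can wrap an existing base model with a system prompt and sampling parameters. The name `my-model` and the prompt text are placeholders; `FROM`, `SYSTEM`, and `PARAMETER` are standard Ollama Modelfile directives:

```bash
# Write a minimal Modelfile inside the ollama container, then import it
docker-compose exec ollama sh -c 'cat > /tmp/Modelfile << "EOF"
FROM phi3:latest
SYSTEM "You summarize local news articles in a neutral, factual tone."
PARAMETER temperature 0.3
EOF
ollama create my-model -f /tmp/Modelfile'
```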
### Model Requirements
Your custom model should support:
- Text generation
- Prompt-based instructions
- Reasonable response times (<60s per request)
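To sanity-check the response-time requirement, time a single request against Ollama's generate endpoint (assumes port 11434 is published to the host; `my-model:latest` is the placeholder name from above):

```bash
# Rough latency check for a single generation request
time curl -s http://localhost:11434/api/generate \
  -d '{"model": "my-model:latest", "prompt": "One-sentence summary test.", "stream": false}' \
  > /dev/null
```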
## Best Practices

### For Production
- Test thoroughly before switching models
- Monitor performance after switching
- Keep the old model downloaded until the new one proves stable
- Document your model choice in your deployment notes
### For Development

- Use `phi3:latest` for fast iteration
- Test with `llama3:8b` for quality checks
- Profile performance with different models (see the sketch below)
- Compare results between models
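A rough way to profile candidate models, assuming each has already been pulled (see Manual Model Management); the `sleep` is a crude wait for the restarted services to come back up:

```bash
# Time the backend test endpoint once per candidate model
for model in phi3:latest llama3:8b mistral:7b; do
  sed -i "s|^OLLAMA_MODEL=.*|OLLAMA_MODEL=${model}|" backend/.env
  docker-compose restart crawler backend
  sleep 15  # crude wait for the services to finish starting
  echo "== ${model} =="
  time curl -s http://localhost:5001/api/ollama/test > /dev/null
done
```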
## FAQ

**Q: Can I use multiple models?**
A: Yes! Pull multiple models and switch by updating `.env` and restarting.

**Q: Do I need to re-crawl articles?**
A: No. Existing summaries remain. New articles use the new model.

**Q: Can I use OpenAI/Anthropic models?**
A: Not directly. Ollama only supports local models. For cloud APIs, you'd need to modify the `OllamaClient` class.

**Q: Which model is best?**
A: For most users: `phi3:latest` (fast, good quality). For better quality: `llama3:8b`. For production with GPU: `mistral:7b`.

**Q: How much disk space do I need?**
A: 5-10GB for small models, 50GB+ for large models. Plan accordingly.
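To see how much space the downloaded models actually take, check the model directory inside the container (the path below is the default in the official ollama image):

```bash
# Disk usage of downloaded models
docker-compose exec ollama du -sh /root/.ollama/models
```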
## Related Documentation
- OLLAMA_SETUP.md - Ollama installation & configuration
- GPU_SETUP.md - GPU acceleration setup
- AI_NEWS_AGGREGATION.md - AI features overview
## Complete Example: Changing from phi3 to llama3

```bash
# 1. Check current model
curl -s http://localhost:5001/api/ollama/models | python3 -m json.tool
# Shows: "current_model": "phi3:latest"

# 2. Update .env file
# Edit backend/.env and change:
# OLLAMA_MODEL=llama3:8b

# 3. Pull the new model
./pull-ollama-model.sh
# Or manually: docker-compose exec ollama ollama pull llama3:8b

# 4. Restart services
docker-compose restart crawler backend

# 5. Verify the change
curl -s http://localhost:5001/api/ollama/models | python3 -m json.tool
# Shows: "current_model": "llama3:8b"

# 6. Test performance
curl -s http://localhost:5001/api/ollama/test | python3 -m json.tool
# Should show improved quality with llama3
```
## Quick Reference

### Change Model Workflow

```bash
# 1. Edit .env
vim backend/.env  # Change OLLAMA_MODEL

# 2. Pull model
./pull-ollama-model.sh

# 3. Restart
docker-compose restart crawler backend

# 4. Verify
curl http://localhost:5001/api/ollama/test
```
### Common Commands

```bash
# List downloaded models
docker-compose exec ollama ollama list

# Pull a specific model
docker-compose exec ollama ollama pull mistral:7b

# Remove a model
docker-compose exec ollama ollama rm phi3:latest

# Check current config
curl http://localhost:5001/api/ollama/config

# Test performance
curl http://localhost:5001/api/ollama/test
```