
Changing the AI Model

Overview

The system uses Ollama for AI-powered features (summarization, clustering, neutral summaries). You can easily change the model by updating the .env file.

Current Configuration

Default Model: phi3:latest

The model is configured in backend/.env:

OLLAMA_MODEL=phi3:latest
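
The other Ollama settings referenced in this guide live in the same file. The values below are illustrative; confirm them against your own backend/.env (the base URL in particular assumes the docker-compose service name ollama and Ollama's default port):

# Illustrative values - check your actual backend/.env
OLLAMA_MODEL=phi3:latest
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_TIMEOUT=300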

How to Change the Model

Important Note

The model IS automatically checked and downloaded on startup

The ollama-setup service runs on every docker-compose up and:

  • Checks if the model specified in .env exists
  • Downloads it if missing
  • Skips the download if the model is already present (sketched below)
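
Conceptually, the setup service runs a check-then-pull step like the one below. This is a simplified illustration of the behavior described above, not the repo's actual script; it assumes the model name is available inside the container as $OLLAMA_MODEL:

# Simplified sketch of the ollama-setup logic (the real script may differ)
if ollama list | grep -q "$OLLAMA_MODEL"; then
    echo "Model $OLLAMA_MODEL already present - skipping download"
else
    echo "Pulling $OLLAMA_MODEL ..."
    ollama pull "$OLLAMA_MODEL"
fi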

This means you can simply:

  1. Change OLLAMA_MODEL in .env
  2. Run docker-compose up -d
  3. Wait for download (if needed)
  4. Done! (Verify with the config check below.)
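
To confirm the backend picked up the new model, query the config endpoint (also listed under Common Commands at the end of this guide):

# Check which model the backend is currently configured to use
curl http://localhost:5001/api/ollama/config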

Step 1: Update .env File

Edit backend/.env and change the OLLAMA_MODEL value:

# Example: Change to a different model
OLLAMA_MODEL=llama3:latest

# Or use a specific version
OLLAMA_MODEL=mistral:7b

# Or use a custom model
OLLAMA_MODEL=your-custom-model:latest

Step 2: Restart Services (Model Auto-Downloads)

Option A: Simple restart (Recommended)

# Restart all services
docker-compose up -d

# Watch the model check/download
docker-compose logs -f ollama-setup

The ollama-setup service will:

  • Check if the new model exists
  • Download it if missing (2-10 minutes)
  • Skip download if already present

Option B: Manual pull (if you want control)

# Pull the model manually first
./pull-ollama-model.sh

# Then restart
docker-compose restart crawler backend

Option C: Full restart

docker-compose down
docker-compose up -d

Note: Model download takes 2-10 minutes depending on model size.

Supported Models

Model         Size    Notes
phi3:latest   2.3GB   Default - fast, good quality
llama3:8b     4.7GB   Better quality, slower
mistral:7b    4.1GB   Balanced performance
gemma:7b      5.0GB   Google's model

Lightweight Models (Faster)

Model              Size
phi3:mini          2.3GB
tinyllama:latest   637MB
qwen:0.5b          397MB

High-Quality Models (Slower)

Model          Size
llama3:70b     40GB
mixtral:8x7b   26GB

Full list: https://ollama.ai/library

Manual Model Management

Pull Model Manually

# Pull a specific model
docker-compose exec ollama ollama pull llama3:latest

# Pull multiple models
docker-compose exec ollama ollama pull mistral:7b
docker-compose exec ollama ollama pull phi3:latest

List Available Models

docker-compose exec ollama ollama list

Remove Unused Models

# Remove a specific model to free up disk space
docker-compose exec ollama ollama rm phi3:latest

Note: Ollama has no prune command; free space by removing models you no longer use with ollama rm.
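
To see how much disk space the downloaded models occupy, check the model store inside the container. The path below is the default in the official Ollama image; adjust it if your setup overrides OLLAMA_MODELS:

# Disk usage of the Ollama model store (default path in the official image)
docker-compose exec ollama du -sh /root/.ollama/models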

Testing the New Model

Test via API

curl http://localhost:5001/api/ollama/test

Test Summarization

# -T disables TTY allocation so the heredoc can be piped into the container
docker-compose exec -T crawler python << 'EOF'
from ollama_client import OllamaClient
from config import Config

client = OllamaClient(
    base_url=Config.OLLAMA_BASE_URL,
    model=Config.OLLAMA_MODEL,
    enabled=True
)

result = client.summarize_article(
    "This is a test article about Munich news. The city council made important decisions today.",
    max_words=50
)

print(f"Model: {Config.OLLAMA_MODEL}")
print(f"Success: {result['success']}")
print(f"Summary: {result['summary']}")
print(f"Duration: {result['duration']:.2f}s")
EOF

Test Clustering

docker-compose exec crawler python tests/crawler/test_clustering_real.py

Performance Comparison

Summarization Speed (per article)

Model         CPU     GPU (NVIDIA)
phi3:latest   ~15s    ~3s
llama3:8b     ~25s    ~5s
mistral:7b    ~20s    ~4s
llama3:70b    ~120s   ~15s

Memory Requirements

Model         RAM    VRAM (GPU)
phi3:latest   4GB    2GB
llama3:8b     8GB    4GB
mistral:7b    8GB    4GB
llama3:70b    48GB   40GB
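
Before switching to a larger model, check how much memory the Docker daemon actually has available (the label may vary slightly between Docker versions):

# Memory available to the Docker daemon
docker info | grep -i "total memory"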

Troubleshooting

Model Not Found

# Check if model exists
docker-compose exec ollama ollama list

# Pull the model manually
docker-compose exec ollama ollama pull your-model:latest

Out of Memory

If you get OOM errors:

  1. Use a smaller model (e.g., phi3:mini)
  2. Enable GPU acceleration (see GPU_SETUP.md)
  3. Increase Docker memory limit (see the compose sketch below)
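
For step 3, one way to raise the limit is directly in docker-compose.yml. The snippet below is a sketch assuming a Compose version that honors mem_limit and a service named ollama; merge it into your file's actual layout:

# docker-compose.yml (sketch - merge into your existing ollama service)
services:
  ollama:
    mem_limit: 12g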

Slow Performance

  1. Use GPU acceleration - 5-10x faster
  2. Use smaller model - phi3:latest is fastest
  3. Increase timeout in .env:
    OLLAMA_TIMEOUT=300
    

Model Download Fails

# Check Ollama logs
docker-compose logs ollama

# Restart Ollama
docker-compose restart ollama

# Try manual pull
docker-compose exec ollama ollama pull phi3:latest

Custom Models

Using Your Own Model

  1. Create/fine-tune your model using Ollama
  2. Import it (a minimal example Modelfile follows this list):
    docker-compose exec ollama ollama create my-model -f Modelfile
    
  3. Update .env:
    OLLAMA_MODEL=my-model:latest
    
  4. Restart services: docker-compose restart crawler backend
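
For step 2, a Modelfile can be as small as a base model plus a system prompt. Everything in this sketch is a placeholder, not a value from this repo:

# Modelfile (minimal illustrative example)
FROM phi3:latest
PARAMETER temperature 0.3
SYSTEM "You write short, neutral summaries of local news articles."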

Model Requirements

Your custom model should support:

  • Text generation
  • Prompt-based instructions
  • Reasonable response times (<60s per request) - see the timing check below
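
A quick way to sanity-check the last point is to time a one-off prompt against your model (replace my-model:latest with your model's actual name):

# Rough response-time check for a single prompt
time docker-compose exec ollama ollama run my-model:latest "Summarize: the city council met today."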

Best Practices

For Production

  1. Test thoroughly before switching models
  2. Monitor performance after switching
  3. Keep backup of old model until stable
  4. Document model choice in your deployment notes

For Development

  1. Use phi3:latest for fast iteration
  2. Test with llama3:8b for quality checks
  3. Profile performance with different models
  4. Compare results between models

FAQ

Q: Can I use multiple models? A: Yes! Pull multiple models and switch by updating .env and restarting.

Q: Do I need to re-crawl articles? A: No. Existing summaries remain. New articles use the new model.

Q: Can I use OpenAI/Anthropic models? A: Not directly. Ollama only runs local models. For cloud APIs, you'd need to modify the OllamaClient class (a rough sketch of such a call follows).
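
For orientation only: the sketch below shows what a call to OpenAI's Chat Completions endpoint looks like, which is roughly what a modified client would need to target. The model name is a placeholder and nothing here is part of this project:

# Hypothetical cloud API call (not part of this project)
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Summarize in 50 words: ..."}]}'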

Q: Which model is best? A: For most users: phi3:latest (fast, good quality). For better quality: llama3:8b. For production with GPU: mistral:7b.

Q: How much disk space do I need? A: 5-10GB for small models, 50GB+ for large models. Plan accordingly.

Complete Example: Changing from phi3 to llama3

# 1. Check current model
curl -s http://localhost:5001/api/ollama/models | python3 -m json.tool
# Shows: "current_model": "phi3:latest"

# 2. Update .env file
# Edit backend/.env and change:
# OLLAMA_MODEL=llama3:8b

# 3. Pull the new model
./pull-ollama-model.sh
# Or manually: docker-compose exec ollama ollama pull llama3:8b

# 4. Restart services
docker-compose restart crawler backend

# 5. Verify the change
curl -s http://localhost:5001/api/ollama/models | python3 -m json.tool
# Shows: "current_model": "llama3:8b"

# 6. Test performance
curl -s http://localhost:5001/api/ollama/test | python3 -m json.tool
# Should show improved quality with llama3

Quick Reference

Change Model Workflow

# 1. Edit .env
vim backend/.env  # Change OLLAMA_MODEL

# 2. Pull model
./pull-ollama-model.sh

# 3. Restart
docker-compose restart crawler backend

# 4. Verify
curl http://localhost:5001/api/ollama/test

Common Commands

# List downloaded models
docker-compose exec ollama ollama list

# Pull a specific model
docker-compose exec ollama ollama pull mistral:7b

# Remove a model
docker-compose exec ollama ollama rm phi3:latest

# Check current config
curl http://localhost:5001/api/ollama/config

# Test performance
curl http://localhost:5001/api/ollama/test