# Changing the AI Model

## Overview

The system uses Ollama for AI-powered features (summarization, clustering, neutral summaries). You can change the model by updating the `.env` file.
## Current Configuration

Default model: `phi3:latest`

The model is configured in `backend/.env`:

```bash
OLLAMA_MODEL=phi3:latest
```
## ✅ How to Change the Model

### Step 1: Update the .env File

Edit `backend/.env` and change the `OLLAMA_MODEL` value:

```bash
# Example: change to a different model
OLLAMA_MODEL=llama3:latest

# Or use a specific version
OLLAMA_MODEL=mistral:7b

# Or use a custom model
OLLAMA_MODEL=your-custom-model:latest
```
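For reference, the backend reads these values at startup. The actual `Config` class is not reproduced in this document; the snippet below is only a minimal sketch of how it might expose the settings, with attribute names matching the test snippet further down and assumed defaults.

```python
# Minimal sketch of how backend/config.py might expose the Ollama settings.
# The real Config class in this project may differ; defaults below are
# assumptions, and the attribute names match the test snippet in this doc.
import os


class Config:
    # Base URL of the Ollama service (assumed Docker-internal default)
    OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://ollama:11434")
    # Model name read from backend/.env
    OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "phi3:latest")
    # Request timeout in seconds (see Troubleshooting below)
    OLLAMA_TIMEOUT = int(os.environ.get("OLLAMA_TIMEOUT", "120"))
```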
### Step 2: Restart the Services

The new model is downloaded automatically on startup:

```bash
# Stop services
docker-compose down

# Start services (the model will be pulled automatically)
docker-compose up -d

# Watch the download progress
docker-compose logs -f ollama-setup
```

Note: The first startup with a new model takes 2-10 minutes, depending on model size.
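Once the services are up, you can optionally confirm the new model was pulled by querying Ollama's `/api/tags` endpoint. The sketch below assumes the Ollama port (11434) is exposed on the host; adjust the URL to your setup.

```python
# Sketch: confirm the new model is installed by listing models via Ollama's
# /api/tags endpoint. Assumes port 11434 is exposed on the host; inside the
# containers the base URL would be Config.OLLAMA_BASE_URL instead.
import requests

OLLAMA_URL = "http://localhost:11434"  # adjust to your setup

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10)
resp.raise_for_status()
installed = [m["name"] for m in resp.json().get("models", [])]

print("Installed models:", installed)
print("llama3:latest ready:", "llama3:latest" in installed)
```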
## Supported Models

### Recommended Models

| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| `phi3:latest` | 2.3GB | ⚡⚡⚡ | ⭐⭐⭐ | Default - fast, good quality |
| `llama3:8b` | 4.7GB | ⚡⚡ | ⭐⭐⭐⭐ | Better quality, slower |
| `mistral:7b` | 4.1GB | ⚡⚡ | ⭐⭐⭐⭐ | Balanced performance |
| `gemma:7b` | 5.0GB | ⚡⚡ | ⭐⭐⭐⭐ | Google's model |
### Lightweight Models (Faster)

| Model | Size | Speed | Quality |
|---|---|---|---|
| `phi3:mini` | 2.3GB | ⚡⚡⚡ | ⭐⭐⭐ |
| `tinyllama:latest` | 637MB | ⚡⚡⚡⚡ | ⭐⭐ |
| `qwen:0.5b` | 397MB | ⚡⚡⚡⚡ | ⭐⭐ |
### High-Quality Models (Slower)

| Model | Size | Speed | Quality |
|---|---|---|---|
| `llama3:70b` | 40GB | ⚡ | ⭐⭐⭐⭐⭐ |
| `mixtral:8x7b` | 26GB | ⚡ | ⭐⭐⭐⭐⭐ |

Full list: https://ollama.ai/library
## Manual Model Management

### Pull a Model Manually

```bash
# Pull a specific model
docker-compose exec ollama ollama pull llama3:latest

# Pull multiple models
docker-compose exec ollama ollama pull mistral:7b
docker-compose exec ollama ollama pull phi3:latest
```

### List Available Models

```bash
docker-compose exec ollama ollama list
```
### Remove Unused Models

```bash
# Remove a specific model to free disk space
docker-compose exec ollama ollama rm phi3:latest
```
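If you prefer scripting over the CLI, models can also be pulled through Ollama's HTTP API (`POST /api/pull`). A minimal sketch, assuming Ollama is reachable on `localhost:11434`:

```python
# Sketch: pull a model programmatically via Ollama's /api/pull endpoint.
# The endpoint streams newline-delimited JSON status updates; the URL is
# an assumption about your setup.
import json
import requests

OLLAMA_URL = "http://localhost:11434"  # adjust to your setup

with requests.post(
    f"{OLLAMA_URL}/api/pull",
    json={"name": "mistral:7b"},
    stream=True,
    timeout=600,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            print(json.loads(line).get("status", ""))
```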
## Testing the New Model

### Test via API

```bash
curl http://localhost:5001/api/ollama/test
```

### Test Summarization

```bash
docker-compose exec crawler python << 'EOF'
from ollama_client import OllamaClient
from config import Config

client = OllamaClient(
    base_url=Config.OLLAMA_BASE_URL,
    model=Config.OLLAMA_MODEL,
    enabled=True
)

result = client.summarize_article(
    "This is a test article about Munich news. The city council made important decisions today.",
    max_words=50
)

print(f"Model: {Config.OLLAMA_MODEL}")
print(f"Success: {result['success']}")
print(f"Summary: {result['summary']}")
print(f"Duration: {result['duration']:.2f}s")
EOF
```
### Test Clustering

```bash
docker-compose exec crawler python tests/crawler/test_clustering_real.py
```
## Performance Comparison

### Summarization Speed (per article)

| Model | CPU | GPU (NVIDIA) |
|---|---|---|
| `phi3:latest` | ~15s | ~3s |
| `llama3:8b` | ~25s | ~5s |
| `mistral:7b` | ~20s | ~4s |
| `llama3:70b` | ~120s | ~15s |
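These numbers are rough guidelines; actual speed depends on your hardware. To measure your own environment, a small benchmark along these lines can be run inside the crawler container. It reuses the `OllamaClient` interface from the test snippet above; the sample text and run count are arbitrary choices.

```python
# Rough benchmark sketch: time summarization with the currently configured
# model. Reuses the OllamaClient interface shown in the test snippet above.
import time

from config import Config
from ollama_client import OllamaClient

client = OllamaClient(
    base_url=Config.OLLAMA_BASE_URL,
    model=Config.OLLAMA_MODEL,
    enabled=True,
)

SAMPLE = "The Munich city council approved a new cycling infrastructure plan today."
RUNS = 3

durations = []
for _ in range(RUNS):
    start = time.time()
    client.summarize_article(SAMPLE, max_words=50)
    durations.append(time.time() - start)

print(f"{Config.OLLAMA_MODEL}: avg {sum(durations) / RUNS:.1f}s over {RUNS} runs")
```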
### Memory Requirements

| Model | RAM (CPU) | VRAM (GPU) |
|---|---|---|
| `phi3:latest` | 4GB | 2GB |
| `llama3:8b` | 8GB | 4GB |
| `mistral:7b` | 8GB | 4GB |
| `llama3:70b` | 48GB | 40GB |
## Troubleshooting

### Model Not Found

```bash
# Check if the model exists
docker-compose exec ollama ollama list

# Pull the model manually
docker-compose exec ollama ollama pull your-model:latest
```

### Out of Memory

If you get OOM errors:

- Use a smaller model (e.g., `phi3:mini`)
- Enable GPU acceleration (see GPU_SETUP.md)
- Increase the Docker memory limit
### Slow Performance

- Use GPU acceleration - 5-10x faster
- Use a smaller model - `phi3:latest` is the fastest
- Increase the timeout in `.env`: `OLLAMA_TIMEOUT=300`
### Model Download Fails

```bash
# Check Ollama logs
docker-compose logs ollama

# Restart Ollama
docker-compose restart ollama

# Try a manual pull
docker-compose exec ollama ollama pull phi3:latest
```
## Custom Models

### Using Your Own Model

1. Create or fine-tune your model using Ollama
2. Import it: `docker-compose exec ollama ollama create my-model -f Modelfile`
3. Update `.env`: `OLLAMA_MODEL=my-model:latest`
4. Restart the services
### Model Requirements

Your custom model should support:

- Text generation
- Prompt-based instructions
- Reasonable response times (<60s per request)
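A quick way to sanity-check a custom model against these requirements is to time a single prompt through Ollama's `/api/generate` endpoint. The URL and model name below are placeholders for your setup.

```python
# Sketch: sanity-check a custom model by sending one prompt through Ollama's
# /api/generate endpoint and timing the response. URL and model name are
# placeholders, not project defaults.
import time
import requests

OLLAMA_URL = "http://localhost:11434"  # adjust to your setup
MODEL = "my-model:latest"

start = time.time()
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={
        "model": MODEL,
        "prompt": "Summarize in one sentence: The city council met today.",
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
elapsed = time.time() - start

print(f"Response: {resp.json()['response'][:200]}")
print(f"Took {elapsed:.1f}s (requirement: < 60s)")
```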
## Best Practices

### For Production

- Test thoroughly before switching models
- Monitor performance after switching
- Keep the old model pulled until the new one proves stable
- Document the model choice in your deployment notes

### For Development

- Use `phi3:latest` for fast iteration
- Test with `llama3:8b` for quality checks
- Profile performance with different models
- Compare results between models
## FAQ

**Q: Can I use multiple models?**
A: Yes. Pull multiple models and switch between them by updating `.env` and restarting.

**Q: Do I need to re-crawl articles?**
A: No. Existing summaries remain; new articles are processed with the new model.

**Q: Can I use OpenAI/Anthropic models?**
A: Not directly. Ollama only runs local models. For cloud APIs, you would need to modify the `OllamaClient` class.
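If you do want to experiment with a cloud API, a very rough sketch of what such a replacement could look like is shown below. It is not part of this project: the class name, prompt, and return format are assumptions chosen to mimic the `summarize_article` interface used elsewhere in this document; only the OpenAI REST endpoint and payload shape are standard.

```python
# Very rough sketch of a cloud-backed client mimicking the summarize_article
# interface used in this document. NOT part of the project: class name,
# prompt, and return dict are assumptions.
import os
import time
import requests


class CloudSummaryClient:
    def __init__(self, api_key: str, model: str = "gpt-4o-mini"):
        self.api_key = api_key
        self.model = model

    def summarize_article(self, text: str, max_words: int = 50) -> dict:
        start = time.time()
        resp = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={
                "model": self.model,
                "messages": [
                    {"role": "user",
                     "content": f"Summarize in at most {max_words} words:\n\n{text}"},
                ],
            },
            timeout=60,
        )
        resp.raise_for_status()
        summary = resp.json()["choices"][0]["message"]["content"]
        return {"success": True, "summary": summary, "duration": time.time() - start}


# Usage: client = CloudSummaryClient(api_key=os.environ["OPENAI_API_KEY"])
```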
**Q: Which model is best?**
A: For most users: `phi3:latest` (fast, good quality). For better quality: `llama3:8b`. For production with a GPU: `mistral:7b`.

**Q: How much disk space do I need?**
A: 5-10GB for small models, 50GB+ for large models. Plan accordingly.
## Related Documentation

- OLLAMA_SETUP.md - Ollama installation & configuration
- GPU_SETUP.md - GPU acceleration setup
- AI_NEWS_AGGREGATION.md - AI features overview