diff --git a/docker-compose.yml b/docker-compose.yml
index 501d493..d42fd45 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -50,15 +50,17 @@ services:
         condition: service_healthy
     networks:
       - munich-news-network
+    env_file:
+      - backend/.env
     entrypoint: /bin/sh
     command: >
       -c "
       echo 'Waiting for Ollama service to be ready...' &&
       sleep 5 &&
-      echo 'Pulling phi3:latest model via API...' &&
-      curl -X POST http://ollama:11434/api/pull -d '{\"name\":\"phi3:latest\"}' &&
+      echo 'Pulling model: ${OLLAMA_MODEL:-phi3:latest}' &&
+      curl -X POST http://ollama:11434/api/pull -d '{\"name\":\"${OLLAMA_MODEL:-phi3:latest}\"}' &&
       echo '' &&
-      echo 'Model phi3:latest pull initiated!'
+      echo 'Model ${OLLAMA_MODEL:-phi3:latest} pull initiated!'
       "
     restart: "no"
diff --git a/docs/CHANGING_AI_MODEL.md b/docs/CHANGING_AI_MODEL.md
new file mode 100644
index 0000000..187ee77
--- /dev/null
+++ b/docs/CHANGING_AI_MODEL.md
@@ -0,0 +1,266 @@
+# Changing the AI Model
+
+## Overview
+
+The system uses Ollama for AI-powered features (summarization, clustering, neutral summaries). You can easily change the model by updating the `.env` file.
+
+## Current Configuration
+
+**Default Model:** `phi3:latest`
+
+The model is configured in `backend/.env`:
+
+```env
+OLLAMA_MODEL=phi3:latest
+```
+
+## ✅ How to Change the Model
+
+### Step 1: Update .env File
+
+Edit `backend/.env` and change the `OLLAMA_MODEL` value:
+
+```env
+# Example: Change to a different model
+OLLAMA_MODEL=llama3:latest
+
+# Or use a specific version
+OLLAMA_MODEL=mistral:7b
+
+# Or use a custom model
+OLLAMA_MODEL=your-custom-model:latest
+```
+
+### Step 2: Restart Services
+
+The model will be downloaded automatically on startup:
+
+```bash
+# Stop services
+docker-compose down
+
+# Start services (the model will be pulled automatically)
+docker-compose up -d
+
+# Watch the download progress
+docker-compose logs -f ollama-setup
+```
+
+**Note:** First startup with a new model takes 2-10 minutes, depending on model size.
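+
+Once the `ollama-setup` logs show the pull has completed, you can confirm the model is available (the same `ollama list` command is covered under Manual Model Management below):
+
+```bash
+# Verify that the configured model now appears in Ollama's local model list
+docker-compose exec ollama ollama list
+```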
+
+## Supported Models
+
+### Recommended Models
+
+| Model | Size | Speed | Quality | Best For |
+|-------|------|-------|---------|----------|
+| `phi3:latest` | 2.3GB | ⚡⚡⚡ | ⭐⭐⭐ | **Default** - Fast, good quality |
+| `llama3:8b` | 4.7GB | ⚡⚡ | ⭐⭐⭐⭐ | Better quality, slower |
+| `mistral:7b` | 4.1GB | ⚡⚡ | ⭐⭐⭐⭐ | Balanced performance |
+| `gemma:7b` | 5.0GB | ⚡⚡ | ⭐⭐⭐⭐ | Google's model |
+
+### Lightweight Models (Faster)
+
+| Model | Size | Speed | Quality |
+|-------|------|-------|---------|
+| `phi3:mini` | 2.3GB | ⚡⚡⚡ | ⭐⭐⭐ |
+| `tinyllama:latest` | 637MB | ⚡⚡⚡⚡ | ⭐⭐ |
+| `qwen:0.5b` | 397MB | ⚡⚡⚡⚡ | ⭐⭐ |
+
+### High-Quality Models (Slower)
+
+| Model | Size | Speed | Quality |
+|-------|------|-------|---------|
+| `llama3:70b` | 40GB | ⚡ | ⭐⭐⭐⭐⭐ |
+| `mixtral:8x7b` | 26GB | ⚡ | ⭐⭐⭐⭐⭐ |
+
+**Full list:** https://ollama.ai/library
+
+## Manual Model Management
+
+### Pull Model Manually
+
+```bash
+# Pull a specific model
+docker-compose exec ollama ollama pull llama3:latest
+
+# Pull multiple models
+docker-compose exec ollama ollama pull mistral:7b
+docker-compose exec ollama ollama pull phi3:latest
+```
+
+### List Available Models
+
+```bash
+docker-compose exec ollama ollama list
+```
+
+### Remove Unused Models
+
+```bash
+# Remove a specific model (this also frees its disk space)
+docker-compose exec ollama ollama rm phi3:latest
+```
+
+## Testing the New Model
+
+### Test via API
+
+```bash
+curl http://localhost:5001/api/ollama/test
+```
+
+### Test Summarization
+
+```bash
+docker-compose exec -T crawler python << 'EOF'
+from ollama_client import OllamaClient
+from config import Config
+
+client = OllamaClient(
+    base_url=Config.OLLAMA_BASE_URL,
+    model=Config.OLLAMA_MODEL,
+    enabled=True
+)
+
+result = client.summarize_article(
+    "This is a test article about Munich news. The city council made important decisions today.",
+    max_words=50
+)
+
+print(f"Model: {Config.OLLAMA_MODEL}")
+print(f"Success: {result['success']}")
+print(f"Summary: {result['summary']}")
+print(f"Duration: {result['duration']:.2f}s")
+EOF
+```
+
+### Test Clustering
+
+```bash
+docker-compose exec crawler python tests/crawler/test_clustering_real.py
+```
+
+## Performance Comparison
+
+### Summarization Speed (per article)
+
+| Model | CPU | GPU (NVIDIA) |
+|-------|-----|--------------|
+| phi3:latest | ~15s | ~3s |
+| llama3:8b | ~25s | ~5s |
+| mistral:7b | ~20s | ~4s |
+| llama3:70b | ~120s | ~15s |
+
+### Memory Requirements
+
+| Model | RAM | VRAM (GPU) |
+|-------|-----|------------|
+| phi3:latest | 4GB | 2GB |
+| llama3:8b | 8GB | 4GB |
+| mistral:7b | 8GB | 4GB |
+| llama3:70b | 48GB | 40GB |
+
+## Troubleshooting
+
+### Model Not Found
+
+```bash
+# Check if the model exists
+docker-compose exec ollama ollama list
+
+# Pull the model manually
+docker-compose exec ollama ollama pull your-model:latest
+```
+
+### Out of Memory
+
+If you get OOM errors:
+1. Use a smaller model (e.g., `phi3:mini`)
+2. Enable GPU acceleration (see [GPU_SETUP.md](GPU_SETUP.md))
+3. Increase the Docker memory limit
+
+### Slow Performance
+
+1. **Use GPU acceleration** - 5-10x faster
+2. **Use a smaller model** - `phi3:latest` is fastest
+3. **Increase the timeout** in `.env`:
+   ```env
+   OLLAMA_TIMEOUT=300
+   ```
+
+### Model Download Fails
+
+```bash
+# Check Ollama logs
+docker-compose logs ollama
+
+# Restart Ollama
+docker-compose restart ollama
+
+# Try a manual pull
+docker-compose exec ollama ollama pull phi3:latest
+```
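+
+If the logs do not explain a stalled download, you can watch the pull directly. The sketch below mirrors the `ollama-setup` call from `docker-compose.yml`; it assumes Ollama's port 11434 is published to the host (if it is only reachable on `munich-news-network`, run the same command from a container on that network):
+
+```bash
+# Stream pull progress from the Ollama API as JSON status lines
+curl -N -X POST http://localhost:11434/api/pull \
+  -d '{"name":"phi3:latest"}'
+```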
+
+## Custom Models
+
+### Using Your Own Model
+
+1. **Create/fine-tune your model** using Ollama
+2. **Import it:**
+   ```bash
+   docker-compose exec ollama ollama create my-model -f Modelfile
+   ```
+3. **Update .env:**
+   ```env
+   OLLAMA_MODEL=my-model:latest
+   ```
+4. **Restart services**
+
+### Model Requirements
+
+Your custom model should support:
+- Text generation
+- Prompt-based instructions
+- Reasonable response times (<60s per request)
+
+## Best Practices
+
+### For Production
+
+1. **Test thoroughly** before switching models
+2. **Monitor performance** after switching
+3. **Keep a backup** of the old model until the new one is stable
+4. **Document** the model choice in your deployment notes
+
+### For Development
+
+1. **Use phi3:latest** for fast iteration
+2. **Test with llama3:8b** for quality checks
+3. **Profile performance** with different models
+4. **Compare results** between models
+
+## FAQ
+
+**Q: Can I use multiple models?**
+A: Yes! Pull multiple models and switch by updating `.env` and restarting.
+
+**Q: Do I need to re-crawl articles?**
+A: No. Existing summaries remain. New articles use the new model.
+
+**Q: Can I use OpenAI/Anthropic models?**
+A: Not directly. Ollama only runs local models. For cloud APIs, you would need to modify the `OllamaClient` class.
+
+**Q: Which model is best?**
+A: For most users: `phi3:latest` (fast, good quality). For better quality: `llama3:8b`. For production with a GPU: `mistral:7b`.
+
+**Q: How much disk space do I need?**
+A: 5-10GB for small models, 50GB+ for large models. Plan accordingly.
+
+## Related Documentation
+
+- [OLLAMA_SETUP.md](OLLAMA_SETUP.md) - Ollama installation & configuration
+- [GPU_SETUP.md](GPU_SETUP.md) - GPU acceleration setup
+- [AI_NEWS_AGGREGATION.md](AI_NEWS_AGGREGATION.md) - AI features overview
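+
+## Quick Reference
+
+The whole switch, end to end. This is a sketch assuming the defaults used throughout this guide; the `sed` one-liner (GNU sed syntax) is just a shortcut for editing `backend/.env` by hand:
+
+```bash
+# 1. Point the stack at a new model (here: llama3:8b)
+sed -i 's/^OLLAMA_MODEL=.*/OLLAMA_MODEL=llama3:8b/' backend/.env
+
+# 2. Recreate the services; ollama-setup pulls the new model on startup
+docker-compose down
+docker-compose up -d
+
+# 3. Follow the download, then sanity-check via the test endpoint
+docker-compose logs -f ollama-setup
+curl http://localhost:5001/api/ollama/test
+```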