# Quick Start: Ollama with GPU
## 30-Second Setup

```bash
# 1. Check GPU
./check-gpu.sh

# 2. Start services
./start-with-gpu.sh

# 3. Test
docker-compose exec crawler python crawler_service.py 2
```
## Commands Cheat Sheet

### Setup

```bash
# Check GPU availability
./check-gpu.sh

# Configure Ollama
./configure-ollama.sh

# Start with GPU auto-detection
./start-with-gpu.sh

# Start with GPU (manual)
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

# Start without GPU
docker-compose up -d
```
### Monitoring

```bash
# Check GPU usage
docker exec munich-news-ollama nvidia-smi

# Monitor GPU in real-time
watch -n 1 'docker exec munich-news-ollama nvidia-smi'

# Check Ollama logs
docker-compose logs -f ollama

# Check crawler logs
docker-compose logs -f crawler
```
### Testing

```bash
# Test translation (2 articles)
docker-compose exec crawler python crawler_service.py 2

# Check translation timing
docker-compose logs crawler | grep "Title translated"

# Test Ollama API (internal network only)
docker-compose exec crawler curl -s http://ollama:11434/api/generate -d '{
  "model": "phi3:latest",
  "prompt": "Translate to English: Guten Morgen",
  "stream": false
}'
```
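The same `/api/generate` call can be made from Python. A minimal sketch using only the standard library; the host `ollama` resolves inside the compose network, and the helper names here are illustrative, not part of the crawler code:

```python
import json
import urllib.request

def build_request(prompt, model="phi3:latest", base_url="http://ollama:11434"):
    """Build the JSON POST request that Ollama's /api/generate endpoint expects."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def ollama_generate(prompt, **kwargs):
    """Send the request and return the generated text ("response" field)."""
    with urllib.request.urlopen(build_request(prompt, **kwargs), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With `stream: false`, Ollama returns one JSON object, so a single `json.loads` is enough; with streaming enabled you would instead read newline-delimited JSON chunks.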
### Troubleshooting

```bash
# Restart Ollama
docker-compose restart ollama

# Rebuild and restart
docker-compose up -d --build ollama

# Check GPU in container
docker exec munich-news-ollama nvidia-smi

# Pull model manually
docker-compose exec ollama ollama pull phi3:latest

# List available models
docker-compose exec ollama ollama list
```
## Performance Expectations
| Operation | CPU | GPU | Speedup |
|---|---|---|---|
| Translation | 1.5s | 0.3s | 5x |
| Summary | 8s | 2s | 4x |
| 10 Articles | 115s | 31s | 3.7x |
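The speedup column is simply CPU time divided by GPU time; a quick check of the table's arithmetic:

```python
# Recompute the speedup column from the (CPU, GPU) timings in the table above.
timings = {
    "translation": (1.5, 0.3),    # seconds (CPU, GPU)
    "summary": (8.0, 2.0),
    "10_articles": (115.0, 31.0),
}
speedup = {op: round(cpu / gpu, 1) for op, (cpu, gpu) in timings.items()}
print(speedup)  # translation 5.0x, summary 4.0x, 10 articles 3.7x
```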
## Common Issues

### GPU Not Detected

```bash
# Install NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```

### Out of Memory

```bash
# Use a smaller model (edit backend/.env)
OLLAMA_MODEL=gemma2:2b
```

### Slow Performance

```bash
# Verify GPU is being used
docker exec munich-news-ollama nvidia-smi
# Should show GPU memory usage during inference
```
## Configuration Files

- `backend/.env` - Main configuration:

```bash
OLLAMA_ENABLED=true
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_MODEL=phi3:latest
OLLAMA_TIMEOUT=120
```
- `docker-compose.yml` - Main services
- `docker-compose.gpu.yml` - GPU override
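On the Python side, these settings might be consumed like this (a sketch with invented helper names; the actual loading code in the crawler is not shown here):

```python
import os

def load_ollama_config(env=None):
    """Read the OLLAMA_* settings, with defaults matching backend/.env."""
    env = os.environ if env is None else env
    return {
        "enabled": env.get("OLLAMA_ENABLED", "false").lower() == "true",
        "base_url": env.get("OLLAMA_BASE_URL", "http://ollama:11434"),
        "model": env.get("OLLAMA_MODEL", "phi3:latest"),
        "timeout": int(env.get("OLLAMA_TIMEOUT", "120")),
    }
```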
## Model Options

- `gemma2:2b` - Fastest, 1.5GB VRAM
- `phi3:latest` - Default, 3-4GB VRAM ⭐
- `llama3.2:3b` - Best quality, 5-6GB VRAM
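A tiny helper can turn the list above into a model choice: pick the highest-quality option that fits the card's VRAM (the function and ordering are illustrative; VRAM figures are the upper bounds from the list):

```python
# Candidate models, ordered from most to least VRAM-hungry.
MODELS = [
    ("llama3.2:3b", 6.0),  # best quality
    ("phi3:latest", 4.0),  # default
    ("gemma2:2b", 1.5),    # fastest
]

def pick_model(vram_gb):
    """Return the first (highest-quality) model that fits in vram_gb, or None."""
    for name, needed in MODELS:
        if vram_gb >= needed:
            return name
    return None
```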
## Full Documentation
- OLLAMA_SETUP.md - Complete setup guide
- GPU_SETUP.md - GPU-specific guide
- PERFORMANCE_COMPARISON.md - Benchmarks
## Need Help?

- Run `./check-gpu.sh`
- Check `docker-compose logs ollama`
- See troubleshooting in GPU_SETUP.md