# Quick Start: Ollama with GPU

## 30-Second Setup

```bash
# 1. Check GPU
./check-gpu.sh

# 2. Start services
./start-with-gpu.sh

# 3. Test
docker-compose exec crawler python crawler_service.py 2
```
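For orientation, here is a minimal sketch of the kind of check `./check-gpu.sh` performs (hypothetical; the shipped script may do more, e.g. verify the NVIDIA Container Toolkit):

```shell
# Hypothetical sketch of a GPU availability check; the real
# check-gpu.sh may also verify the container toolkit and driver version.
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "NVIDIA driver found:"
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "No NVIDIA driver detected; Ollama will fall back to CPU."
fi
```

Either branch is fine for CPU-only machines; the stack runs without a GPU, just slower.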

## Commands Cheat Sheet

### Setup

```bash
# Check GPU availability
./check-gpu.sh

# Configure Ollama
./configure-ollama.sh

# Start with GPU auto-detection
./start-with-gpu.sh

# Start with GPU (manual)
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

# Start without GPU
docker-compose up -d
```
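For reference, a GPU override file like `docker-compose.gpu.yml` typically uses the Compose device-reservation syntax (a sketch; the file in this repository may differ):

```yaml
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```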

### Monitoring

```bash
# Check GPU usage
docker exec munich-news-ollama nvidia-smi

# Monitor GPU in real-time
watch -n 1 'docker exec munich-news-ollama nvidia-smi'

# Check Ollama logs
docker-compose logs -f ollama

# Check crawler logs
docker-compose logs -f crawler
```

### Testing

```bash
# Test translation (2 articles)
docker-compose exec crawler python crawler_service.py 2

# Check translation timing
docker-compose logs crawler | grep "Title translated"

# Test Ollama API directly
curl http://localhost:11434/api/generate -d '{
  "model": "phi3:latest",
  "prompt": "Translate to English: Guten Morgen",
  "stream": false
}'
```
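To turn the timing lines into a single number, the grep above can be extended into a small pipeline. This assumes log lines of the form `Title translated in <seconds>s`; adjust the patterns if your log format differs:

```shell
# Average per-title translation time from the crawler logs.
# Assumes lines like "Title translated in 0.31s" -- adjust if needed.
docker-compose logs crawler \
  | grep "Title translated" \
  | grep -oE '[0-9]+\.[0-9]+' \
  | awk '{ sum += $1; n++ } END { if (n) printf "avg %.2fs over %d titles\n", sum / n, n }'
```

A GPU run should report averages well under a second per title; see the performance table below.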

### Troubleshooting

```bash
# Restart Ollama
docker-compose restart ollama

# Rebuild and restart
docker-compose up -d --build ollama

# Check GPU in container
docker exec munich-news-ollama nvidia-smi

# Pull model manually
docker-compose exec ollama ollama pull phi3:latest

# List available models
docker-compose exec ollama ollama list
```

## Performance Expectations

| Operation   | CPU time | GPU time | Speedup |
|-------------|----------|----------|---------|
| Translation | 1.5 s    | 0.3 s    | 5x      |
| Summary     | 8 s      | 2 s      | 4x      |
| 10 Articles | 115 s    | 31 s     | 3.7x    |

## Common Issues

### GPU Not Detected

```bash
# Install NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```

### Out of Memory

```bash
# Use a smaller model (edit backend/.env)
OLLAMA_MODEL=gemma2:2b
```
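The edit-and-restart sequence can be scripted. A sketch, assuming it is run from the repository root (the paths and model name are examples; adjust to your setup):

```shell
# Switch Ollama to a smaller model and restart (run from the repo root).
ENV_FILE=backend/.env
if [ -f "$ENV_FILE" ]; then
  # Rewrite the OLLAMA_MODEL line in place, keeping a .bak backup.
  sed -i.bak 's/^OLLAMA_MODEL=.*/OLLAMA_MODEL=gemma2:2b/' "$ENV_FILE"
  docker-compose up -d ollama
  docker-compose exec ollama ollama pull gemma2:2b
else
  echo "Run this from the repository root: $ENV_FILE not found"
fi
```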

### Slow Performance

```bash
# Verify the GPU is being used
docker exec munich-news-ollama nvidia-smi
# Should show GPU memory usage during inference
```

## Configuration Files

`backend/.env` - Main configuration:

```env
OLLAMA_ENABLED=true
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_MODEL=phi3:latest
OLLAMA_TIMEOUT=120
```

- `docker-compose.yml` - Main services
- `docker-compose.gpu.yml` - GPU override

## Model Options

- `gemma2:2b` - Fastest, 1.5 GB VRAM
- `phi3:latest` - Default, 3-4 GB VRAM
- `llama3.2:3b` - Best quality, 5-6 GB VRAM

## Full Documentation

See `GPU_SETUP.md` for the detailed setup guide.

## Need Help?

1. Run `./check-gpu.sh`
2. Check `docker-compose logs ollama`
3. See the troubleshooting section in `GPU_SETUP.md`