# Quick Start: Ollama with GPU

## 30-Second Setup

```bash
# 1. Check GPU
./check-gpu.sh

# 2. Start services
./start-with-gpu.sh

# 3. Test
docker-compose exec crawler python crawler_service.py 2
```

## Commands Cheat Sheet

### Setup

```bash
# Check GPU availability
./check-gpu.sh

# Configure Ollama
./configure-ollama.sh

# Start with GPU auto-detection
./start-with-gpu.sh

# Start with GPU (manual)
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

# Start without GPU
docker-compose up -d
```

### Monitoring

```bash
# Check GPU usage
docker exec munich-news-ollama nvidia-smi

# Monitor GPU in real time
watch -n 1 'docker exec munich-news-ollama nvidia-smi'

# Check Ollama logs
docker-compose logs -f ollama

# Check crawler logs
docker-compose logs -f crawler
```
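
If you want a record you can graph later rather than a live view, `nvidia-smi` can log samples at an interval (standard `nvidia-smi` query flags; container name as used above):

```shell
# Append one CSV sample per second: timestamp, GPU utilization, memory used
docker exec munich-news-ollama nvidia-smi \
  --query-gpu=timestamp,utilization.gpu,memory.used \
  --format=csv -l 1 >> gpu-usage.csv
```

Stop it with Ctrl-C; the file keeps every sample taken while a crawl was running.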

### Testing

```bash
# Test translation (2 articles)
docker-compose exec crawler python crawler_service.py 2

# Check translation timing
docker-compose logs crawler | grep "Title translated"

# Test Ollama API directly
curl http://localhost:11434/api/generate -d '{
  "model": "phi3:latest",
  "prompt": "Translate to English: Guten Morgen",
  "stream": false
}'
```
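
With `"stream": false`, the Ollama API returns a single JSON object whose `response` field holds the generated text. To see just the translation, pipe through `jq` (assumes `jq` is installed):

```shell
# Print only the model's answer, not the full JSON envelope
curl -s http://localhost:11434/api/generate -d '{
  "model": "phi3:latest",
  "prompt": "Translate to English: Guten Morgen",
  "stream": false
}' | jq -r '.response'
```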

### Troubleshooting

```bash
# Restart Ollama
docker-compose restart ollama

# Rebuild and restart
docker-compose up -d --build ollama

# Check GPU in container
docker exec munich-news-ollama nvidia-smi

# Pull model manually
docker-compose exec ollama ollama pull phi3:latest

# List available models
docker-compose exec ollama ollama list
```

## Performance Expectations

| Operation | CPU | GPU | Speedup |
|-----------|-----|-----|---------|
| Translation | 1.5s | 0.3s | 5x |
| Summary | 8s | 2s | 4x |
| 10 Articles | 115s | 31s | 3.7x |

## Common Issues

### GPU Not Detected

```bash
# Install NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
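
Installing the toolkit alone is not always enough: Docker must also have the NVIDIA runtime registered. If the GPU still isn't detected after the commands above, the standard `nvidia-ctk` registration step usually fixes it:

```shell
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: the runtimes list should now include "nvidia"
docker info | grep -i runtimes
```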

### Out of Memory

```bash
# Use smaller model (edit backend/.env)
OLLAMA_MODEL=gemma2:2b
```

### Slow Performance

```bash
# Verify GPU is being used
docker exec munich-news-ollama nvidia-smi
# Should show GPU memory usage during inference
```

## Configuration Files

**backend/.env** - Main configuration
```env
OLLAMA_ENABLED=true
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_MODEL=phi3:latest
OLLAMA_TIMEOUT=120
```

**docker-compose.yml** - Main services

**docker-compose.gpu.yml** - GPU override

## Model Options

- `gemma2:2b` - Fastest, 1.5GB VRAM
- `phi3:latest` - Default, 3-4GB VRAM ⭐
- `llama3.2:3b` - Best quality, 5-6GB VRAM
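
Switching models is an edit-and-restart. A sketch using `sed` on `backend/.env` (key name from the configuration section above), followed by pulling the new model so the first request doesn't block on a download:

```shell
# Point the stack at the smaller model and restart services to pick it up
sed -i 's/^OLLAMA_MODEL=.*/OLLAMA_MODEL=gemma2:2b/' backend/.env
docker-compose restart ollama crawler
docker-compose exec ollama ollama pull gemma2:2b
```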

## Full Documentation

- [OLLAMA_SETUP.md](docs/OLLAMA_SETUP.md) - Complete setup guide
- [GPU_SETUP.md](docs/GPU_SETUP.md) - GPU-specific guide
- [PERFORMANCE_COMPARISON.md](docs/PERFORMANCE_COMPARISON.md) - Benchmarks

## Need Help?

1. Run `./check-gpu.sh`
2. Check `docker-compose logs ollama`
3. See troubleshooting in [GPU_SETUP.md](docs/GPU_SETUP.md)