# Quick Start: Ollama with GPU

## 30-Second Setup

```bash
# 1. Check GPU
./check-gpu.sh

# 2. Start services
./start-with-gpu.sh

# 3. Test
docker-compose exec crawler python crawler_service.py 2
```
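
Before moving on, a quick way to confirm the stack actually came up: `docker-compose ps` should list the containers, and Ollama's standard `/api/tags` endpoint returns the models it has pulled.

```bash
# Confirm the containers are running
docker-compose ps

# Ollama should answer with a JSON list of pulled models
curl http://localhost:11434/api/tags
```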
## Commands Cheat Sheet

### Setup
```bash
# Check GPU availability
./check-gpu.sh

# Configure Ollama
./configure-ollama.sh

# Start with GPU auto-detection
./start-with-gpu.sh

# Start with GPU (manual)
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

# Start without GPU
docker-compose up -d
```
### Monitoring
```bash
# Check GPU usage
docker exec munich-news-ollama nvidia-smi

# Monitor GPU in real-time
watch -n 1 'docker exec munich-news-ollama nvidia-smi'

# Check Ollama logs
docker-compose logs -f ollama

# Check crawler logs
docker-compose logs -f crawler
```
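
For a more compact live view than the full `nvidia-smi` screen, its query flags print just the fields you care about:

```bash
# GPU utilization and memory, refreshed every second, as CSV
docker exec munich-news-ollama nvidia-smi \
  --query-gpu=utilization.gpu,memory.used,memory.total --format=csv -l 1
```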
### Testing
```bash
# Test translation (2 articles)
docker-compose exec crawler python crawler_service.py 2

# Check translation timing
docker-compose logs crawler | grep "Title translated"

# Test Ollama API directly
curl http://localhost:11434/api/generate -d '{
  "model": "phi3:latest",
  "prompt": "Translate to English: Guten Morgen",
  "stream": false
}'
```
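
With `"stream": false`, Ollama replies with a single JSON object whose `response` field holds the generated text. A small sketch for extracting just that field, assuming `python3` is available on the host:

```bash
curl -s http://localhost:11434/api/generate -d '{
  "model": "phi3:latest",
  "prompt": "Translate to English: Guten Morgen",
  "stream": false
}' | python3 -c 'import json, sys; print(json.load(sys.stdin)["response"])'
```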
### Troubleshooting
```bash
# Restart Ollama
docker-compose restart ollama

# Rebuild and restart
docker-compose up -d --build ollama

# Check GPU in container
docker exec munich-news-ollama nvidia-smi

# Pull model manually
docker-compose exec ollama ollama pull phi3:latest

# List available models
docker-compose exec ollama ollama list
```
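
If a plain restart doesn't help, a full stop-and-start with the GPU override sometimes does. This is safe as long as your data lives in volumes rather than in the containers themselves:

```bash
# Tear everything down, then bring it back with GPU support
docker-compose down
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```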
## Performance Expectations

| Operation | CPU Time | GPU Time | Speedup |
|-----------|----------|----------|---------|
| Translation | 1.5s | 0.3s | 5x |
| Summary | 8s | 2s | 4x |
| 10 Articles | 115s | 31s | 3.7x |
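
These figures depend heavily on hardware. For a rough number on your own setup, time the same kind of run with and without the GPU override; a simple sketch (assuming the numeric argument is the article count, as in the test command above), not a rigorous benchmark:

```bash
# Wall-clock time for a 10-article run
time docker-compose exec crawler python crawler_service.py 10
```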

## Common Issues

### GPU Not Detected
```bash
# Install NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
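
On recent toolkit versions the runtime also has to be registered with Docker before the restart takes effect; this is the standard step from NVIDIA's install instructions:

```bash
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```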
### Out of Memory
```bash
# Use a smaller model (edit backend/.env)
OLLAMA_MODEL=gemma2:2b
```
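
After editing `backend/.env`, the smaller model still has to be pulled, and whichever service reads the variable restarted; the sketch below assumes the crawler is the consumer:

```bash
# Pull the smaller model into the Ollama container
docker-compose exec ollama ollama pull gemma2:2b

# Restart the consumer so it picks up the new setting
docker-compose restart crawler
```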
### Slow Performance
```bash
# Verify GPU is being used
docker exec munich-news-ollama nvidia-smi
# Should show GPU memory usage during inference
```
## Configuration Files

**backend/.env** - Main configuration
```env
OLLAMA_ENABLED=true
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_MODEL=phi3:latest
OLLAMA_TIMEOUT=120
```

- **docker-compose.yml** - Main services
- **docker-compose.gpu.yml** - GPU override
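
To confirm the override is actually merged in, render the combined configuration; the `nvidia` grep is just an illustrative filter, since the exact keys depend on what this repo's override file declares:

```bash
# GPU device reservations should appear under the ollama service
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml config | grep -i -A3 nvidia
```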

## Model Options

- `gemma2:2b` - Fastest, 1.5GB VRAM
- `phi3:latest` - Default, 3-4GB VRAM ⭐
- `llama3.2:3b` - Best quality, 5-6GB VRAM
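
Before committing to a model in `backend/.env`, you can try a candidate interactively; `ollama run` pulls the model if needed and prints the reply:

```bash
# One-shot prompt against a candidate model
docker-compose exec ollama ollama run llama3.2:3b "Translate to English: Guten Morgen"
```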

## Full Documentation

- [OLLAMA_SETUP.md](docs/OLLAMA_SETUP.md) - Complete setup guide
- [GPU_SETUP.md](docs/GPU_SETUP.md) - GPU-specific guide
- [PERFORMANCE_COMPARISON.md](docs/PERFORMANCE_COMPARISON.md) - Benchmarks

## Need Help?

1. Run `./check-gpu.sh`
2. Check `docker-compose logs ollama`
3. See troubleshooting in [GPU_SETUP.md](docs/GPU_SETUP.md)