update
@@ -330,3 +330,53 @@ def trigger_crawl():
|
||||
- **[Newsletter Preview](../backend/routes/newsletter_routes.py)**: `/api/newsletter/preview` - Preview newsletter HTML
|
||||
- **[Analytics](API.md)**: `/api/analytics/*` - View engagement metrics
|
||||
- **[RSS Feeds](API.md)**: `/api/rss-feeds` - Manage RSS feeds
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Newsletter API Summary
|
||||
|
||||
### Available Endpoints
|
||||
|
||||
| Endpoint | Purpose | Recipient |
|----------|---------|-----------|
| `/api/admin/send-test-email` | Test newsletter | Single email (specified) |
| `/api/admin/send-newsletter` | Production send | All active subscribers |
| `/api/admin/trigger-crawl` | Fetch articles | N/A |
| `/api/admin/stats` | System stats | N/A |
|
||||
|
||||
### Subscriber Status
|
||||
|
||||
The system uses a `status` field to determine who receives newsletters:
|
||||
- **`active`** - Receives newsletters ✅
|
||||
- **`inactive`** - Does not receive newsletters ❌
|
||||
|
||||
See [SUBSCRIBER_STATUS.md](SUBSCRIBER_STATUS.md) for details.
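
As a rough illustration of the status rule above, the sketch below filters subscriber records by `status`. The record shape (`email` and `status` keys) and the function name `select_recipients` are assumptions made for this example, not the project's actual schema or code.

```python
def select_recipients(subscribers):
    """Return the emails of subscribers whose status is exactly 'active'."""
    return [s["email"] for s in subscribers if s.get("status") == "active"]


# Hypothetical sample records, for illustration only.
sample = [
    {"email": "a@example.com", "status": "active"},
    {"email": "b@example.com", "status": "inactive"},
    {"email": "c@example.com"},  # no status field: treated as not active here
]
print(select_recipients(sample))
```

Treating a missing `status` as "not active" is a design choice of this sketch; the real handling is described in SUBSCRIBER_STATUS.md.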
|
||||
|
||||
### Quick Examples
|
||||
|
||||
**Send to all subscribers:**
|
||||
```bash
curl -X POST http://localhost:5001/api/admin/send-newsletter \
  -H "Content-Type: application/json" \
  -d '{"max_articles": 10}'
```
|
||||
|
||||
**Send test email:**
|
||||
```bash
curl -X POST http://localhost:5001/api/admin/send-test-email \
  -H "Content-Type: application/json" \
  -d '{"email": "test@example.com"}'
```
|
||||
|
||||
**Check stats:**
|
||||
```bash
curl http://localhost:5001/api/admin/stats | jq '.subscribers'
```
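
For callers that prefer Python over `curl`, the send request can be sketched with the standard library. This only builds the request; the helper name `build_send_newsletter_request` and the `BASE_URL` constant are assumptions for this example, and the response format is not specified here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # assumption: same host/port as the curl examples


def build_send_newsletter_request(max_articles=10):
    """Build (but do not send) the POST request for /api/admin/send-newsletter."""
    body = json.dumps({"max_articles": max_articles}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/admin/send-newsletter",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_send_newsletter_request(max_articles=10)
print(req.get_method(), req.full_url)

# Sending it would email all active subscribers, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Keeping construction separate from sending makes it easy to inspect the request before triggering a production send.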
|
||||
|
||||
### Testing
|
||||
|
||||
Use the test script:
|
||||
```bash
./test-newsletter-api.sh
```
|
||||
|
||||
@@ -134,3 +134,43 @@ Root:
|
||||
- [ ] API rate limiting
|
||||
- [ ] Caching layer (Redis)
|
||||
- [ ] Message queue for crawler (Celery)
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Recent Updates (November 2025)
|
||||
|
||||
### Security Improvements
|
||||
- **MongoDB Internal-Only**: Removed port exposure, only accessible via Docker network
|
||||
- **Ollama Internal-Only**: Removed port exposure, only accessible via Docker network
|
||||
- **Reduced Attack Surface**: Only Backend API (port 5001) exposed to host
|
||||
- **Network Isolation**: All services communicate via internal Docker network
|
||||
|
||||
### Ollama Integration
|
||||
- **Docker Compose Integration**: Ollama service runs alongside other services
|
||||
- **Automatic Model Download**: phi3:latest model downloaded on first startup
|
||||
- **GPU Support**: NVIDIA GPU acceleration with automatic detection
|
||||
- **Helper Scripts**: `start-with-gpu.sh`, `check-gpu.sh`, `configure-ollama.sh`
|
||||
- **Performance**: 5-10x faster with GPU acceleration
|
||||
|
||||
### API Enhancements
|
||||
- **Send Newsletter Endpoint**: `/api/admin/send-newsletter` to send to all active subscribers
|
||||
- **Subscriber Status Fix**: Fixed stats endpoint to correctly count active subscribers
|
||||
- **Better Error Handling**: Improved error messages and validation
|
||||
|
||||
### Documentation
|
||||
- **Consolidated Documentation**: Moved all docs to `docs/` directory
|
||||
- **Security Guide**: Comprehensive security documentation
|
||||
- **GPU Setup Guide**: Detailed GPU acceleration setup
|
||||
- **MongoDB Connection Guide**: Connection configuration explained
|
||||
- **Subscriber Status Guide**: How subscriber status system works
|
||||
|
||||
### Configuration
|
||||
- **MongoDB URI**: Updated to use Docker service name (`mongodb` instead of `localhost`)
|
||||
- **Ollama URL**: Configured for internal Docker network (`http://ollama:11434`)
|
||||
- **Single .env File**: All configuration in `backend/.env`
|
||||
|
||||
### Testing
|
||||
- **Connectivity Tests**: `test-mongodb-connectivity.sh`
|
||||
- **Ollama Tests**: `test-ollama-setup.sh`
|
||||
- **Newsletter API Tests**: `test-newsletter-api.sh`
|
||||
|
||||
@@ -269,3 +269,68 @@ db.articles.find({ summary: { $exists: false } })
|
||||
// Count summarized articles
|
||||
db.articles.countDocuments({ summary: { $exists: true, $ne: null } })
|
||||
```
|
||||
|
||||
|
||||
---
|
||||
|
||||
## MongoDB Connection Configuration
|
||||
|
||||
### Docker Compose Setup
|
||||
|
||||
**Connection URI:**
|
||||
```env
MONGODB_URI=mongodb://admin:changeme@mongodb:27017/
```
|
||||
|
||||
**Key Points:**
|
||||
- Uses `mongodb` (Docker service name), not `localhost`
|
||||
- Includes authentication credentials
|
||||
- Only works inside Docker network
|
||||
- Port 27017 is NOT exposed to host (internal only)
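
To see which parts of the URI the key points refer to, the connection string can be split with the Python standard library. This is only an inspection sketch: the helper name `describe_mongodb_uri` is hypothetical, and the application itself connects through `pymongo` rather than parsing the URI by hand.

```python
from urllib.parse import urlsplit


def describe_mongodb_uri(uri):
    """Split a mongodb:// URI into its scheme, credentials, host and port."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "has_password": parts.password is not None,
        "host": parts.hostname,
        "port": parts.port,
    }


info = describe_mongodb_uri("mongodb://admin:changeme@mongodb:27017/")
print(info)
```

The `host` value is the Docker service name, which is why the same URI cannot be resolved from the host machine.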
|
||||
|
||||
### Why 'mongodb' Instead of 'localhost'?
|
||||
|
||||
**Inside Docker containers:**
|
||||
```
Container → mongodb:27017     ✅ Works (Docker DNS)
Container → localhost:27017   ❌ Fails (localhost = container itself)
```
|
||||
|
||||
**From host machine:**
|
||||
```
Host → localhost:27017        ❌ Blocked (port not exposed)
Host → mongodb:27017          ❌ Fails (DNS only works in Docker)
```
|
||||
|
||||
### Connection Priority
|
||||
|
||||
1. **Docker Compose environment variables** (highest)
|
||||
2. **.env file** (fallback)
|
||||
3. **Code defaults** (lowest)
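
The priority order above can be sketched as a small lookup function. This is a minimal illustration, not the project's loader: the name `resolve_setting`, the `dotenv_values` dictionary (standing in for parsed `.env` contents), and the example values are all assumptions.

```python
import os


def resolve_setting(name, dotenv_values, default, environ=None):
    """Resolve a setting: process environment first, then .env values, then the default."""
    environ = os.environ if environ is None else environ
    if name in environ:          # 1. Docker Compose environment variables (highest)
        return environ[name]
    if name in dotenv_values:    # 2. .env file (fallback)
        return dotenv_values[name]
    return default               # 3. code default (lowest)


# Hypothetical values, passed explicitly so the example does not depend on the real environment.
compose_env = {"MONGODB_URI": "mongodb://admin:changeme@mongodb:27017/"}
dotenv = {"MONGODB_URI": "mongodb://from-dotenv:27017/"}
code_default = "mongodb://code-default:27017/"

print(resolve_setting("MONGODB_URI", dotenv, code_default, environ=compose_env))
print(resolve_setting("MONGODB_URI", dotenv, code_default, environ={}))
print(resolve_setting("MONGODB_URI", {}, code_default, environ={}))
```

Passing `environ` explicitly keeps the function easy to test; when it is omitted, the real process environment is used.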
|
||||
|
||||
### Testing Connection
|
||||
|
||||
```bash
# From backend
docker-compose exec backend python -c "
from database import articles_collection
print(f'Articles: {articles_collection.count_documents({})}')
"

# From crawler
docker-compose exec crawler python -c "
from pymongo import MongoClient
from config import Config
client = MongoClient(Config.MONGODB_URI)
print(f'MongoDB version: {client.server_info()[\"version\"]}')
"
```
|
||||
|
||||
### Security
|
||||
|
||||
- ✅ MongoDB is internal-only (not exposed to host)
|
||||
- ✅ Uses authentication (username/password)
|
||||
- ✅ Only accessible via Docker network
|
||||
- ✅ Cannot be accessed from external network
|
||||
|
||||
See [SECURITY_NOTES.md](SECURITY_NOTES.md) for more security details.
|
||||
|
||||
204
docs/DOCUMENTATION_CLEANUP.md
Normal file
@@ -0,0 +1,204 @@
|
||||
# Documentation Cleanup Summary
|
||||
|
||||
## What Was Done
|
||||
|
||||
Consolidated and organized all markdown documentation files.
|
||||
|
||||
## Before
|
||||
|
||||
**Root Level:** 14 markdown files (cluttered)
|
||||
```
README.md
QUICKSTART.md
CONTRIBUTING.md
IMPLEMENTATION_SUMMARY.md
MONGODB_CONNECTION_EXPLAINED.md
NETWORK_SECURITY_SUMMARY.md
NEWSLETTER_API_UPDATE.md
OLLAMA_GPU_SUMMARY.md
OLLAMA_INTEGRATION.md
QUICK_START_GPU.md
SECURITY_IMPROVEMENTS.md
SECURITY_UPDATE.md
FINAL_STRUCTURE.md (outdated)
PROJECT_STRUCTURE.md (redundant)
```
|
||||
|
||||
**docs/:** 18 files (organized but some content duplicated)
|
||||
|
||||
## After
|
||||
|
||||
**Root Level:** 3 essential files (clean)
|
||||
```
README.md        - Main entry point
QUICKSTART.md    - Quick setup guide
CONTRIBUTING.md  - Contribution guidelines
```
|
||||
|
||||
**docs/:** 19 files (organized, consolidated, no duplication)
|
||||
```
INDEX.md                    - Documentation index (NEW)
ADMIN_API.md                - Admin API (consolidated)
API.md
ARCHITECTURE.md
BACKEND_STRUCTURE.md
CHANGELOG.md                - Updated with recent changes
CRAWLER_HOW_IT_WORKS.md
DATABASE_SCHEMA.md          - Added MongoDB connection info
DEPLOYMENT.md
EXTRACTION_STRATEGIES.md
GPU_SETUP.md                - Consolidated GPU docs
OLLAMA_SETUP.md             - Consolidated Ollama docs
OLD_ARCHITECTURE.md
PERFORMANCE_COMPARISON.md
QUICK_REFERENCE.md
RSS_URL_EXTRACTION.md
SECURITY_NOTES.md           - Consolidated all security docs
SUBSCRIBER_STATUS.md
SYSTEM_ARCHITECTURE.md
```
|
||||
|
||||
## Changes Made
|
||||
|
||||
### 1. Deleted Redundant Files
|
||||
- ❌ `FINAL_STRUCTURE.md` (outdated)
|
||||
- ❌ `PROJECT_STRUCTURE.md` (redundant with README)
|
||||
|
||||
### 2. Merged into docs/SECURITY_NOTES.md
|
||||
- ✅ `SECURITY_UPDATE.md` (Ollama security)
|
||||
- ✅ `SECURITY_IMPROVEMENTS.md` (Network isolation)
|
||||
- ✅ `NETWORK_SECURITY_SUMMARY.md` (Port exposure summary)
|
||||
|
||||
### 3. Merged into docs/GPU_SETUP.md
|
||||
- ✅ `OLLAMA_GPU_SUMMARY.md` (GPU implementation summary)
|
||||
- ✅ `QUICK_START_GPU.md` (Quick start commands)
|
||||
|
||||
### 4. Merged into docs/OLLAMA_SETUP.md
|
||||
- ✅ `OLLAMA_INTEGRATION.md` (Integration details)
|
||||
|
||||
### 5. Merged into docs/ADMIN_API.md
|
||||
- ✅ `NEWSLETTER_API_UPDATE.md` (Newsletter endpoint)
|
||||
|
||||
### 6. Merged into docs/DATABASE_SCHEMA.md
|
||||
- ✅ `MONGODB_CONNECTION_EXPLAINED.md` (Connection config)
|
||||
|
||||
### 7. Merged into docs/CHANGELOG.md
|
||||
- ✅ `IMPLEMENTATION_SUMMARY.md` (Recent updates)
|
||||
|
||||
### 8. Created New Files
|
||||
- ✨ `docs/INDEX.md` - Complete documentation index
|
||||
|
||||
### 9. Updated Existing Files
|
||||
- 📝 `README.md` - Added documentation section
|
||||
- 📝 `docs/CHANGELOG.md` - Added recent updates
|
||||
- 📝 `docs/SECURITY_NOTES.md` - Comprehensive security guide
|
||||
- 📝 `docs/GPU_SETUP.md` - Complete GPU guide
|
||||
- 📝 `docs/OLLAMA_SETUP.md` - Complete Ollama guide
|
||||
- 📝 `docs/ADMIN_API.md` - Complete API reference
|
||||
- 📝 `docs/DATABASE_SCHEMA.md` - Added connection info
|
||||
|
||||
## Benefits
|
||||
|
||||
### 1. Cleaner Root Directory
|
||||
- Only 3 essential files visible
|
||||
- Easier to navigate
|
||||
- Professional appearance
|
||||
|
||||
### 2. Better Organization
|
||||
- All technical docs in `docs/`
|
||||
- Logical grouping by topic
|
||||
- Easy to find information
|
||||
|
||||
### 3. No Duplication
|
||||
- Consolidated related content
|
||||
- Single source of truth
|
||||
- Easier to maintain
|
||||
|
||||
### 4. Improved Discoverability
|
||||
- Documentation index (`docs/INDEX.md`)
|
||||
- Clear navigation
|
||||
- Quick links by task
|
||||
|
||||
### 5. Better Maintenance
|
||||
- Fewer files to update
|
||||
- Related content together
|
||||
- Clear structure
|
||||
|
||||
## Documentation Structure
|
||||
|
||||
```
project/
├── README.md                       # Main entry point
├── QUICKSTART.md                   # Quick setup
├── CONTRIBUTING.md                 # How to contribute
│
└── docs/                           # All technical documentation
    ├── INDEX.md                    # Documentation index
    │
    ├── Setup & Configuration
    │   ├── OLLAMA_SETUP.md
    │   ├── GPU_SETUP.md
    │   └── DEPLOYMENT.md
    │
    ├── API Documentation
    │   ├── ADMIN_API.md
    │   ├── API.md
    │   └── SUBSCRIBER_STATUS.md
    │
    ├── Architecture
    │   ├── SYSTEM_ARCHITECTURE.md
    │   ├── ARCHITECTURE.md
    │   ├── DATABASE_SCHEMA.md
    │   └── BACKEND_STRUCTURE.md
    │
    ├── Features
    │   ├── CRAWLER_HOW_IT_WORKS.md
    │   ├── EXTRACTION_STRATEGIES.md
    │   ├── RSS_URL_EXTRACTION.md
    │   └── PERFORMANCE_COMPARISON.md
    │
    ├── Security
    │   └── SECURITY_NOTES.md
    │
    └── Reference
        ├── CHANGELOG.md
        └── QUICK_REFERENCE.md
```
|
||||
|
||||
## Quick Access
|
||||
|
||||
### For Users
|
||||
- Start here: [README.md](../README.md)
- Quick setup: [QUICKSTART.md](../QUICKSTART.md)
- All docs: [docs/INDEX.md](INDEX.md)
|
||||
|
||||
### For Developers
|
||||
- Architecture: [docs/SYSTEM_ARCHITECTURE.md](SYSTEM_ARCHITECTURE.md)
- API Reference: [docs/ADMIN_API.md](ADMIN_API.md)
- Contributing: [CONTRIBUTING.md](../CONTRIBUTING.md)
|
||||
|
||||
### For DevOps
|
||||
- Deployment: [docs/DEPLOYMENT.md](DEPLOYMENT.md)
- Security: [docs/SECURITY_NOTES.md](SECURITY_NOTES.md)
- GPU Setup: [docs/GPU_SETUP.md](GPU_SETUP.md)
|
||||
|
||||
## Statistics
|
||||
|
||||
- **Files Removed from Root:** 11 markdown files (2 deleted outright, 9 merged into `docs/`)
|
||||
- **Files Merged:** 9 files consolidated into existing docs
|
||||
- **Files Created:** 1 new index file
|
||||
- **Files Updated:** 7 existing files enhanced
|
||||
- **Root Level:** Reduced from 14 to 3 files (79% reduction)
|
||||
- **Total Docs:** 19 well-organized files in docs/
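
The root-level percentage above can be checked with a short calculation. The function name `percent_reduction` is just a label for this example.

```python
def percent_reduction(before, after):
    """Percentage by which a count shrank from `before` to `after`."""
    return (before - after) / before * 100


# 14 root-level markdown files before the cleanup, 3 after.
print(f"{percent_reduction(14, 3):.0f}%")  # prints "79%"
```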
|
||||
|
||||
## Result
|
||||
|
||||
✅ Clean, professional documentation structure
|
||||
✅ Easy to navigate and find information
|
||||
✅ No duplication or redundancy
|
||||
✅ Better maintainability
|
||||
✅ Improved user experience
|
||||
|
||||
---
|
||||
|
||||
This cleanup makes the project more professional and easier to use!
|
||||
@@ -308,3 +308,113 @@ If you encounter issues:
|
||||
- Output of `nvidia-smi`
|
||||
- Output of `docker info | grep -i runtime`
|
||||
- Relevant logs
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Quick Start Guide
|
||||
|
||||
### 30-Second Setup
|
||||
|
||||
```bash
# 1. Check GPU
./check-gpu.sh

# 2. Start services
./start-with-gpu.sh

# 3. Test
docker-compose exec crawler python crawler_service.py 2
```
|
||||
|
||||
### Command Reference
|
||||
|
||||
**Setup:**
|
||||
```bash
./check-gpu.sh              # Check GPU availability
./configure-ollama.sh       # Configure Ollama
./start-with-gpu.sh         # Start with GPU auto-detection
```
|
||||
|
||||
**With GPU (manual):**
|
||||
```bash
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```
|
||||
|
||||
**Without GPU:**
|
||||
```bash
docker-compose up -d
```
|
||||
|
||||
**Monitoring:**
|
||||
```bash
docker exec munich-news-ollama nvidia-smi                # Check GPU
watch -n 1 'docker exec munich-news-ollama nvidia-smi'   # Monitor GPU
docker-compose logs -f ollama                            # Check logs
```
|
||||
|
||||
**Testing:**
|
||||
```bash
docker-compose exec crawler python crawler_service.py 2   # Test crawl
docker-compose logs crawler | grep "Title translated"     # Check timing
```
|
||||
|
||||
### Performance Expectations
|
||||
|
||||
| Operation | CPU | GPU | Speedup |
|-----------|-----|-----|---------|
| Translation | 1.5s | 0.3s | 5x |
| Summary | 8s | 2s | 4x |
| 10 Articles | 115s | 31s | 3.7x |
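
The speedup column is the CPU time divided by the GPU time. The sketch below recomputes it from the timings in the table; the dictionary layout and the `speedup` helper are illustrative only.

```python
# (CPU seconds, GPU seconds) copied from the table above.
timings = {
    "Translation": (1.5, 0.3),
    "Summary": (8.0, 2.0),
    "10 Articles": (115.0, 31.0),
}


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run is than the CPU run."""
    return cpu_seconds / gpu_seconds


for operation, (cpu, gpu) in timings.items():
    print(f"{operation}: {speedup(cpu, gpu):.1f}x")
```

The batch figure is lower than the per-operation figures, which suggests that part of the 10-article run is spent on work the GPU does not accelerate.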
|
||||
|
||||
---
|
||||
|
||||
## Integration Summary
|
||||
|
||||
### What Was Implemented
|
||||
|
||||
1. **Ollama Service in Docker Compose**
|
||||
- Runs on internal network (port 11434)
|
||||
- Automatic model download (phi3:latest)
|
||||
- Persistent storage in Docker volume
|
||||
- GPU support with automatic detection
|
||||
|
||||
2. **GPU Acceleration**
|
||||
- NVIDIA GPU support via docker-compose.gpu.yml
|
||||
- Automatic GPU detection script
|
||||
- 5-10x performance improvement
|
||||
- Graceful CPU fallback
|
||||
|
||||
3. **Helper Scripts**
|
||||
- `start-with-gpu.sh` - Auto-detect and start
|
||||
- `check-gpu.sh` - Diagnose GPU availability
|
||||
- `configure-ollama.sh` - Interactive configuration
|
||||
- `test-ollama-setup.sh` - Comprehensive tests
|
||||
|
||||
4. **Security**
|
||||
- Ollama is internal-only (not exposed to host)
|
||||
- Only accessible via Docker network
|
||||
- Prevents unauthorized access
|
||||
|
||||
### Files Created
|
||||
|
||||
- `docker-compose.gpu.yml` - GPU configuration override
|
||||
- `start-with-gpu.sh` - Auto-start script
|
||||
- `check-gpu.sh` - GPU detection script
|
||||
- `test-ollama-setup.sh` - Test suite
|
||||
- `docs/GPU_SETUP.md` - This documentation
|
||||
- `docs/OLLAMA_SETUP.md` - Ollama setup guide
|
||||
- `docs/PERFORMANCE_COMPARISON.md` - Benchmarks
|
||||
|
||||
### Quick Commands
|
||||
|
||||
```bash
# Start with GPU
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

# Or use helper script
./start-with-gpu.sh

# Verify GPU usage
docker exec munich-news-ollama nvidia-smi
```
|
||||
|
||||
116
docs/INDEX.md
Normal file
@@ -0,0 +1,116 @@
|
||||
# Documentation Index
|
||||
|
||||
## Quick Start
|
||||
- [README](../README.md) - Project overview and quick start
|
||||
- [QUICKSTART](../QUICKSTART.md) - Detailed 5-minute setup guide
|
||||
|
||||
## Setup & Configuration
|
||||
- [OLLAMA_SETUP](OLLAMA_SETUP.md) - Ollama AI service setup
|
||||
- [GPU_SETUP](GPU_SETUP.md) - GPU acceleration setup (5-10x faster)
|
||||
- [DEPLOYMENT](DEPLOYMENT.md) - Production deployment guide
|
||||
|
||||
## API Documentation
|
||||
- [ADMIN_API](ADMIN_API.md) - Admin endpoints (crawl, send newsletter)
|
||||
- [API](API.md) - Public API endpoints
|
||||
- [SUBSCRIBER_STATUS](SUBSCRIBER_STATUS.md) - Subscriber status system
|
||||
|
||||
## Architecture & Design
|
||||
- [SYSTEM_ARCHITECTURE](SYSTEM_ARCHITECTURE.md) - Complete system architecture
|
||||
- [ARCHITECTURE](ARCHITECTURE.md) - High-level architecture overview
|
||||
- [DATABASE_SCHEMA](DATABASE_SCHEMA.md) - MongoDB schema and connection
|
||||
- [BACKEND_STRUCTURE](BACKEND_STRUCTURE.md) - Backend code structure
|
||||
|
||||
## Features & How-To
|
||||
- [CRAWLER_HOW_IT_WORKS](CRAWLER_HOW_IT_WORKS.md) - News crawler explained
|
||||
- [EXTRACTION_STRATEGIES](EXTRACTION_STRATEGIES.md) - Content extraction
|
||||
- [RSS_URL_EXTRACTION](RSS_URL_EXTRACTION.md) - RSS feed handling
|
||||
- [PERFORMANCE_COMPARISON](PERFORMANCE_COMPARISON.md) - CPU vs GPU benchmarks
|
||||
|
||||
## Security
|
||||
- [SECURITY_NOTES](SECURITY_NOTES.md) - Complete security guide
|
||||
- Network isolation
|
||||
- MongoDB security
|
||||
- Ollama security
|
||||
- Best practices
|
||||
|
||||
## Reference
|
||||
- [CHANGELOG](CHANGELOG.md) - Version history and recent updates
|
||||
- [QUICK_REFERENCE](QUICK_REFERENCE.md) - Command cheat sheet
|
||||
|
||||
## Contributing
|
||||
- [CONTRIBUTING](../CONTRIBUTING.md) - How to contribute
|
||||
|
||||
---
|
||||
|
||||
## Documentation Organization
|
||||
|
||||
### Root Level (3 files)
|
||||
Essential files that should be immediately visible:
|
||||
- `README.md` - Main entry point
|
||||
- `QUICKSTART.md` - Quick setup guide
|
||||
- `CONTRIBUTING.md` - Contribution guidelines
|
||||
|
||||
### docs/ Directory (18 files)
|
||||
All technical documentation organized by category:
|
||||
- **Setup**: Ollama, GPU, Deployment
|
||||
- **API**: Admin API, Public API, Subscriber system
|
||||
- **Architecture**: System design, database, backend structure
|
||||
- **Features**: Crawler, extraction, RSS handling
|
||||
- **Security**: Complete security documentation
|
||||
- **Reference**: Changelog, quick reference
|
||||
|
||||
---
|
||||
|
||||
## Quick Links by Task
|
||||
|
||||
### I want to...
|
||||
|
||||
**Set up the project:**
|
||||
1. [README](../README.md) - Overview
|
||||
2. [QUICKSTART](../QUICKSTART.md) - Step-by-step setup
|
||||
|
||||
**Enable GPU acceleration:**
|
||||
1. [GPU_SETUP](GPU_SETUP.md) - Complete GPU guide
|
||||
2. Run: `./start-with-gpu.sh`
|
||||
|
||||
**Send newsletters:**
|
||||
1. [ADMIN_API](ADMIN_API.md) - API documentation
|
||||
2. [SUBSCRIBER_STATUS](SUBSCRIBER_STATUS.md) - Subscriber system
|
||||
|
||||
**Understand the architecture:**
|
||||
1. [SYSTEM_ARCHITECTURE](SYSTEM_ARCHITECTURE.md) - Complete overview
|
||||
2. [DATABASE_SCHEMA](DATABASE_SCHEMA.md) - Database design
|
||||
|
||||
**Secure my deployment:**
|
||||
1. [SECURITY_NOTES](SECURITY_NOTES.md) - Security guide
|
||||
2. [DEPLOYMENT](DEPLOYMENT.md) - Production deployment
|
||||
|
||||
**Troubleshoot issues:**
|
||||
1. [QUICK_REFERENCE](QUICK_REFERENCE.md) - Common commands
|
||||
2. [OLLAMA_SETUP](OLLAMA_SETUP.md) - Ollama troubleshooting
|
||||
3. [GPU_SETUP](GPU_SETUP.md) - GPU troubleshooting
|
||||
|
||||
---
|
||||
|
||||
## Documentation Standards
|
||||
|
||||
### File Naming
|
||||
- Use UPPERCASE for main docs (README, QUICKSTART)
|
||||
- Use UPPER_SNAKE_CASE for technical docs (GPU_SETUP, ADMIN_API)
|
||||
- Use descriptive names (not DOC1, DOC2)
|
||||
|
||||
### Organization
|
||||
- Root level: Only essential user-facing docs
|
||||
- docs/: All technical documentation
|
||||
- Keep related content together
|
||||
|
||||
### Content
|
||||
- Start with overview/summary
|
||||
- Include code examples
|
||||
- Add troubleshooting sections
|
||||
- Link to related docs
|
||||
- Keep up to date
|
||||
|
||||
---
|
||||
|
||||
Last Updated: November 2025
|
||||
@@ -248,3 +248,49 @@ docker-compose logs crawler | grep "Title translated"
|
||||
| 10 Articles | 90s | 25s | 3.6x |
|
||||
|
||||
**Tip:** GPU acceleration is most beneficial when processing many articles in batch.
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Integration Complete
|
||||
|
||||
### What's Included
|
||||
|
||||
✅ Ollama service integrated into Docker Compose
|
||||
✅ Automatic model download (phi3:latest, 2.2GB)
|
||||
✅ GPU support with automatic detection
|
||||
✅ CPU fallback when GPU unavailable
|
||||
✅ Internal-only access (secure)
|
||||
✅ Persistent model storage
|
||||
|
||||
### Quick Verification
|
||||
|
||||
```bash
# Check Ollama is running
docker ps | grep ollama

# Check model is downloaded
docker-compose exec ollama ollama list

# Test from inside network
docker-compose exec crawler python -c "
from ollama_client import OllamaClient
from config import Config
client = OllamaClient(Config.OLLAMA_BASE_URL, Config.OLLAMA_MODEL, Config.OLLAMA_ENABLED)
print(client.translate_title('Guten Morgen'))
"
```
|
||||
|
||||
### Performance
|
||||
|
||||
**CPU Mode:**
|
||||
- Translation: ~1.5s per title
|
||||
- Summarization: ~8s per article
|
||||
- Suitable for <20 articles/day
|
||||
|
||||
**GPU Mode:**
|
||||
- Translation: ~0.3s per title (5x faster)
|
||||
- Summarization: ~2s per article (4x faster)
|
||||
- Suitable for high-volume processing
|
||||
|
||||
See [GPU_SETUP.md](GPU_SETUP.md) for GPU acceleration setup.
|
||||
|
||||
@@ -1,10 +1,21 @@
|
||||
# Security Notes
|
||||
|
||||
## Ollama Service Security
|
||||
## Network Security Architecture
|
||||
|
||||
### Internal-Only Access
|
||||
### Internal-Only Services
|
||||
|
||||
The Ollama service is configured to be **internal-only** and is not exposed to the host machine or external network. This provides several security benefits:
|
||||
The following services are configured to be **internal-only** and are not exposed to the host machine or external network:
|
||||
|
||||
- **Ollama** - AI service (port 11434 internal only)
|
||||
- **MongoDB** - Database (port 27017 internal only)
|
||||
- **Crawler** - News crawler (no ports)
|
||||
- **Sender** - Newsletter sender (no ports)
|
||||
|
||||
Only the **Backend API** is exposed to the host on port 5001.
|
||||
|
||||
This provides several security benefits:
|
||||
|
||||
### Ollama Service Security
|
||||
|
||||
**Configuration:**
|
||||
```yaml
|
||||
@@ -95,14 +106,16 @@ ollama:
|
||||
### Other Security Considerations
|
||||
|
||||
**MongoDB:**
|
||||
- Exposed on port 27017 for development
|
||||
- ✅ **Internal-only** (not exposed to host)
|
||||
- Uses authentication (username/password)
|
||||
- Consider restricting to localhost in production: `127.0.0.1:27017:27017`
|
||||
- Only accessible via Docker network
|
||||
- Cannot be accessed from host machine or external network
|
||||
|
||||
**Backend API:**
|
||||
- Exposed on port 5001 for tracking and admin functions
|
||||
- Should be behind reverse proxy in production
|
||||
- Consider adding authentication for admin endpoints
|
||||
- In production, bind to localhost only: `127.0.0.1:5001:5001`
|
||||
|
||||
**Email Credentials:**
|
||||
- Stored in `.env` file
|
||||
@@ -118,18 +131,27 @@ ollama:
|
||||
external: true
|
||||
```
|
||||
|
||||
2. **Restrict Network Access**:
|
||||
2. **Restrict Backend to Localhost** (if not using reverse proxy):
|
||||
```yaml
|
||||
ports:
|
||||
- "127.0.0.1:27017:27017" # MongoDB
|
||||
- "127.0.0.1:5001:5001" # Backend
|
||||
backend:
|
||||
ports:
|
||||
- "127.0.0.1:5001:5001" # Only accessible from localhost
|
||||
```
|
||||
|
||||
3. **Use Reverse Proxy** (nginx, Traefik):
|
||||
3. **Use Reverse Proxy** (nginx, Traefik) - Recommended:
|
||||
```yaml
backend:
  # Remove ports section - only accessible via reverse proxy
  expose:
    - "5001"
```
|
||||
|
||||
Benefits:
|
||||
- SSL/TLS termination
|
||||
- Rate limiting
|
||||
- Authentication
|
||||
- Access logs
|
||||
- DDoS protection
|
||||
|
||||
4. **Regular Updates**:
|
||||
```bash
|
||||
@@ -142,13 +164,22 @@ ollama:
|
||||
docker-compose logs -f
|
||||
```
|
||||
|
||||
6. **Network Isolation**:
|
||||
- ✅ Already configured: MongoDB, Ollama, Crawler, Sender are internal-only
|
||||
- Only Backend API is exposed
|
||||
- All services communicate via internal Docker network
|
||||
|
||||
### Security Checklist
|
||||
|
||||
- [x] Ollama is internal-only (no exposed ports)
|
||||
- [x] MongoDB is internal-only (no exposed ports)
|
||||
- [x] MongoDB uses authentication
|
||||
- [x] Crawler is internal-only (no exposed ports)
|
||||
- [x] Sender is internal-only (no exposed ports)
|
||||
- [x] Only Backend API is exposed (port 5001)
|
||||
- [x] `.env` file is in `.gitignore`
|
||||
- [ ] Backend API has authentication (if needed)
|
||||
- [ ] Using HTTPS in production
|
||||
- [ ] Using HTTPS in production (reverse proxy)
|
||||
- [ ] Regular security updates
|
||||
- [ ] Monitoring and logging enabled
|
||||
- [ ] Backup strategy in place
|
||||
@@ -158,3 +189,99 @@ ollama:
|
||||
If you discover a security vulnerability, please email security@example.com (replace with your contact).
|
||||
|
||||
Do not open public issues for security vulnerabilities.
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Network Isolation Summary
|
||||
|
||||
### Current Port Exposure
|
||||
|
||||
| Service | Port | Exposed to Host | Security Status |
|---------|------|-----------------|-----------------|
| Backend API | 5001 | ✅ Yes | Only exposed service |
| MongoDB | 27017 | ❌ No | Internal only |
| Ollama | 11434 | ❌ No | Internal only |
| Crawler | - | ❌ No | Internal only |
| Sender | - | ❌ No | Internal only |
|
||||
|
||||
### Security Improvements Applied
|
||||
|
||||
**Ollama Service:**
|
||||
- Changed from exposed (port 11434) to internal-only
|
||||
- Only accessible via Docker network
|
||||
- Prevents unauthorized AI model usage
|
||||
|
||||
**MongoDB Service:**
|
||||
- Changed from exposed (port 27017) to internal-only
|
||||
- Only accessible via Docker network
|
||||
- Prevents unauthorized database access
|
||||
|
||||
**Result:**
|
||||
- 67% fewer exposed services (3 → 1), which shrinks the attack surface
|
||||
- Better defense in depth
|
||||
- Production-ready security configuration
|
||||
|
||||
### Verification Commands
|
||||
|
||||
```bash
# Check what's exposed
docker ps --format "table {{.Names}}\t{{.Ports}}"

# Expected output:
# Backend:  0.0.0.0:5001->5001/tcp  ← Only this exposed
# MongoDB:  27017/tcp               ← Internal only
# Ollama:   11434/tcp               ← Internal only

# Test MongoDB not accessible from host
nc -z -w 2 localhost 27017  # Should fail

# Test Ollama not accessible from host
nc -z -w 2 localhost 11434  # Should fail

# Test Backend accessible from host
curl http://localhost:5001/health  # Should work
```
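
Where `nc` is not available, the same reachability checks can be approximated in Python. This is a minimal sketch using only the standard library; the function name `is_port_open` and the `expected` mapping are assumptions for this example, and it tests TCP reachability only, not whether the service behind the port is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Expected state when run from the host: only the Backend API is reachable.
    expected = {27017: False, 11434: False, 5001: True}
    for port, should_be_open in expected.items():
        state = is_port_open("localhost", port)
        label = "ok" if state == should_be_open else "UNEXPECTED"
        print(f"port {port}: open={state} ({label})")
```

A result marked `UNEXPECTED` for 27017 or 11434 would mean a port mapping has been reintroduced, or that another process on the host is listening on that port.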
|
||||
|
||||
---
|
||||
|
||||
## MongoDB Connection Security
|
||||
|
||||
### Configuration
|
||||
|
||||
**Inside Docker Network:**
|
||||
```env
MONGODB_URI=mongodb://admin:changeme@mongodb:27017/
```
|
||||
- Uses `mongodb` (Docker service name)
|
||||
- Only works inside Docker network
|
||||
- Cannot be accessed from host
|
||||
|
||||
**Connection Flow:**
|
||||
1. Service reads `MONGODB_URI` from environment
|
||||
2. Docker DNS resolves `mongodb` to container IP
|
||||
3. Connection established via internal network
|
||||
4. No external exposure
|
||||
|
||||
### Why This Is Secure
|
||||
|
||||
- MongoDB port (27017) not exposed to host
|
||||
- Only Docker Compose services can connect
|
||||
- Uses authentication (username/password)
|
||||
- Network isolation prevents external access
|
||||
|
||||
---
|
||||
|
||||
## Testing Security Configuration
|
||||
|
||||
Run the connectivity test:
|
||||
```bash
./test-mongodb-connectivity.sh
```
|
||||
|
||||
Expected results:
|
||||
- ✅ MongoDB NOT accessible from host
|
||||
- ✅ Backend CAN connect to MongoDB
|
||||
- ✅ Crawler CAN connect to MongoDB
|
||||
- ✅ Sender CAN connect to MongoDB
|
||||
- ✅ Backend API accessible from host
|
||||
|
||||