# Munich News Daily - Automated Newsletter System
A fully automated news aggregation system that crawls Munich news sources, generates AI-powered summaries, tracks local transport disruptions, and delivers personalized daily newsletters.
## ✨ Key Features
- 🤖 AI-Powered Clustering - Detects duplicate stories and groups related articles using ChromaDB vector search (see the sketch after this list).
- 📝 Neutral Summaries - Generates balanced, multi-perspective summaries using local LLMs (Ollama).
- 🚇 Transport Updates - Real-time tracking of Munich public transport (MVG) disruptions.
- 🎯 Smart Prioritization - Ranks stories based on relevance and user preferences.
- 🎨 Personalized Newsletters - Delivers daily newsletters tailored to each subscriber's interests.
- 📊 Engagement Analytics - Detailed tracking of open rates, click-throughs, and user interests.
- ⚡ GPU Acceleration - Integrated support for NVIDIA GPUs for faster AI processing.
- 🔒 Privacy First - GDPR-compliant with automatic data retention policies and anonymization.
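
For illustration, here is a minimal sketch of how duplicate detection with ChromaDB's Python client can look. The service host, collection name, and distance threshold are assumptions, not the project's actual code:

```python
# Hypothetical duplicate check against a ChromaDB collection.
import chromadb

client = chromadb.HttpClient(host="chromadb", port=8000)  # assumed service name/port
articles = client.get_or_create_collection(name="articles")

def is_duplicate(article_id: str, text: str, threshold: float = 0.25) -> bool:
    """Return True if a semantically similar article is already indexed."""
    result = articles.query(query_texts=[text], n_results=1)
    if result["ids"][0] and result["distances"][0][0] < threshold:
        return True  # a close neighbour exists; treat as duplicate
    # New story: index it so future articles can cluster against it
    articles.add(ids=[article_id], documents=[text])
    return False
```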
## 🚀 Quick Start
For a detailed 5-minute setup guide, see QUICKSTART.md.
```bash
# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your email settings

# 2. Start everything (auto-detects GPU)
./start-with-gpu.sh

# Questions? See the logs:
docker-compose logs -f
```
Once running, the system follows a fixed daily schedule (a scheduler sketch follows this list):
- 6:00 AM: Crawl news & transport updates.
- 6:30 AM: Generate AI summaries & clusters.
- 7:00 AM: Send personalized newsletters.
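
As a rough illustration (not the services' actual code), a daily pipeline like this is often wired up with the `schedule` library; the job names and defaults below are assumptions:

```python
# Sketch of the daily schedule using the `schedule` library.
import os
import time

import schedule

def crawl_news(): ...             # 6:00 AM - fetch RSS & transport updates
def summarize_and_cluster(): ...  # 6:30 AM - AI summaries + clustering
def send_newsletters(): ...       # 7:00 AM - render & dispatch emails

schedule.every().day.at(os.getenv("CRAWLER_TIME", "06:00")).do(crawl_news)
schedule.every().day.at("06:30").do(summarize_and_cluster)
schedule.every().day.at(os.getenv("SENDER_TIME", "07:00")).do(send_newsletters)

while True:  # simple scheduler loop
    schedule.run_pending()
    time.sleep(60)
```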
## 📋 System Architecture
The system is built as a set of microservices orchestrated by Docker Compose.
```mermaid
graph TD
    User[Subscribers] -->|Email| Sender[Newsletter Sender]
    User -->|Web| Frontend[React Frontend]
    Frontend -->|API| Backend[Backend API]

    subgraph "Core Services"
        Crawler[News Crawler]
        Transport[Transport Crawler]
        Sender
        Backend
    end

    subgraph "Data & AI"
        Mongo[(MongoDB)]
        Redis[(Redis)]
        Chroma[(ChromaDB)]
        Ollama[Ollama AI]
    end

    Crawler -->|Save| Mongo
    Crawler -->|Embeddings| Chroma
    Crawler -->|Summarize| Ollama
    Transport -->|Save| Mongo
    Sender -->|Read| Mongo
    Sender -->|Track| Backend
    Backend -->|Read/Write| Mongo
    Backend -->|Cache| Redis
```
### Core Components
| Service | Description | Port |
|---|---|---|
| Frontend | React-based user dashboard and admin interface. | 3000 |
| Backend API | Flask API for tracking, analytics, and management. | 5001 |
| News Crawler | Fetches RSS feeds, extracts content, and runs AI clustering. | - |
| Transport Crawler | Monitors MVG (Munich Transport) for delays and disruptions. | - |
| Newsletter Sender | Manages subscribers, generates templates, and sends emails. | - |
| Ollama | Local LLM runner for on-premise AI (Phi-3, Llama3, etc.). | - |
| ChromaDB | Vector database for semantic search and article clustering. | - |
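
The crawler's Summarize step (the `Crawler --> Ollama` edge in the diagram) talks to Ollama over its HTTP API, which listens on port 11434 by default. A hedged sketch of such a call; the in-network host name and prompt are assumptions, while the endpoint shape and the `phi3:latest` default come from Ollama's documented API and this README:

```python
# Ask Ollama for a neutral summary via its /api/generate endpoint.
import requests

def summarize(text: str, model: str = "phi3:latest") -> str:
    resp = requests.post(
        "http://ollama:11434/api/generate",  # assumed Compose service name
        json={
            "model": model,
            "prompt": f"Write a neutral, multi-perspective summary:\n\n{text}",
            "stream": False,  # single JSON response instead of a token stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```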
## 📂 Project Structure
```
munich-news/
├── backend/            # Flask API for tracking & analytics
├── frontend/           # React dashboard & admin UI
├── news_crawler/       # RSS fetcher & AI summarizer service
├── news_sender/        # Email generation & dispatch service
├── transport_crawler/  # MVG transport disruption monitor
├── docker-compose.yml  # Main service orchestration
└── docs/               # Detailed documentation
```
## 🛠️ Installation & Setup
1. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/munich-news.git
   cd munich-news
   ```

2. **Environment Configuration**

   ```bash
   cp backend/.env.example backend/.env
   nano backend/.env
   ```

   Critical settings: `SMTP_SERVER`, `EMAIL_USER`, `EMAIL_PASSWORD`.

3. **Start the System**

   ```bash
   # Recommended: helper script (handles GPU & model setup)
   ./start-with-gpu.sh

   # Alternative: standard Docker Compose
   docker-compose up -d
   ```

4. **Initial Setup (First Run)**

   The system needs to download the AI model (approx. 2 GB). Watch the progress with:

   ```bash
   docker-compose logs -f ollama-setup
   ```
## ⚙️ Configuration
Key configuration options in `backend/.env`:

| Category | Variable | Description |
|---|---|---|
| Email | `SMTP_SERVER` | SMTP server (e.g., `smtp.gmail.com`) |
| Email | `EMAIL_USER` | Your sending email address |
| AI | `OLLAMA_MODEL` | Model to use (default: `phi3:latest`) |
| Schedule | `CRAWLER_TIME` | Time to start crawling (e.g., `"06:00"`) |
| Schedule | `SENDER_TIME` | Time to send emails (e.g., `"07:00"`) |
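
A minimal sketch of how a service can pick these settings up at startup, assuming `python-dotenv` is used; the required/optional split and defaults mirror the table above:

```python
# Load settings from backend/.env (variable names match the table above).
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv("backend/.env")

SMTP_SERVER = os.environ["SMTP_SERVER"]                  # required
EMAIL_USER = os.environ["EMAIL_USER"]                    # required
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "phi3:latest")  # documented default
CRAWLER_TIME = os.getenv("CRAWLER_TIME", "06:00")
SENDER_TIME = os.getenv("SENDER_TIME", "07:00")
```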
## 📊 Usage & Monitoring
### Access Points
- Web Dashboard: http://localhost:3000 (or configured domain)
- API: http://localhost:5001 (tracking & analytics endpoints; sketched below)
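
The engagement analytics rely on tracking endpoints served by this backend. A hypothetical sketch of what an open-tracking route can look like; the route, collection, and query parameter names are assumptions, not the real API (only the `munich_news` database name appears elsewhere in this README):

```python
# Flask route serving a 1x1 GIF beacon and logging the open event.
from datetime import datetime, timezone

from flask import Flask, Response, request
from pymongo import MongoClient

app = Flask(__name__)
db = MongoClient("mongodb://mongodb:27017")["munich_news"]

# Smallest valid transparent GIF, embedded in newsletters as an <img>.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
         b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01"
         b"\x00\x00\x02\x02D\x01\x00;")

@app.route("/track/open/<newsletter_id>")  # hypothetical route
def track_open(newsletter_id: str):
    db.opens.insert_one({
        "newsletter_id": newsletter_id,
        "subscriber": request.args.get("u"),  # assumed query parameter
        "ts": datetime.now(timezone.utc),
    })
    return Response(PIXEL, mimetype="image/gif")
```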
### Useful Commands
**View Logs**

```bash
docker-compose logs -f [service_name]
# e.g., docker-compose logs -f crawler
```
**Manual Trigger**

```bash
# Run the news crawler immediately
docker-compose exec crawler python crawler_service.py 10

# Run the transport crawler immediately
docker-compose exec transport-crawler python transport_service.py

# Send a test newsletter
docker-compose exec sender python sender_service.py test user@example.com
```
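
Under the hood, newsletter dispatch reduces to standard SMTP with the credentials from `backend/.env`. A minimal sketch with the Python standard library (the real sender renders full HTML templates, and the port here is an assumption):

```python
# Send one HTML mail via the configured SMTP account.
import os
import smtplib
from email.mime.text import MIMEText

msg = MIMEText("<h1>Munich News Daily</h1>", "html")
msg["Subject"] = "Your Munich News Daily"
msg["From"] = os.environ["EMAIL_USER"]
msg["To"] = "user@example.com"

with smtplib.SMTP(os.environ["SMTP_SERVER"], 587) as smtp:  # assumed port
    smtp.starttls()  # upgrade the connection before authenticating
    smtp.login(os.environ["EMAIL_USER"], os.environ["EMAIL_PASSWORD"])
    smtp.send_message(msg)
```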
**Database Access**

```bash
# Connect to MongoDB
docker-compose exec mongodb mongosh munich_news
```
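
The same data is reachable from Python with `pymongo`; the collection and field names below are assumptions about the schema:

```python
# Print the five most recent articles for a quick sanity check.
from pymongo import MongoClient

# localhost works if MongoDB's port is published; inside the Compose
# network the host would be "mongodb".
db = MongoClient("mongodb://localhost:27017")["munich_news"]

for article in db.articles.find().sort("published", -1).limit(5):
    print(article.get("title"), "-", article.get("source"))
```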
## 🌐 Production Deployment (Traefik)
This project is configured to work with Traefik as a reverse proxy.
The `docker-compose.yml` includes labels for:

- `news.dongho.kim` (Frontend)
- `news-api.dongho.kim` (Backend)
To use this locally, add these entries to your `/etc/hosts`:

```
127.0.0.1 news.dongho.kim news-api.dongho.kim
```
For production, ensure your Traefik proxy network is named `proxy`, or update `docker-compose.yml` accordingly.
## 🤝 Contributing
We welcome contributions! Please check CONTRIBUTING.md for guidelines.
## 📄 License
MIT License - see LICENSE for details.