# Munich News Daily - Automated Newsletter System

A fully automated news aggregation system that crawls Munich news sources, generates AI-powered summaries, tracks local transport disruptions, and delivers personalized daily newsletters.

![Munich News Daily](https://via.placeholder.com/800x400?text=Munich+News+Daily+Dashboard)

## ✨ Key Features

- **🤖 AI-Powered Clustering** - Detects duplicate stories and groups related articles using ChromaDB vector search.
- **📝 Neutral Summaries** - Generates balanced, multi-perspective summaries using local LLMs (Ollama).
- **🚇 Transport Updates** - Real-time tracking of Munich public transport (MVG) disruptions.
- **🎯 Smart Prioritization** - Ranks stories based on relevance and user preferences.
- **🎨 Personalized Newsletters** - Newsletter content tailored to each subscriber's preferences.
- **📊 Engagement Analytics** - Detailed tracking of open rates, click-throughs, and user interests.
- **⚡ GPU Acceleration** - Integrated support for NVIDIA GPUs for faster AI processing.
- **🔒 Privacy First** - GDPR-compliant with automatic data retention policies and anonymization.

## 🚀 Quick Start

For a detailed 5-minute setup guide, see [QUICKSTART.md](QUICKSTART.md).

```bash
# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your email settings

# 2. Start everything (auto-detects GPU)
./start-with-gpu.sh

# Questions? Check the logs:
docker-compose logs -f
```

The system will automatically:

1. **6:00 AM**: Crawl news & transport updates.
2. **6:30 AM**: Generate AI summaries & clusters.
3. **7:00 AM**: Send personalized newsletters.

## 📋 System Architecture

The system is built as a set of microservices orchestrated by Docker Compose.

```mermaid
graph TD
    User[Subscribers] -->|Email| Sender[Newsletter Sender]
    User -->|Web| Frontend[React Frontend]
    Frontend -->|API| Backend[Backend API]

    subgraph "Core Services"
        Crawler[News Crawler]
        Transport[Transport Crawler]
        Sender
        Backend
    end

    subgraph "Data & AI"
        Mongo[(MongoDB)]
        Redis[(Redis)]
        Chroma[(ChromaDB)]
        Ollama[Ollama AI]
    end

    Crawler -->|Save| Mongo
    Crawler -->|Embeddings| Chroma
    Crawler -->|Summarize| Ollama
    Transport -->|Save| Mongo
    Sender -->|Read| Mongo
    Sender -->|Track| Backend
    Backend -->|Read/Write| Mongo
    Backend -->|Cache| Redis
```

### Core Components

| Service | Description | Port |
|---------|-------------|------|
| **Frontend** | React-based user dashboard and admin interface. | 3000 |
| **Backend API** | Flask API for tracking, analytics, and management. | 5001 |
| **News Crawler** | Fetches RSS feeds, extracts content, and runs AI clustering. | - |
| **Transport Crawler** | Monitors MVG (Munich Transport) for delays and disruptions. | - |
| **Newsletter Sender** | Manages subscribers, generates templates, and sends emails. | - |
| **Ollama** | Local LLM runner for on-premise AI (Phi-3, Llama3, etc.). | - |
| **ChromaDB** | Vector database for semantic search and article clustering. | - |
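The clustering step relies on ChromaDB similarity search: each new article is compared against previously stored ones, and near-duplicates are grouped instead of being summarized twice. The sketch below shows how such a check could look with the ChromaDB Python client; the hostname, collection name, and distance threshold are assumptions for illustration, not values taken from the project's code.

```python
# Illustrative sketch only: host, collection name, and threshold are assumed.
import chromadb

# Connect to the ChromaDB service (hostname "chromadb" and port 8000 assumed).
client = chromadb.HttpClient(host="chromadb", port=8000)
articles = client.get_or_create_collection("articles")

def is_duplicate(article_id: str, text: str, threshold: float = 0.3) -> bool:
    """Return True if a semantically similar article is already stored."""
    hits = articles.query(query_texts=[text], n_results=1)
    if hits["ids"][0]:
        # Smaller distance means the stored article is more similar.
        if hits["distances"][0][0] < threshold:
            return True
    # Store the new article so later crawls can match against it.
    articles.add(ids=[article_id], documents=[text])
    return False
```

In the actual crawler this kind of check would run before summarization, so duplicate stories are clustered together rather than each producing its own summary.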
## 📂 Project Structure

```text
munich-news/
├── backend/              # Flask API for tracking & analytics
├── frontend/             # React dashboard & admin UI
├── news_crawler/         # RSS fetcher & AI summarizer service
├── news_sender/          # Email generation & dispatch service
├── transport_crawler/    # MVG transport disruption monitor
├── docker-compose.yml    # Main service orchestration
└── docs/                 # Detailed documentation
```

## 🛠️ Installation & Setup

1. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/munich-news.git
   cd munich-news
   ```

2. **Environment Configuration**

   ```bash
   cp backend/.env.example backend/.env
   nano backend/.env
   ```

   *Critical settings:* `SMTP_SERVER`, `EMAIL_USER`, `EMAIL_PASSWORD`.

3. **Start the System**

   ```bash
   # Recommended: helper script (handles GPU & model setup)
   ./start-with-gpu.sh

   # Alternative: standard Docker Compose
   docker-compose up -d
   ```

4. **Initial Setup (First Run)**

   * The system needs to download the AI model (approx. 2 GB).
   * Watch progress: `docker-compose logs -f ollama-setup`

## ⚙️ Configuration

Key configuration options in `backend/.env` (an illustrative example file is included at the end of this README):

| Category | Variable | Description |
|----------|----------|-------------|
| **Email** | `SMTP_SERVER` | SMTP server (e.g., smtp.gmail.com) |
| | `EMAIL_USER` | Your sending email address |
| **AI** | `OLLAMA_MODEL` | Model to use (default: phi3:latest) |
| **Schedule** | `CRAWLER_TIME` | Time to start crawling (e.g., "06:00") |
| | `SENDER_TIME` | Time to send emails (e.g., "07:00") |

## 📊 Usage & Monitoring

### Access Points

* **Web Dashboard**: [http://localhost:3000](http://localhost:3000) (or configured domain)
* **API**: [http://localhost:5001](http://localhost:5001)

### Useful Commands

**View Logs**

```bash
docker-compose logs -f [service_name]
# e.g., docker-compose logs -f crawler
```

**Manual Trigger**

```bash
# Run the news crawler immediately
docker-compose exec crawler python crawler_service.py 10

# Run the transport crawler immediately
docker-compose exec transport-crawler python transport_service.py

# Send a test newsletter
docker-compose exec sender python sender_service.py test user@example.com
```

**Database Access**

```bash
# Connect to MongoDB
docker-compose exec mongodb mongosh munich_news
```

## 🌐 Production Deployment (Traefik)

This project is configured to work with **Traefik** as a reverse proxy. The `docker-compose.yml` includes labels for:

- `news.dongho.kim` (Frontend)
- `news-api.dongho.kim` (Backend)

To use this locally, add these entries to your `/etc/hosts`:

```text
127.0.0.1 news.dongho.kim news-api.dongho.kim
```

For production, ensure your Traefik proxy network is named `proxy` or update `docker-compose.yml` accordingly.

## 🤝 Contributing

We welcome contributions! Please check [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## 📄 License

MIT License - see [LICENSE](LICENSE) for details.
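## 📎 Example Configuration

The snippet below illustrates the variables from the Configuration section with placeholder values only; it is not a working configuration. Your own `backend/.env` should use your real SMTP credentials and preferred schedule.

```bash
# backend/.env (illustrative values only)
SMTP_SERVER=smtp.gmail.com
EMAIL_USER=newsletter@example.com
EMAIL_PASSWORD=change-me
OLLAMA_MODEL=phi3:latest
CRAWLER_TIME=06:00
SENDER_TIME=07:00
```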