# Munich News Daily 📰
A TLDR/Morning Brew-style news email platform specifically for Munich. Get the latest Munich news delivered to your inbox every morning.
## Features
- 📧 Email newsletter subscription system
- 📰 Aggregated news from multiple Munich news sources
- 🎨 Beautiful, modern web interface
- 📊 Subscription statistics
- 🔄 Real-time news updates
## Tech Stack
- **Backend**: Python (Flask) - Modular architecture with blueprints
- **Frontend**: Node.js (Express + Vanilla JavaScript)
- **Database**: MongoDB
- **News Crawler**: Standalone Python microservice
- **News Sources**: RSS feeds from major Munich news outlets
## Setup Instructions
### Prerequisites
- Python 3.8+
- Node.js 14+
- npm or yarn
- MongoDB, provided by one of: Docker and Docker Compose (recommended), a local MongoDB installation, or a MongoDB Atlas account
### Backend Setup
1. Navigate to the backend directory:
```bash
cd backend
```
2. Create a virtual environment (recommended):
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Set up MongoDB using Docker Compose (recommended):
```bash
# From the project root directory
docker-compose up -d
```
This starts MongoDB in a Docker container. The database will be available at `mongodb://localhost:27017/`.
**Useful Docker commands:**
```bash
# Start MongoDB
docker-compose up -d
# Stop MongoDB
docker-compose down
# View MongoDB logs
docker-compose logs -f mongodb
# Restart MongoDB
docker-compose restart mongodb
# Remove MongoDB and all data (WARNING: deletes all data)
docker-compose down -v
```
**Alternative options:**
- **Local MongoDB**: Install MongoDB locally and make sure it's running
- **MongoDB Atlas** (Cloud): Create a free account at [mongodb.com/cloud/atlas](https://www.mongodb.com/cloud/atlas) and get your connection string
5. Create a `.env` file in the backend directory:
```bash
# Copy the template file
cp env.template .env
```
Then edit `.env` with your configuration:
```env
# MongoDB connection (default: mongodb://localhost:27017/)
# For Docker Compose (no authentication):
MONGODB_URI=mongodb://localhost:27017/
# For Docker Compose with authentication (if you modify docker-compose.yml):
# MONGODB_URI=mongodb://admin:password@localhost:27017/
# Or for MongoDB Atlas:
# MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/
# Email configuration (optional for testing)
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
EMAIL_USER=your-email@gmail.com
EMAIL_PASSWORD=your-app-password
# Ollama Configuration (for AI-powered features)
# Remote Ollama server URL
OLLAMA_BASE_URL=http://your-remote-server-ip:11434
# Optional: API key if your Ollama server requires authentication
# OLLAMA_API_KEY=your-api-key-here
# Model name to use (e.g., llama2, mistral, codellama, llama3)
OLLAMA_MODEL=llama2
# Enable/disable Ollama features (true/false)
OLLAMA_ENABLED=false
```
**Notes:**
- For Gmail, you'll need to use an [App Password](https://support.google.com/accounts/answer/185833) instead of your regular password.
- For Ollama, replace `your-remote-server-ip` with your actual server IP or domain. Set `OLLAMA_ENABLED=true` to enable AI features.
6. Run the backend server:
```bash
python app.py
```
The backend will run on `http://localhost:5001`. Port 5001 is used to avoid a conflict with AirPlay on macOS.
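As a rough illustration of how the variables from the `.env` file in step 5 might be consumed, here is a minimal sketch. The function name `load_settings` and the returned dictionary keys are hypothetical and are not taken from the project's `config.py`. It also assumes the `.env` values have already been loaded into the process environment (for example by a dotenv loader).

```python
import os


def load_settings(env=None):
    """Read the variables from the .env example, using the same defaults.

    `env` is any mapping of strings; it defaults to the process environment.
    The key names in the returned dict are illustrative only.
    """
    env = os.environ if env is None else env
    return {
        "mongodb_uri": env.get("MONGODB_URI", "mongodb://localhost:27017/"),
        "smtp_server": env.get("SMTP_SERVER", "smtp.gmail.com"),
        "smtp_port": int(env.get("SMTP_PORT", "587")),
        "email_user": env.get("EMAIL_USER", ""),
        "email_password": env.get("EMAIL_PASSWORD", ""),
        "ollama_base_url": env.get("OLLAMA_BASE_URL", ""),
        # Optional: stays None when the server needs no authentication
        "ollama_api_key": env.get("OLLAMA_API_KEY"),
        "ollama_model": env.get("OLLAMA_MODEL", "llama2"),
        # Only the string "true" (any case) enables the Ollama features
        "ollama_enabled": env.get("OLLAMA_ENABLED", "false").strip().lower() == "true",
    }
```

Passing a plain dictionary instead of relying on `os.environ` makes it easy to try different configurations without touching the real environment.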
### Frontend Setup
1. Navigate to the frontend directory:
```bash
cd frontend
```
2. Install dependencies:
```bash
npm install
```
3. Run the frontend server:
```bash
npm start
```
The frontend will run on `http://localhost:3000`.
## Usage
1. Open your browser and go to `http://localhost:3000`
2. Enter your email address to subscribe to the newsletter
3. View the latest Munich news on the homepage
4. The backend will aggregate news from multiple Munich news sources
## Sending Newsletters
To send newsletters to all subscribers, you can add a scheduled task or manually trigger the `send_newsletter()` function in `app.py`. For production, consider using:
- **Cron jobs** (Linux/Mac)
- **Task Scheduler** (Windows)
- **Celery** with Redis/RabbitMQ for more advanced scheduling
- **Cloud functions** (AWS Lambda, Google Cloud Functions)
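Example cron job to send daily at 8 AM. Cron does not activate your virtual environment, so if you use one, replace `python` with the path to the interpreter inside it: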
```
0 8 * * * cd /path/to/munich-news/backend && python -c "from app import send_newsletter; send_newsletter()"
```
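If cron is not available, the same daily schedule can be approximated in plain Python. The sketch below only computes how long to wait until the next 8:00; the helper name `seconds_until_next_run` is hypothetical, and it uses naive local datetimes, so daylight-saving transitions are not handled.

```python
from datetime import datetime, timedelta


def seconds_until_next_run(now, hour=8, minute=0):
    """Return the seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed (or is right now): use tomorrow's
        target += timedelta(days=1)
    return (target - now).total_seconds()


# Possible usage, assuming `send_newsletter` is importable from app.py
# as in the cron example:
#
#   import time
#   from app import send_newsletter
#   while True:
#       time.sleep(seconds_until_next_run(datetime.now()))
#       send_newsletter()
```

A long-running loop like this has to be kept alive by something (a service manager, a container), which is why cron or a task queue is usually the more robust choice in production.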
## Project Structure
```
munich-news/
├── backend/                        # Main API server
│   ├── app.py                      # Flask application entry point
│   ├── config.py                   # Configuration management
│   ├── database.py                 # Database connection
│   ├── routes/                     # API endpoints (blueprints)
│   ├── services/                   # Business logic
│   ├── templates/                  # Email templates
│   └── requirements.txt            # Python dependencies
├── news_crawler/                   # Crawler microservice
│   ├── crawler_service.py          # Standalone crawler
│   ├── ollama_client.py            # AI summarization client
│   ├── requirements.txt            # Crawler dependencies
│   └── README.md                   # Crawler documentation
├── news_sender/                    # Newsletter sender microservice
│   ├── sender_service.py           # Standalone email sender
│   ├── newsletter_template.html    # Email template
│   ├── requirements.txt            # Sender dependencies
│   └── README.md                   # Sender documentation
├── frontend/                       # Web interface
│   ├── server.js                   # Express server
│   ├── package.json                # Node.js dependencies
│   └── public/
│       ├── index.html              # Main page
│       ├── styles.css              # Styling
│       └── app.js                  # Frontend JavaScript
├── docker-compose.yml              # Docker Compose for MongoDB (development)
├── docker-compose.prod.yml         # Docker Compose with authentication (production)
└── README.md
```
## API Endpoints
### `POST /api/subscribe`
Subscribe to the newsletter
- Body: `{ "email": "user@example.com" }`
### `POST /api/unsubscribe`
Unsubscribe from the newsletter
- Body: `{ "email": "user@example.com" }`
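A minimal sketch of calling the two subscription endpoints from Python using only the standard library. The helper name `build_subscription_request` is hypothetical, and the base URL assumes the backend runs locally on port 5001 as described in the setup section. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # assumption: local backend from the setup steps


def build_subscription_request(email, action="subscribe"):
    """Build a POST request for /api/subscribe or /api/unsubscribe."""
    if action not in ("subscribe", "unsubscribe"):
        raise ValueError("action must be 'subscribe' or 'unsubscribe'")
    body = json.dumps({"email": email}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/{action}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send it (requires the backend to be running):
#   with urllib.request.urlopen(build_subscription_request("user@example.com")) as resp:
#       print(resp.status, resp.read().decode("utf-8"))
```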
### `GET /api/news`
Get latest Munich news articles
### `GET /api/stats`
Get subscription statistics
- Returns: `{ "subscribers": number, "articles": number, "crawled_articles": number }`
### `GET /api/news/<article_url>`
Get full article content by URL
- Returns: Full article with content, author, word count, etc.
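Because the article URL is embedded in the request path, it most likely needs to be percent-encoded by the client. The sketch below shows one way to do that; whether the backend expects a fully encoded URL is an assumption, and the helper name `article_endpoint` is hypothetical.

```python
from urllib.parse import quote


def article_endpoint(article_url, base_url="http://localhost:5001"):
    """Return the /api/news/<article_url> address with the URL percent-encoded."""
    # safe="" also encodes "/" and ":", so the article URL stays one path segment
    return f"{base_url}/api/news/{quote(article_url, safe='')}"
```

Some servers and proxies treat an encoded slash (`%2F`) in the path specially, so it is worth checking this against the running backend.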
### `GET /api/ollama/ping`
Test connection to Ollama server
- Returns: Connection status and Ollama configuration
- Response examples:
  - Success: `{ "status": "success", "message": "...", "response": "...", "ollama_config": {...} }`
  - Disabled: `{ "status": "disabled", "message": "...", "ollama_config": {...} }`
  - Error: `{ "status": "error", "message": "...", "error_details": "...", "troubleshooting": {...}, "ollama_config": {...} }`
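A client can branch on the `status` field of the ping response. The following is a small sketch that works on an already-decoded JSON payload; the function name `summarize_ping` and the wording of the returned strings are illustrative only.

```python
def summarize_ping(payload):
    """Turn a decoded /api/ollama/ping response into a one-line summary."""
    status = payload.get("status")
    message = payload.get("message", "")
    if status == "success":
        return f"ok: {message}"
    if status == "disabled":
        return f"disabled: {message}"
    if status == "error":
        details = payload.get("error_details", "")
        return f"error: {message} ({details})"
    # Anything else is outside the three documented cases
    return f"unexpected status: {status!r}"
```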
### `GET /api/ollama/models`
List available models on Ollama server
- Returns: List of available models and current configuration
- Response: `{ "status": "success", "models": [...], "current_model": "...", "ollama_config": {...} }`
### `GET /api/rss-feeds`
Get all RSS feeds
- Returns: `{ "feeds": [...] }`
### `POST /api/rss-feeds`
Add a new RSS feed
- Body: `{ "name": "Feed Name", "url": "https://example.com/rss" }`
- Returns: `{ "message": "...", "id": "..." }`
### `DELETE /api/rss-feeds/<feed_id>`
Remove an RSS feed
- Returns: `{ "message": "..." }`
### `PATCH /api/rss-feeds/<feed_id>/toggle`
Toggle RSS feed active status
- Returns: `{ "message": "...", "active": boolean }`
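The four RSS feed endpoints share one path prefix, so a single helper can build all of the requests. This is a sketch using only the standard library; `feed_request` is a hypothetical helper, `FEED_ID` is a placeholder for an `id` returned by the API, and the base URL assumes a local backend on port 5001.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # assumption: local backend from the setup steps


def feed_request(method, feed_id=None, payload=None, toggle=False):
    """Build a request for one of the /api/rss-feeds endpoints."""
    path = "/api/rss-feeds"
    if feed_id is not None:
        path += f"/{feed_id}"
    if toggle:
        path += "/toggle"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


list_feeds = feed_request("GET")
add_feed = feed_request("POST", payload={"name": "Feed Name", "url": "https://example.com/rss"})
delete_feed = feed_request("DELETE", feed_id="FEED_ID")
toggle_feed = feed_request("PATCH", feed_id="FEED_ID", toggle=True)

# Each request can be sent with urllib.request.urlopen(...) once the backend is running.
```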
## Database Schema
### Articles Collection
```javascript
{
  _id: ObjectId,
  title: String,
  link: String,            // unique
  summary: String,
  source: String,
  published_at: String,
  created_at: DateTime
}
```
### Subscribers Collection
```javascript
{
  _id: ObjectId,
  email: String,           // unique, lowercase
  subscribed_at: DateTime,
  status: String           // 'active' | 'inactive'
}
```
**Indexes:**
- `articles.link` - Unique index to prevent duplicate articles
- `articles.created_at` - For efficient sorting
- `subscribers.email` - Unique index for email lookups
- `subscribers.subscribed_at` - For analytics
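With pymongo, the four indexes listed above could be created roughly as follows. This is a sketch, not the project's `database.py`: the function name `ensure_indexes` and the database name `munich_news` in the usage comment are assumptions.

```python
def ensure_indexes(db):
    """Create the documented indexes on a pymongo Database object."""
    # Unique index: rejects a second article with the same link
    db.articles.create_index("link", unique=True)
    # Plain index used when sorting articles by creation time
    db.articles.create_index("created_at")
    # Unique index: one subscriber document per email address
    db.subscribers.create_index("email", unique=True)
    # Plain index for subscription analytics
    db.subscribers.create_index("subscribed_at")


# Possible usage (requires pymongo and a running MongoDB instance):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017/")
#   ensure_indexes(client["munich_news"])  # database name is an assumption
```

`create_index` does nothing when an identical index already exists, so calling this at every startup is safe.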
## News Crawler Microservice
The project includes a standalone crawler microservice that fetches full article content from RSS feeds.
### Running the Crawler
```bash
cd news_crawler
# Install dependencies
pip install -r requirements.txt
# Run crawler
python crawler_service.py 10
```
See `news_crawler/README.md` for detailed documentation.
### What It Does
- Crawls full article content from RSS feed links
- Extracts text, word count, and metadata
- Stores in MongoDB for AI processing
- Skips already-crawled articles
- Rate-limited (1 second between requests)
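The behaviour described above (skip known articles, wait between requests, record a word count) can be sketched as a small loop. This is an illustration only and not the code in `crawler_service.py`; `crawl_new_articles`, its parameters, and the shape of the result are hypothetical. The `fetch` callable stands in for whatever downloads and extracts the article text.

```python
import time


def crawl_new_articles(links, already_crawled, fetch, delay=1.0, sleep=time.sleep):
    """Fetch each link not seen before, pausing `delay` seconds between requests."""
    results = {}
    for link in links:
        if link in already_crawled or link in results:
            continue  # skip articles that were crawled earlier or in this run
        if results:
            sleep(delay)  # rate limit: pause between requests, not before the first
        text = fetch(link)
        results[link] = {"content": text, "word_count": len(text.split())}
    return results
```

Injecting `fetch` and `sleep` as parameters keeps the loop testable without network access or real waiting.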
## Customization
### Adding News Sources
Use the API to add RSS feeds dynamically:
```bash
curl -X POST http://localhost:5001/api/rss-feeds \
-H "Content-Type: application/json" \
-d '{"name": "Your Source Name", "url": "https://example.com/rss"}'
```
### Styling
Modify `frontend/public/styles.css` to customize the appearance.
## License
MIT
## Contributing
Feel free to submit issues and enhancement requests!