dongho/Munich-news

Dongho Kim f35f8eef8a update

2025-11-11 17:58:12 +01:00

Changelog

[Unreleased] - 2024-11-10

Added - Major Refactoring

Backend Modularization

✅ Restructured backend into modular architecture
✅ Created separate route blueprints:
- subscription_routes.py - User subscriptions
- news_routes.py - News fetching and stats
- rss_routes.py - RSS feed management (CRUD)
- ollama_routes.py - AI integration
✅ Created service layer:
- news_service.py - News fetching logic
- email_service.py - Newsletter sending
- ollama_service.py - AI communication
✅ Centralized configuration in config.py
✅ Separated database logic in database.py
✅ Reduced main app.py from 700+ lines to 27 lines

RSS Feed Management

✅ Dynamic RSS feed management via API
✅ Add/remove/list/toggle RSS feeds without code changes
✅ Unique index on RSS feed URLs (prevents duplicates)
✅ Default feeds auto-initialized on first run
✅ Created fix_duplicates.py utility script
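As a rough illustration of the add/remove/list/toggle behaviour and the unique URL constraint described above, here is a minimal in-memory sketch. `FeedStore`, `Feed`, and `DuplicateFeedError` are hypothetical names used only for this example; the project itself stores feeds in MongoDB, and its actual code may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count


class DuplicateFeedError(ValueError):
    """Raised when a feed URL is already registered (hypothetical)."""


@dataclass
class Feed:
    id: int
    name: str
    url: str
    active: bool = True
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class FeedStore:
    """In-memory stand-in for the RSS feeds collection (hypothetical)."""

    def __init__(self):
        self._feeds = {}      # feed id -> Feed
        self._ids = count(1)  # simple incrementing id source

    def add(self, name, url):
        # Mimics the unique index on the feed URL.
        if any(f.url == url for f in self._feeds.values()):
            raise DuplicateFeedError(url)
        feed = Feed(id=next(self._ids), name=name, url=url)
        self._feeds[feed.id] = feed
        return feed

    def remove(self, feed_id):
        # Returns True if a feed was removed, False if the id was unknown.
        return self._feeds.pop(feed_id, None) is not None

    def toggle(self, feed_id):
        # Flips the active flag and returns the new state.
        feed = self._feeds[feed_id]
        feed.active = not feed.active
        return feed.active

    def list_feeds(self, only_active=False):
        return [f for f in self._feeds.values() if f.active or not only_active]
```

A database-level unique index gives the same guarantee without the linear scan, and it also holds when several processes write at once, which this sketch does not attempt to model.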

News Crawler Microservice

✅ Created standalone news_crawler/ microservice
✅ Web scraping with BeautifulSoup
✅ Smart content extraction using multiple selectors
✅ Full article content storage in MongoDB
✅ Word count calculation
✅ Duplicate prevention (skips already-crawled articles)
✅ Rate limiting (1 second between requests)
✅ Can run independently or on a schedule
✅ Docker support for crawler
✅ Comprehensive documentation
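The duplicate prevention and rate limiting listed above could be combined in a loop along these lines. This is a sketch under stated assumptions, not the contents of crawler_service.py: `crawl_new_articles` and the `fetch_text` callable are hypothetical, and the actual scraping (BeautifulSoup, selectors) is left out so the example stays dependency-free.

```python
import time


def crawl_new_articles(urls, already_crawled, fetch_text,
                       delay_seconds=1.0, sleep=time.sleep):
    """Fetch each unseen URL once, pausing between requests.

    fetch_text is a hypothetical callable (url -> extracted article text);
    in a real crawler this is where the scraping would happen.
    """
    results = []
    fetched_any = False
    for url in urls:
        if url in already_crawled:
            continue  # duplicate prevention: skip already-crawled articles
        if fetched_any:
            sleep(delay_seconds)  # rate limiting between consecutive requests
        text = fetch_text(url)
        fetched_any = True
        already_crawled.add(url)
        results.append({"url": url, "text": text})
    return results
```

Passing `sleep` as a parameter is only a convenience for testing; it lets the pause be observed or disabled without waiting for real time to elapse.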

API Endpoints

New endpoints added:

GET /api/rss-feeds - List all RSS feeds
POST /api/rss-feeds - Add new RSS feed
DELETE /api/rss-feeds/<id> - Remove RSS feed
PATCH /api/rss-feeds/<id>/toggle - Toggle feed active status
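For orientation, the four endpoints above could be called from Python roughly as follows. The sketch only builds the requests and does not send them. The base URL assumes a backend on the default port on localhost, and the JSON body for POST (`name`, `url`) is an assumption based on the feed fields listed in this changelog; the helper names are hypothetical.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5001"  # assumption: backend on the default port


def list_feeds_request():
    return Request(f"{BASE_URL}/api/rss-feeds", method="GET")


def add_feed_request(name, url):
    # Assumption: the endpoint accepts a JSON body with name and url.
    body = json.dumps({"name": name, "url": url}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/rss-feeds",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def remove_feed_request(feed_id):
    return Request(f"{BASE_URL}/api/rss-feeds/{feed_id}", method="DELETE")


def toggle_feed_request(feed_id):
    return Request(f"{BASE_URL}/api/rss-feeds/{feed_id}/toggle", method="PATCH")
```

Each returned object can be passed to `urllib.request.urlopen` once the backend is running.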

Documentation

✅ Created ARCHITECTURE.md - System architecture overview
✅ Created backend/STRUCTURE.md - Backend structure guide
✅ Created news_crawler/README.md - Crawler documentation
✅ Created news_crawler/QUICKSTART.md - Quick start guide
✅ Created news_crawler/test_crawler.py - Test suite
✅ Updated main README.md with new features
✅ Updated DATABASE_SCHEMA.md with new fields

Configuration

✅ Added FLASK_PORT environment variable
✅ Fixed OLLAMA_MODEL typo in .env
✅ Default port set to 5001 to avoid the macOS AirPlay conflict on port 5000
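One way the FLASK_PORT variable might be read with the 5001 fallback is sketched below. `get_flask_port` and `DEFAULT_FLASK_PORT` are hypothetical names for illustration; the actual config.py may handle this differently.

```python
import os

DEFAULT_FLASK_PORT = 5001  # default noted in this changelog


def get_flask_port(env=None):
    """Return the port from FLASK_PORT, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("FLASK_PORT", "").strip()
    if not raw:
        return DEFAULT_FLASK_PORT
    port = int(raw)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"FLASK_PORT out of range: {port}")
    return port
```

Failing loudly on an invalid value is a design choice in this sketch; silently falling back to the default would be the alternative.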

Changed

Backend structure: Monolithic → Modular
RSS feeds: Hardcoded → Database-driven
Article storage: Summary only → Full content support
Configuration: Scattered → Centralized

Technical Improvements

Separation of concerns (routes vs services)
Better testability
Easier maintenance
Scalable architecture
Independent microservices
Proper error handling
Comprehensive logging

Database Schema Updates

Articles collection now includes:

full_content - Full article text
word_count - Number of words
crawled_at - When content was crawled

RSS Feeds collection added:

name - Feed name
url - Feed URL (unique)
active - Active status
created_at - Creation timestamp
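The three new article fields could be filled in by a helper like the one below. This is a sketch only: `build_crawl_fields` is a hypothetical name, counting words by splitting on whitespace is an assumption about how word_count is computed, and the UTC timestamp is likewise assumed.

```python
from datetime import datetime, timezone


def build_crawl_fields(text, now=None):
    """Return the fields a crawl adds to an article document."""
    now = now or datetime.now(timezone.utc)
    return {
        "full_content": text,
        "word_count": len(text.split()),  # whitespace-separated tokens
        "crawled_at": now,
    }
```

The resulting dictionary has the shape one would pass to an update on the articles collection, for example as the value of a `$set` operator in MongoDB.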

Files Added

backend/
├── config.py
├── database.py
├── fix_duplicates.py
├── STRUCTURE.md
├── routes/
│   ├── __init__.py
│   ├── subscription_routes.py
│   ├── news_routes.py
│   ├── rss_routes.py
│   └── ollama_routes.py
└── services/
    ├── __init__.py
    ├── news_service.py
    ├── email_service.py
    └── ollama_service.py

news_crawler/
├── crawler_service.py
├── test_crawler.py
├── requirements.txt
├── .gitignore
├── Dockerfile
├── docker-compose.yml
├── README.md
└── QUICKSTART.md

Root:
├── ARCHITECTURE.md
└── CHANGELOG.md

Files Removed

Old monolithic backend/app.py (replaced with modular version)

Next Steps (Future Enhancements)

Frontend UI for RSS feed management
Automatic article summarization with Ollama
Scheduled newsletter sending
Article categorization and tagging
Search functionality
User preferences (categories, frequency)
Analytics dashboard
API rate limiting
Caching layer (Redis)
Message queue for crawler (Celery)

Recent Updates (November 2025)

Security Improvements

MongoDB Internal-Only: Removed port exposure, only accessible via Docker network
Ollama Internal-Only: Removed port exposure, only accessible via Docker network
Reduced Attack Surface: Only Backend API (port 5001) exposed to host
Network Isolation: All services communicate via internal Docker network

Ollama Integration

Docker Compose Integration: Ollama service runs alongside other services
Automatic Model Download: phi3:latest model downloaded on first startup
GPU Support: NVIDIA GPU acceleration with automatic detection
Helper Scripts: start-with-gpu.sh, check-gpu.sh, configure-ollama.sh
Performance: Roughly 5-10x faster inference with GPU acceleration than on CPU alone

API Enhancements

Send Newsletter Endpoint: /api/admin/send-newsletter sends the newsletter to all active subscribers
Subscriber Status Fix: Fixed stats endpoint to correctly count active subscribers
Better Error Handling: Improved error messages and validation

Documentation

Consolidated Documentation: Moved all docs to docs/ directory
Security Guide: Comprehensive security documentation
GPU Setup Guide: Detailed GPU acceleration setup
MongoDB Connection Guide: Connection configuration explained
Subscriber Status Guide: How subscriber status system works

Configuration

MongoDB URI: Updated to use Docker service name (mongodb instead of localhost)
Ollama URL: Configured for internal Docker network (http://ollama:11434)
Single .env File: All configuration in backend/.env
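To show how the internal Ollama URL and model name might come together, the sketch below builds a request for Ollama's `/api/generate` REST endpoint without sending it. `build_generate_request` is a hypothetical helper and `OLLAMA_BASE_URL` is an assumed variable name; `OLLAMA_MODEL`, the `http://ollama:11434` address, and the `phi3:latest` model are taken from this changelog.

```python
import json
import os
from urllib.request import Request


def build_generate_request(prompt, env=None):
    """Build (but do not send) a non-streaming Ollama generate request."""
    env = os.environ if env is None else env
    # Assumed variable name; the default is the internal Docker address.
    base_url = env.get("OLLAMA_BASE_URL", "http://ollama:11434").rstrip("/")
    model = env.get("OLLAMA_MODEL", "phi3:latest")
    payload = {"model": model, "prompt": prompt, "stream": False}
    return Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Because the `ollama` hostname only resolves inside the Docker network, a request like this would succeed from the backend container but not from the host.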

Testing

Connectivity Tests: test-mongodb-connectivity.sh
Ollama Tests: test-ollama-setup.sh
Newsletter API Tests: test-newsletter-api.sh
