Changelog
[Unreleased] - 2024-11-10
Added - Major Refactoring
Backend Modularization
- ✅ Restructured backend into modular architecture
- ✅ Created separate route blueprints:
  - subscription_routes.py - User subscriptions
  - news_routes.py - News fetching and stats
  - rss_routes.py - RSS feed management (CRUD)
  - ollama_routes.py - AI integration
- ✅ Created service layer:
  - news_service.py - News fetching logic
  - email_service.py - Newsletter sending
  - ollama_service.py - AI communication
- ✅ Centralized configuration in config.py
- ✅ Separated database logic in database.py
- ✅ Reduced main app.py from 700+ lines to 27 lines
RSS Feed Management
- ✅ Dynamic RSS feed management via API
- ✅ Add/remove/list/toggle RSS feeds without code changes
- ✅ Unique index on RSS feed URLs (prevents duplicates)
- ✅ Default feeds auto-initialized on first run
- ✅ Created fix_duplicates.py utility script
News Crawler Microservice
- ✅ Created standalone news_crawler/ microservice
- ✅ Web scraping with BeautifulSoup
- ✅ Smart content extraction using multiple selectors
- ✅ Full article content storage in MongoDB
- ✅ Word count calculation
- ✅ Duplicate prevention (skips already-crawled articles)
- ✅ Rate limiting (1 second between requests)
- ✅ Can run independently or scheduled
- ✅ Docker support for crawler
- ✅ Comprehensive documentation
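The duplicate prevention, rate limiting, and word count items above can be illustrated with a minimal sketch. This is not the code in crawler_service.py: the function names, the `link` key, and the injected `fetch` and `sleep` callables are assumptions made so the example runs without a network or database.

```python
import time
from typing import Callable, Iterable


def count_words(text: str) -> int:
    """Count whitespace-separated words in the extracted article text."""
    return len(text.split())


def crawl_articles(
    urls: Iterable[str],
    already_crawled: set[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Fetch each new URL once, pausing between consecutive requests.

    `fetch` is a hypothetical callable that returns the article text for a URL.
    """
    results = []
    for url in urls:
        if url in already_crawled:
            continue  # duplicate prevention: skip already-crawled articles
        if results:
            sleep(delay_seconds)  # rate limiting: wait before each request after the first
        text = fetch(url)
        already_crawled.add(url)
        results.append(
            {"link": url, "full_content": text, "word_count": count_words(text)}
        )
    return results
```

Passing `sleep` and `fetch` as parameters keeps the loop easy to test; a real crawler would presumably pass a function that downloads and parses the page.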
API Endpoints
New endpoints added:
- GET /api/rss-feeds - List all RSS feeds
- POST /api/rss-feeds - Add new RSS feed
- DELETE /api/rss-feeds/<id> - Remove RSS feed
- PATCH /api/rss-feeds/<id>/toggle - Toggle feed active status
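As a rough illustration of how a client might call these endpoints, the sketch below builds one request per route with the standard library. The base URL (localhost on the default port 5001), the helper names, and the JSON body shape for POST (`name` and `url`) are assumptions; the changelog does not specify the request payload.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5001"  # assumption: backend reachable on the default port


def list_feeds_request() -> Request:
    """GET /api/rss-feeds"""
    return Request(f"{BASE_URL}/api/rss-feeds", method="GET")


def add_feed_request(name: str, url: str) -> Request:
    """POST /api/rss-feeds with an assumed JSON body of name and url."""
    body = json.dumps({"name": name, "url": url}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/rss-feeds",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def remove_feed_request(feed_id: str) -> Request:
    """DELETE /api/rss-feeds/<id>"""
    return Request(f"{BASE_URL}/api/rss-feeds/{feed_id}", method="DELETE")


def toggle_feed_request(feed_id: str) -> Request:
    """PATCH /api/rss-feeds/<id>/toggle"""
    return Request(f"{BASE_URL}/api/rss-feeds/{feed_id}/toggle", method="PATCH")
```

With the backend running, each request can be sent with `urllib.request.urlopen(req)`.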
Documentation
- ✅ Created ARCHITECTURE.md - System architecture overview
- ✅ Created backend/STRUCTURE.md - Backend structure guide
- ✅ Created news_crawler/README.md - Crawler documentation
- ✅ Created news_crawler/QUICKSTART.md - Quick start guide
- ✅ Created news_crawler/test_crawler.py - Test suite
- ✅ Updated main README.md with new features
- ✅ Updated DATABASE_SCHEMA.md with new fields
Configuration
- ✅ Added FLASK_PORT environment variable
- ✅ Fixed OLLAMA_MODEL typo in .env
- ✅ Port 5001 default to avoid macOS AirPlay conflict
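A minimal sketch of reading the port with the 5001 fallback, assuming the value is taken from the process environment. The helper name `get_flask_port` is hypothetical and may not match what config.py actually does.

```python
import os

DEFAULT_FLASK_PORT = 5001  # default named in the changelog; avoids the macOS AirPlay conflict


def get_flask_port(env=None) -> int:
    """Return FLASK_PORT as an int, falling back to the default when unset."""
    env = os.environ if env is None else env
    return int(env.get("FLASK_PORT", DEFAULT_FLASK_PORT))
```

Accepting an optional mapping instead of always reading `os.environ` makes the fallback easy to check in isolation.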
Changed
- Backend structure: Monolithic → Modular
- RSS feeds: Hardcoded → Database-driven
- Article storage: Summary only → Full content support
- Configuration: Scattered → Centralized
Technical Improvements
- Separation of concerns (routes vs services)
- Better testability
- Easier maintenance
- Scalable architecture
- Independent microservices
- Proper error handling
- Comprehensive logging
Database Schema Updates
Articles collection now includes:
- full_content - Full article text
- word_count - Number of words
- crawled_at - When content was crawled
RSS Feeds collection added:
- name - Feed name
- url - Feed URL (unique)
- active - Active status
- created_at - Creation timestamp
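The field lists above might be expressed as typed dictionaries like the following. The field names come from this changelog; the Python types, the class names, and the `example_feed` helper (including its `active=True` default and UTC timestamp) are assumptions for illustration only.

```python
from datetime import datetime, timezone
from typing import TypedDict


class CrawledArticleFields(TypedDict):
    """Fields added to documents in the articles collection."""
    full_content: str
    word_count: int
    crawled_at: datetime


class RssFeed(TypedDict):
    """Shape of a document in the RSS feeds collection."""
    name: str
    url: str  # covered by a unique index, so duplicates are rejected
    active: bool
    created_at: datetime


def example_feed(name: str, url: str) -> RssFeed:
    """Build a feed document; active=True on creation is an assumed default."""
    return RssFeed(
        name=name,
        url=url,
        active=True,
        created_at=datetime.now(timezone.utc),
    )
```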
Files Added
backend/
├── config.py
├── database.py
├── fix_duplicates.py
├── STRUCTURE.md
├── routes/
│   ├── __init__.py
│   ├── subscription_routes.py
│   ├── news_routes.py
│   ├── rss_routes.py
│   └── ollama_routes.py
└── services/
    ├── __init__.py
    ├── news_service.py
    ├── email_service.py
    └── ollama_service.py
news_crawler/
├── crawler_service.py
├── test_crawler.py
├── requirements.txt
├── .gitignore
├── Dockerfile
├── docker-compose.yml
├── README.md
└── QUICKSTART.md
Root:
├── ARCHITECTURE.md
└── CHANGELOG.md
Files Removed
- Old monolithic backend/app.py (replaced with modular version)
Next Steps (Future Enhancements)
- Frontend UI for RSS feed management
- Automatic article summarization with Ollama
- Scheduled newsletter sending
- Article categorization and tagging
- Search functionality
- User preferences (categories, frequency)
- Analytics dashboard
- API rate limiting
- Caching layer (Redis)
- Message queue for crawler (Celery)
Recent Updates (November 2025)
Security Improvements
- MongoDB Internal-Only: Removed port exposure, only accessible via Docker network
- Ollama Internal-Only: Removed port exposure, only accessible via Docker network
- Reduced Attack Surface: Only Backend API (port 5001) exposed to host
- Network Isolation: All services communicate via internal Docker network
Ollama Integration
- Docker Compose Integration: Ollama service runs alongside other services
- Automatic Model Download: phi3:latest model downloaded on first startup
- GPU Support: NVIDIA GPU acceleration with automatic detection
- Helper Scripts: start-with-gpu.sh, check-gpu.sh, configure-ollama.sh
- Performance: 5-10x faster with GPU acceleration
API Enhancements
- Send Newsletter Endpoint: /api/admin/send-newsletter to send to all active subscribers
- Subscriber Status Fix: Fixed stats endpoint to correctly count active subscribers
- Better Error Handling: Improved error messages and validation
Documentation
- Consolidated Documentation: Moved all docs to docs/ directory
- Security Guide: Comprehensive security documentation
- GPU Setup Guide: Detailed GPU acceleration setup
- MongoDB Connection Guide: Connection configuration explained
- Subscriber Status Guide: How subscriber status system works
Configuration
- MongoDB URI: Updated to use Docker service name (mongodb instead of localhost)
- Ollama URL: Configured for internal Docker network (http://ollama:11434)
- Single .env File: All configuration in backend/.env
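To show what pointing at Docker service names looks like from the application side, here is a minimal sketch. The environment variable names `MONGODB_URI` and `OLLAMA_BASE_URL` and the helper name are assumptions and may differ from the keys used in backend/.env; 27017 is MongoDB's standard port, and the Ollama URL is the one given above.

```python
import os


def load_service_urls(env=None) -> dict:
    """Resolve service URLs, defaulting to hostnames on the internal Docker network."""
    env = os.environ if env is None else env
    return {
        # "mongodb" is the Docker service name that replaces localhost
        "mongodb_uri": env.get("MONGODB_URI", "mongodb://mongodb:27017/"),
        # Ollama is only reachable inside the Docker network
        "ollama_base_url": env.get("OLLAMA_BASE_URL", "http://ollama:11434"),
    }
```

These hostnames resolve only from containers on the same Docker network, which is consistent with the ports no longer being exposed to the host.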
Testing
- Connectivity Tests: test-mongodb-connectivity.sh
- Ollama Tests: test-ollama-setup.sh
- Newsletter API Tests: test-newsletter-api.sh