# Changelog

## [Unreleased] - 2024-11-10

### Added - Major Refactoring

#### Backend Modularization

- ✅ Restructured backend into modular architecture
- ✅ Created separate route blueprints:
  - `subscription_routes.py` - User subscriptions
  - `news_routes.py` - News fetching and stats
  - `rss_routes.py` - RSS feed management (CRUD)
  - `ollama_routes.py` - AI integration
- ✅ Created service layer:
  - `news_service.py` - News fetching logic
  - `email_service.py` - Newsletter sending
  - `ollama_service.py` - AI communication
- ✅ Centralized configuration in `config.py`
- ✅ Separated database logic in `database.py`
- ✅ Reduced main `app.py` from 700+ lines to 27 lines

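As a rough illustration of the routes/services split, the sketch below shows how a slimmed-down `app.py` might register a blueprint that delegates to a service function. The names `create_app`, `list_feeds`, and `get_active_feeds`, the sample data, the response shape, and keeping everything in one file are assumptions made for the example; only the module names in the comments and the `/api/rss-feeds` path come from this changelog.

```python
from flask import Blueprint, Flask, jsonify


# --- service layer (hypothetical helper; a real one would live under services/) ---
def get_active_feeds():
    """Return feed records. A real service would query the database."""
    return [{"name": "Example Feed", "url": "https://example.com/rss", "active": True}]


# --- routes/rss_routes.py (sketch) ---
rss_bp = Blueprint("rss", __name__, url_prefix="/api")


@rss_bp.route("/rss-feeds", methods=["GET"])
def list_feeds():
    # The route handles HTTP concerns only and delegates to the service.
    return jsonify({"feeds": get_active_feeds()})


# --- app.py (sketch) ---
def create_app():
    """Build the Flask app and register the blueprint."""
    app = Flask(__name__)
    app.register_blueprint(rss_bp)
    return app
```

With this shape, `app.py` stays short because it only assembles blueprints, and the service function can be tested without an HTTP request.
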
#### RSS Feed Management

- ✅ Dynamic RSS feed management via API
- ✅ Add/remove/list/toggle RSS feeds without code changes
- ✅ Unique index on RSS feed URLs (prevents duplicates)
- ✅ Default feeds auto-initialized on first run
- ✅ Created `fix_duplicates.py` utility script

#### News Crawler Microservice

- ✅ Created standalone `news_crawler/` microservice
- ✅ Web scraping with BeautifulSoup
- ✅ Smart content extraction using multiple selectors
- ✅ Full article content storage in MongoDB
- ✅ Word count calculation
- ✅ Duplicate prevention (skips already-crawled articles)
- ✅ Rate limiting (1 second between requests)
- ✅ Can run independently or scheduled
- ✅ Docker support for crawler
- ✅ Comprehensive documentation

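The duplicate prevention, rate limiting, and word count items can be sketched in plain Python as follows. This is not the code in `crawler_service.py`: the `crawl` and `count_words` names, the injectable `fetch` and `sleep` callables, and the in-memory set of crawled URLs are assumptions for illustration (the real crawler scrapes pages with BeautifulSoup and stores results in MongoDB).

```python
import time


def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def crawl(urls, fetch, already_crawled, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each new URL once, pausing between consecutive requests.

    fetch: hypothetical callable mapping a URL to its article text.
    already_crawled: set of URLs crawled earlier; updated in place.
    """
    results = []
    for url in urls:
        if url in already_crawled:
            continue  # duplicate prevention: skip already-crawled articles
        if results:
            sleep(delay_seconds)  # rate limiting: wait before the next request
        text = fetch(url)
        results.append(
            {"url": url, "full_content": text, "word_count": count_words(text)}
        )
        already_crawled.add(url)
    return results
```

Passing `sleep` as a parameter keeps the one-second delay out of tests while leaving the default behaviour unchanged.
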
#### API Endpoints

New endpoints added:

- `GET /api/rss-feeds` - List all RSS feeds
- `POST /api/rss-feeds` - Add new RSS feed
- `DELETE /api/rss-feeds/<id>` - Remove RSS feed
- `PATCH /api/rss-feeds/<id>/toggle` - Toggle feed active status

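A minimal client-side sketch of calling these endpoints with the standard library is shown below. The base URL (a local backend on port 5001), the JSON request body with `name` and `url` keys, and the `FEED_ID` placeholder are assumptions; check the route handlers for the actual request and response formats.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # assumption: backend running locally on the default port


def build_request(method, path, payload=None):
    """Build (but do not send) a request, with a JSON body if payload is given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


feed_id = "FEED_ID"  # placeholder for an id returned by the list endpoint

requests_to_send = [
    build_request("GET", "/api/rss-feeds"),
    build_request(
        "POST",
        "/api/rss-feeds",
        {"name": "Example Feed", "url": "https://example.com/rss"},  # assumed body shape
    ),
    build_request("PATCH", f"/api/rss-feeds/{feed_id}/toggle"),
    build_request("DELETE", f"/api/rss-feeds/{feed_id}"),
]

# To send one of them (requires the backend to be running):
# with urllib.request.urlopen(requests_to_send[0]) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```
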
#### Documentation

- ✅ Created `ARCHITECTURE.md` - System architecture overview
- ✅ Created `backend/STRUCTURE.md` - Backend structure guide
- ✅ Created `news_crawler/README.md` - Crawler documentation
- ✅ Created `news_crawler/QUICKSTART.md` - Quick start guide
- ✅ Created `news_crawler/test_crawler.py` - Test suite
- ✅ Updated main `README.md` with new features
- ✅ Updated `DATABASE_SCHEMA.md` with new fields

#### Configuration

- ✅ Added `FLASK_PORT` environment variable
- ✅ Fixed `OLLAMA_MODEL` typo in `.env`
- ✅ Default port set to 5001 to avoid a conflict with macOS AirPlay Receiver on port 5000

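One plausible way to read `FLASK_PORT` with the 5001 default is sketched below. The helper name `get_flask_port` and the optional `environ` parameter are assumptions for illustration and may not match what `config.py` actually does.

```python
import os

DEFAULT_PORT = 5001  # default port named in this changelog


def get_flask_port(environ=None):
    """Return the port from FLASK_PORT, falling back to the default."""
    environ = os.environ if environ is None else environ
    return int(environ.get("FLASK_PORT", DEFAULT_PORT))
```

Running the backend with `FLASK_PORT=8080` set in the environment would then make this helper return `8080`.
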
### Changed

- Backend structure: Monolithic → Modular
- RSS feeds: Hardcoded → Database-driven
- Article storage: Summary only → Full content support
- Configuration: Scattered → Centralized

### Technical Improvements

- Separation of concerns (routes vs services)
- Better testability
- Easier maintenance
- Scalable architecture
- Independent microservices
- Proper error handling
- Comprehensive logging

### Database Schema Updates

Articles collection now includes:

- `full_content` - Full article text
- `word_count` - Number of words
- `crawled_at` - When content was crawled

RSS Feeds collection added:

- `name` - Feed name
- `url` - Feed URL (unique)
- `active` - Active status
- `created_at` - Creation timestamp

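For orientation, the new fields could be populated along these lines. The helper names, the UTC `datetime` timestamps, and the `active=True` default are assumptions; pre-existing article fields are omitted because this changelog does not list them.

```python
from datetime import datetime, timezone


def new_feed_document(name, url):
    """Build a document for the RSS Feeds collection (assumed defaults)."""
    return {
        "name": name,
        "url": url,  # covered by the unique index
        "active": True,  # assumption: new feeds start active
        "created_at": datetime.now(timezone.utc),
    }


def crawl_fields(full_content):
    """Build the fields added to an article once its content is crawled."""
    return {
        "full_content": full_content,
        "word_count": len(full_content.split()),
        "crawled_at": datetime.now(timezone.utc),
    }
```
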
### Files Added

```
backend/
├── config.py
├── database.py
├── fix_duplicates.py
├── STRUCTURE.md
├── routes/
│   ├── __init__.py
│   ├── subscription_routes.py
│   ├── news_routes.py
│   ├── rss_routes.py
│   └── ollama_routes.py
└── services/
    ├── __init__.py
    ├── news_service.py
    ├── email_service.py
    └── ollama_service.py

news_crawler/
├── crawler_service.py
├── test_crawler.py
├── requirements.txt
├── .gitignore
├── Dockerfile
├── docker-compose.yml
├── README.md
└── QUICKSTART.md

Root:
├── ARCHITECTURE.md
└── CHANGELOG.md
```

### Files Removed

- Old monolithic `backend/app.py` (replaced with modular version)

### Next Steps (Future Enhancements)

- [ ] Frontend UI for RSS feed management
- [ ] Automatic article summarization with Ollama
- [ ] Scheduled newsletter sending
- [ ] Article categorization and tagging
- [ ] Search functionality
- [ ] User preferences (categories, frequency)
- [ ] Analytics dashboard
- [ ] API rate limiting
- [ ] Caching layer (Redis)
- [ ] Message queue for crawler (Celery)