update
136
docs/CHANGELOG.md
Normal file
@@ -0,0 +1,136 @@
# Changelog

## [Unreleased] - 2024-11-10

### Added - Major Refactoring

#### Backend Modularization
- ✅ Restructured backend into modular architecture
- ✅ Created separate route blueprints:
  - `subscription_routes.py` - User subscriptions
  - `news_routes.py` - News fetching and stats
  - `rss_routes.py` - RSS feed management (CRUD)
  - `ollama_routes.py` - AI integration
- ✅ Created service layer:
  - `news_service.py` - News fetching logic
  - `email_service.py` - Newsletter sending
  - `ollama_service.py` - AI communication
- ✅ Centralized configuration in `config.py`
- ✅ Separated database logic in `database.py`
- ✅ Reduced main `app.py` from 700+ lines to 27 lines

#### RSS Feed Management
- ✅ Dynamic RSS feed management via API
- ✅ Add/remove/list/toggle RSS feeds without code changes
- ✅ Unique index on RSS feed URLs (prevents duplicates)
- ✅ Default feeds auto-initialized on first run
- ✅ Created `fix_duplicates.py` utility script
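
The feed operations listed above (add, remove, list, toggle, duplicate rejection) can be sketched with an in-memory stand-in. This is not the project's MongoDB code: `FeedRegistry` and its method names are hypothetical, and keying the dictionary by URL only imitates what a unique index on the feed URL would enforce.

```python
class FeedRegistry:
    """Hypothetical in-memory stand-in for the RSS feeds collection."""

    def __init__(self):
        # Keyed by URL, which mimics a unique index on the feed URL.
        self._feeds = {}

    def add(self, name, url):
        if url in self._feeds:
            # A unique index would reject the second insert in the same way.
            raise ValueError(f"feed already exists: {url}")
        self._feeds[url] = {"name": name, "url": url, "active": True}
        return self._feeds[url]

    def remove(self, url):
        # True if a feed was removed, False if the URL was unknown.
        return self._feeds.pop(url, None) is not None

    def toggle(self, url):
        # Flip the active flag and return the new state.
        feed = self._feeds[url]
        feed["active"] = not feed["active"]
        return feed["active"]

    def list_feeds(self, active_only=False):
        return [f for f in self._feeds.values() if f["active"] or not active_only]
```

Rejecting duplicates at the storage layer, rather than checking before each insert, keeps the rule in one place no matter which caller adds a feed.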

#### News Crawler Microservice
- ✅ Created standalone `news_crawler/` microservice
- ✅ Web scraping with BeautifulSoup
- ✅ Smart content extraction using multiple selectors
- ✅ Full article content storage in MongoDB
- ✅ Word count calculation
- ✅ Duplicate prevention (skips already-crawled articles)
- ✅ Rate limiting (1 second between requests)
- ✅ Can run independently or scheduled
- ✅ Docker support for crawler
- ✅ Comprehensive documentation
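
A minimal sketch of how duplicate prevention, rate limiting, and word counting could fit together in one crawl loop. It does not reproduce `crawler_service.py`: the `crawl` and `count_words` names, the `fetch` callable, and the whitespace-based word count are all assumptions made for illustration.

```python
import time


def count_words(text):
    """Whitespace-based word count (an assumption; the real rule is not shown)."""
    return len(text.split())


def crawl(urls, fetch, already_crawled, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each new URL once, pausing between consecutive requests.

    `fetch` is a hypothetical callable returning the article text for a URL.
    `already_crawled` is a set of URLs whose content is already stored.
    """
    results = []
    made_request = False
    for url in urls:
        if url in already_crawled:
            continue  # duplicate prevention: skip articles we already have
        if made_request:
            sleep(delay_seconds)  # rate limiting: wait before the next request
        text = fetch(url)
        made_request = True
        already_crawled.add(url)
        results.append(
            {"url": url, "full_content": text, "word_count": count_words(text)}
        )
    return results
```

Passing `sleep` as a parameter lets a test replace the real one-second pause with a recorder, so the loop can be checked without waiting.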

#### API Endpoints
New endpoints added:
- `GET /api/rss-feeds` - List all RSS feeds
- `POST /api/rss-feeds` - Add new RSS feed
- `DELETE /api/rss-feeds/<id>` - Remove RSS feed
- `PATCH /api/rss-feeds/<id>/toggle` - Toggle feed active status
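
As a rough client-side illustration, the four endpoints can be expressed as standard-library request objects. The host, the helper names, and the JSON body fields (`name`, `url`, taken from the RSS Feeds collection fields) are assumptions; the changelog does not document request or response bodies.

```python
import json
import urllib.request

# Assumed host; 5001 is the default port this changelog lists.
BASE_URL = "http://localhost:5001"


def list_feeds_request():
    return urllib.request.Request(f"{BASE_URL}/api/rss-feeds", method="GET")


def add_feed_request(name, url):
    # Field names are assumed from the RSS Feeds collection schema.
    body = json.dumps({"name": name, "url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/rss-feeds",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_feed_request(feed_id):
    return urllib.request.Request(
        f"{BASE_URL}/api/rss-feeds/{feed_id}", method="DELETE"
    )


def toggle_feed_request(feed_id):
    return urllib.request.Request(
        f"{BASE_URL}/api/rss-feeds/{feed_id}/toggle", method="PATCH"
    )
```

With the backend running, each request can be sent with `urllib.request.urlopen(request)`.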

#### Documentation
- ✅ Created `ARCHITECTURE.md` - System architecture overview
- ✅ Created `backend/STRUCTURE.md` - Backend structure guide
- ✅ Created `news_crawler/README.md` - Crawler documentation
- ✅ Created `news_crawler/QUICKSTART.md` - Quick start guide
- ✅ Created `news_crawler/test_crawler.py` - Test suite
- ✅ Updated main `README.md` with new features
- ✅ Updated `DATABASE_SCHEMA.md` with new fields

#### Configuration
- ✅ Added `FLASK_PORT` environment variable
- ✅ Fixed `OLLAMA_MODEL` typo in `.env`
- ✅ Changed the default port to 5001 to avoid a conflict with macOS AirPlay
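
One way `config.py` might resolve the port is sketched below. Only the `FLASK_PORT` variable and the 5001 default come from this changelog; the `get_flask_port` helper and the constant name are hypothetical.

```python
import os

# Default from this changelog; Flask's usual port 5000 can collide with
# the macOS AirPlay Receiver.
DEFAULT_FLASK_PORT = 5001


def get_flask_port(environ=os.environ):
    """Read FLASK_PORT, falling back to the default when unset or blank."""
    raw = environ.get("FLASK_PORT", "").strip()
    return int(raw) if raw else DEFAULT_FLASK_PORT
```

A non-numeric value raises `ValueError` at startup, which reports a misconfigured `.env` early instead of silently using the default.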

### Changed
- Backend structure: Monolithic → Modular
- RSS feeds: Hardcoded → Database-driven
- Article storage: Summary only → Full content support
- Configuration: Scattered → Centralized

### Technical Improvements
- Separation of concerns (routes vs services)
- Better testability
- Easier maintenance
- Scalable architecture
- Independent microservices
- Proper error handling
- Comprehensive logging

### Database Schema Updates
Articles collection now includes:
- `full_content` - Full article text
- `word_count` - Number of words
- `crawled_at` - When content was crawled

RSS Feeds collection added:
- `name` - Feed name
- `url` - Feed URL (unique)
- `active` - Active status
- `created_at` - Creation timestamp
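
For orientation, the document shapes implied by the fields above might look like the following. The builder functions are hypothetical, and the UTC timestamps, the whitespace-based word count, and the `active=True` starting value are assumptions rather than documented behavior.

```python
from datetime import datetime, timezone


def crawl_fields(full_content, now=None):
    """Fields added to an article document once its content is crawled."""
    return {
        "full_content": full_content,
        "word_count": len(full_content.split()),  # assumed counting rule
        "crawled_at": now or datetime.now(timezone.utc),
    }


def new_feed_document(name, url, now=None):
    """Shape of a document in the RSS Feeds collection."""
    return {
        "name": name,
        "url": url,  # unique across the collection
        "active": True,  # assumption: new feeds start active
        "created_at": now or datetime.now(timezone.utc),
    }
```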

### Files Added
```
backend/
├── config.py
├── database.py
├── fix_duplicates.py
├── STRUCTURE.md
├── routes/
│   ├── __init__.py
│   ├── subscription_routes.py
│   ├── news_routes.py
│   ├── rss_routes.py
│   └── ollama_routes.py
└── services/
    ├── __init__.py
    ├── news_service.py
    ├── email_service.py
    └── ollama_service.py

news_crawler/
├── crawler_service.py
├── test_crawler.py
├── requirements.txt
├── .gitignore
├── Dockerfile
├── docker-compose.yml
├── README.md
└── QUICKSTART.md

Root:
├── ARCHITECTURE.md
└── CHANGELOG.md
```

### Files Removed
- Old monolithic `backend/app.py` (replaced with modular version)

### Next Steps (Future Enhancements)
- [ ] Frontend UI for RSS feed management
- [ ] Automatic article summarization with Ollama
- [ ] Scheduled newsletter sending
- [ ] Article categorization and tagging
- [ ] Search functionality
- [ ] User preferences (categories, frequency)
- [ ] Analytics dashboard
- [ ] API rate limiting
- [ ] Caching layer (Redis)
- [ ] Message queue for crawler (Celery)