slight update
.kiro/specs/email-tracking/design.md
@@ -0,0 +1,407 @@
# Email Tracking System Design

## Overview

The email tracking system enables Munich News Daily to measure subscriber engagement through email opens and link clicks. The system uses industry-standard techniques (tracking pixels and redirect URLs) while maintaining privacy compliance and performance.

## Architecture

### High-Level Components

```
┌─────────────────────────────────────────────────────────────┐
│                    Newsletter System                        │
│                                                             │
│  ┌──────────────┐      ┌──────────────┐                     │
│  │   Sender     │─────▶│   Tracking   │                     │
│  │   Service    │      │   Generator  │                     │
│  └──────────────┘      └──────────────┘                     │
│         │                     │                             │
│         │                     ▼                             │
│         │              ┌──────────────┐                     │
│         │              │   MongoDB    │                     │
│         │              │  (tracking)  │                     │
│         │              └──────────────┘                     │
│         ▼                                                   │
│  ┌──────────────┐                                           │
│  │    Email     │                                           │
│  │   Client     │                                           │
│  └──────────────┘                                           │
└─────────────────────────────────────────────────────────────┘
                     │                    ▲
                     │                    │
                     ▼                    │
┌─────────────────────────────────────────────────────────────┐
│                    Backend API Server                       │
│                                                             │
│  ┌──────────────┐      ┌──────────────┐                     │
│  │    Pixel     │      │    Link      │                     │
│  │   Endpoint   │      │   Redirect   │                     │
│  └──────────────┘      └──────────────┘                     │
│         │                     │                             │
│         └──────────┬──────────┘                             │
│                    ▼                                        │
│             ┌──────────────┐                                │
│             │   MongoDB    │                                │
│             │  (tracking)  │                                │
│             └──────────────┘                                │
└─────────────────────────────────────────────────────────────┘
```

### Technology Stack

- **Backend**: Flask (Python) - existing backend server
- **Database**: MongoDB - existing database with new collections
- **Email**: SMTP (existing sender service)
- **Tracking**: UUID-based unique identifiers
- **Image**: 1x1 transparent PNG (base64 encoded)

## Components and Interfaces

### 1. Tracking ID Generator

**Purpose**: Generate unique tracking identifiers for emails and links

**Module**: `backend/services/tracking_service.py`

**Functions**:
```python
import uuid


def generate_tracking_id() -> str:
    """Generate a unique tracking ID using UUID4"""
    return str(uuid.uuid4())


def create_newsletter_tracking(newsletter_id: str, subscriber_email: str) -> dict:
    """Create tracking record for a newsletter send"""
    # Returns tracking document with IDs for pixel and links
```

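As a minimal sketch of what `create_newsletter_tracking` could return, the example below builds one send document shaped like the `newsletter_sends` schema plus a per-link ID map. The `article_urls` parameter and the `{"send": ..., "links": ...}` return shape are assumptions made for illustration; the design only states that the function returns a tracking document with IDs for the pixel and the links.

```python
import uuid
from datetime import datetime, timezone
from typing import Iterable


def generate_tracking_id() -> str:
    """Generate a unique tracking ID using UUID4."""
    return str(uuid.uuid4())


def create_newsletter_tracking(newsletter_id: str, subscriber_email: str,
                               article_urls: Iterable[str] = ()) -> dict:
    """Build tracking data for one newsletter send.

    `article_urls` is a hypothetical extra parameter; the design's signature
    takes only the newsletter ID and the subscriber email.
    """
    now = datetime.now(timezone.utc)
    send_doc = {
        "newsletter_id": newsletter_id,
        "subscriber_email": subscriber_email,
        "tracking_id": generate_tracking_id(),  # used in the pixel URL
        "sent_at": now,
        "opened": False,
        "first_opened_at": None,
        "last_opened_at": None,
        "open_count": 0,
        "created_at": now,
    }
    # One separate tracking ID per article link (assumed mapping: URL -> ID).
    link_ids = {url: generate_tracking_id() for url in article_urls}
    return {"send": send_doc, "links": link_ids}
```

Storing the returned documents in MongoDB is left out on purpose so the sketch stays independent of the database driver.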
### 2. Tracking Pixel Endpoint

**Purpose**: Serve 1x1 transparent PNG and log email opens

**Endpoint**: `GET /api/track/pixel/<tracking_id>`

**Flow**:
1. Receive request with tracking_id
2. Look up tracking record in database
3. Log open event (email, timestamp, user-agent)
4. On repeat opens, update the existing record (last_opened_at) instead of creating a new one
5. Return 1x1 transparent PNG image

**Response**:
- Status: 200 OK
- Content-Type: image/png
- Body: 1x1 transparent PNG (roughly 70 bytes; the often-quoted 43 bytes is the size of a 1x1 transparent GIF, and a valid PNG cannot be that small)

### 3. Link Tracking Endpoint

**Purpose**: Track link clicks and redirect to original URL

**Endpoint**: `GET /api/track/click/<tracking_id>`

**Flow**:
1. Receive request with tracking_id
2. Look up tracking record and original URL
3. Log click event (email, article_url, timestamp, user-agent)
4. Redirect to original article URL (302 redirect)
5. Handle errors gracefully (redirect to homepage if invalid)

**Response**:
- Status: 302 Found
- Location: Original article URL
- Performance: < 200ms redirect time

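The lookup-and-fallback part of this flow can be sketched as a pure function. Here `link_records` is a plain dict standing in for the `link_clicks` collection, and `HOMEPAGE_URL` is a placeholder; both are assumptions for illustration. The scheme check is an extra safeguard that the design does not mention.

```python
from urllib.parse import urlparse

HOMEPAGE_URL = "https://example.com/"  # placeholder, not the real site URL


def resolve_redirect_target(tracking_id: str, link_records: dict,
                            homepage: str = HOMEPAGE_URL) -> str:
    """Return the URL to redirect to, falling back to the homepage.

    `link_records` maps tracking_id -> record and stands in for a
    database lookup.
    """
    record = link_records.get(tracking_id)
    if not record:
        return homepage  # invalid or unknown tracking_id
    url = record.get("article_url")
    if not url:
        return homepage  # record exists but the original URL is missing
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return homepage  # refuse to redirect to non-web targets
    return url
```

In the Flask route the result would typically be passed to `flask.redirect(target, code=302)`, with the click logged before the response is returned.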
### 4. Newsletter Template Modifier

**Purpose**: Inject tracking pixel and replace article links

**Module**: `news_sender/tracking_integration.py`

**Functions**:
```python
def inject_tracking_pixel(html: str, tracking_id: str, api_url: str) -> str:
    """Inject tracking pixel before closing </body> tag"""
    pixel_url = f"{api_url}/api/track/pixel/{tracking_id}"
    pixel_html = f'<img src="{pixel_url}" width="1" height="1" alt="" />'
    if '</body>' not in html:
        # No closing tag to anchor on: append so the pixel is not silently dropped
        return html + pixel_html
    # Insert before the last </body> only, so the pixel appears exactly once
    head, tail = html.rsplit('</body>', 1)
    return f'{head}{pixel_html}</body>{tail}'


def replace_article_links(html: str, articles: list, tracking_map: dict, api_url: str) -> str:
    """Replace article links with tracking URLs"""
    # For each article link, replace with tracking URL
```

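One possible body for `replace_article_links` is sketched below. It assumes that each entry in `articles` is a dict with a `url` key and that `tracking_map` maps an article URL to its link tracking ID; neither shape is specified in the design, so treat both as hypothetical.

```python
def replace_article_links(html: str, articles: list, tracking_map: dict,
                          api_url: str) -> str:
    """Replace article links with tracking URLs.

    Assumed shapes: articles = [{"url": ...}, ...] and
    tracking_map = {article_url: tracking_id}.
    """
    for article in articles:
        url = article["url"]
        tracking_id = tracking_map.get(url)
        if tracking_id is None:
            continue  # no tracking ID for this link: leave it untouched
        tracked_url = f"{api_url}/api/track/click/{tracking_id}"
        # Match the whole href attribute so a URL that is a prefix of
        # another URL is not rewritten by accident.
        html = html.replace(f'href="{url}"', f'href="{tracked_url}"')
    return html
```

Plain string replacement keeps the sketch short; a template with single-quoted or HTML-escaped `href` values would need a real HTML parser instead.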
### 5. Analytics Service

**Purpose**: Calculate engagement metrics and identify active users

**Module**: `backend/services/analytics_service.py`

**Functions**:
```python
def get_open_rate(newsletter_id: str) -> float:
    """Calculate percentage of subscribers who opened"""


def get_click_rate(article_url: str) -> float:
    """Calculate percentage of subscribers who clicked"""


def get_subscriber_activity_status(email: str) -> str:
    """Return 'active', 'inactive', or 'dormant'"""


def update_subscriber_activity_statuses():
    """Batch update all subscriber activity statuses"""
```

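The calculations behind these functions could look like the sketch below. It works on in-memory lists instead of MongoDB queries, and the 30-day and 90-day thresholds are purely hypothetical: the design names the three statuses but does not define their boundaries.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ACTIVE_WITHIN_DAYS = 30    # hypothetical threshold
INACTIVE_WITHIN_DAYS = 90  # hypothetical threshold


def open_rate(sends: list) -> float:
    """Percentage of send records whose `opened` flag is set."""
    if not sends:
        return 0.0  # avoid division by zero for an empty newsletter
    opened = sum(1 for send in sends if send.get("opened"))
    return 100.0 * opened / len(sends)


def classify_activity(last_opened_at: Optional[datetime],
                      now: Optional[datetime] = None) -> str:
    """Return 'active', 'inactive', or 'dormant' from the last open time."""
    now = now or datetime.now(timezone.utc)
    if last_opened_at is None:
        return "dormant"  # never opened anything
    age = now - last_opened_at
    if age <= timedelta(days=ACTIVE_WITHIN_DAYS):
        return "active"
    if age <= timedelta(days=INACTIVE_WITHIN_DAYS):
        return "inactive"
    return "dormant"
```

With four sends of which two were opened, `open_rate` yields 50.0; the click rate can be computed the same way over click records.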
## Data Models

### Newsletter Sends Collection (`newsletter_sends`)

Tracks each newsletter sent to each subscriber.

```javascript
{
  _id: ObjectId,
  newsletter_id: String,       // Unique ID for this newsletter batch (date-based)
  subscriber_email: String,    // Recipient email
  tracking_id: String,         // Unique tracking ID for this send (UUID)
  sent_at: DateTime,           // When email was sent
  opened: Boolean,             // Whether email was opened
  first_opened_at: DateTime,   // First open timestamp (null if not opened)
  last_opened_at: DateTime,    // Most recent open timestamp
  open_count: Number,          // Number of times opened
  created_at: DateTime         // Record creation time
}
```

**Indexes**:
- `tracking_id` (unique) - Fast lookup for pixel requests
- `newsletter_id` - Analytics queries
- `subscriber_email` - User activity queries
- `sent_at` - Time-based queries

### Link Clicks Collection (`link_clicks`)

Tracks individual link clicks.

```javascript
{
  _id: ObjectId,
  tracking_id: String,         // Unique tracking ID for this link (UUID)
  newsletter_id: String,       // Which newsletter this link was in
  subscriber_email: String,    // Who clicked
  article_url: String,         // Original article URL
  article_title: String,       // Article title for reporting
  clicked_at: DateTime,        // When link was clicked
  user_agent: String,          // Browser/client info
  created_at: DateTime         // Record creation time
}
```

**Indexes**:
- `tracking_id` (unique) - Fast lookup for redirect requests
- `newsletter_id` - Analytics queries
- `article_url` - Article performance queries
- `subscriber_email` - User activity queries

### Subscriber Activity Collection (`subscriber_activity`)

Aggregated activity status for each subscriber.

```javascript
{
  _id: ObjectId,
  email: String,                 // Subscriber email (unique)
  status: String,                // 'active', 'inactive', or 'dormant'
  last_opened_at: DateTime,      // Most recent email open
  last_clicked_at: DateTime,     // Most recent link click
  total_opens: Number,           // Lifetime open count
  total_clicks: Number,          // Lifetime click count
  newsletters_received: Number,  // Total newsletters sent
  newsletters_opened: Number,    // Total newsletters opened
  updated_at: DateTime           // Last status update
}
```

**Indexes**:
- `email` (unique) - Fast lookup
- `status` - Filter by activity level
- `last_opened_at` - Time-based queries

## Error Handling

### Tracking Pixel Failures

- **Invalid tracking_id**: Return 1x1 transparent PNG anyway (don't break email rendering)
- **Database error**: Log error, return pixel (fail silently)
- **Multiple opens**: Update existing record, don't create duplicate

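The "update, don't duplicate" rule can be sketched as follows. A dict keyed by `tracking_id` stands in for the `newsletter_sends` collection, and the function name `record_open` is an assumption; the field names come from the schema above.

```python
from datetime import datetime, timezone
from typing import Optional


def record_open(sends_by_tracking_id: dict, tracking_id: str,
                now: Optional[datetime] = None) -> bool:
    """Register an open on the existing send record.

    Returns False for an unknown tracking_id; the caller still serves
    the pixel in that case.
    """
    now = now or datetime.now(timezone.utc)
    record = sends_by_tracking_id.get(tracking_id)
    if record is None:
        return False
    if not record.get("opened"):
        # First open: set the flag and the first-open timestamp once.
        record["opened"] = True
        record["first_opened_at"] = now
    # Every open, first or repeat, updates these two fields in place.
    record["last_opened_at"] = now
    record["open_count"] = record.get("open_count", 0) + 1
    return True
```

Because the record is mutated in place, a second open only moves `last_opened_at` and increments `open_count`; no second record is created.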
### Link Redirect Failures

- **Invalid tracking_id**: Redirect to website homepage
- **Database error**: Log error, redirect to homepage
- **Missing original URL**: Redirect to homepage

### Privacy Compliance

- **Data retention**: Anonymize tracking data after 90 days
  - Remove email addresses
  - Keep aggregated metrics
- **Opt-out**: Check subscriber preferences before tracking
- **GDPR deletion**: Provide endpoint to delete all tracking data for a user

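A minimal sketch of the retention rule, assuming records are plain dicts carrying the `subscriber_email` and `created_at` fields from the schemas above. The function name and the choice of `created_at` as the age reference are assumptions; a real implementation would issue the equivalent update against MongoDB.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def anonymize_old_records(records: list, retention_days: int = 90,
                          now: Optional[datetime] = None) -> int:
    """Blank the email on records older than the retention window.

    Counters and timestamps are left in place so aggregated metrics
    remain available. Returns the number of records anonymized.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    anonymized = 0
    for record in records:
        if record.get("subscriber_email") is None:
            continue  # already anonymized
        if record["created_at"] < cutoff:
            record["subscriber_email"] = None
            anonymized += 1
    return anonymized
```

Running the function twice over the same data returns 0 the second time, which makes it safe to trigger from a scheduled job.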
## Testing Strategy

### Unit Tests

1. **Tracking ID Generation**
   - Test UUID format
   - Test uniqueness

2. **Pixel Endpoint**
   - Test valid tracking_id returns PNG
   - Test invalid tracking_id returns PNG
   - Test database logging

3. **Link Redirect**
   - Test valid tracking_id redirects correctly
   - Test invalid tracking_id redirects to homepage
   - Test click logging

4. **Analytics Calculations**
   - Test open rate calculation
   - Test click rate calculation
   - Test activity status classification

### Integration Tests

1. **End-to-End Newsletter Flow**
   - Send newsletter with tracking
   - Simulate email open (pixel request)
   - Simulate link click
   - Verify database records

2. **Privacy Compliance**
   - Test data anonymization
   - Test user data deletion
   - Test opt-out handling

### Performance Tests

1. **Redirect Speed**
   - Measure redirect time (target: < 200ms)
   - Test under load (100 concurrent requests)

2. **Pixel Serving**
   - Test pixel response time
   - Test caching headers

## API Endpoints

### Tracking Endpoints

```
GET /api/track/pixel/<tracking_id>
  - Returns: 1x1 transparent PNG
  - Logs: Email open event

GET /api/track/click/<tracking_id>
  - Returns: 302 redirect to article URL
  - Logs: Link click event
```

### Analytics Endpoints

```
GET /api/analytics/newsletter/<newsletter_id>
  - Returns: Open rate, click rate, engagement metrics

GET /api/analytics/article/<article_id>
  - Returns: Click count, click rate for specific article

GET /api/analytics/subscriber/<email>
  - Returns: Activity status, engagement history

POST /api/analytics/update-activity
  - Triggers: Batch update of subscriber activity statuses
  - Returns: Update count
```

### Privacy Endpoints

```
DELETE /api/tracking/subscriber/<email>
  - Deletes: All tracking data for subscriber
  - Returns: Deletion confirmation

POST /api/tracking/anonymize
  - Triggers: Anonymize tracking data older than 90 days
  - Returns: Anonymization count
```

## Implementation Phases

### Phase 1: Core Tracking (MVP)
- Tracking ID generation
- Pixel endpoint
- Link redirect endpoint
- Database collections
- Newsletter template integration

### Phase 2: Analytics
- Open rate calculation
- Click rate calculation
- Activity status classification
- Analytics API endpoints

### Phase 3: Privacy & Compliance
- Data anonymization
- User data deletion
- Opt-out handling
- Privacy notices

### Phase 4: Optimization
- Caching for pixel endpoint
- Performance monitoring
- Batch processing for activity updates

## Security Considerations

1. **Rate Limiting**: Prevent abuse of tracking endpoints
2. **Input Validation**: Validate all tracking_ids (UUID format)
3. **Injection**: MongoDB does not use SQL, but query-operator injection is still possible; pass tracking_ids into queries only as validated strings, never as raw request objects
4. **Privacy**: Don't expose subscriber emails in tracking URLs
5. **HTTPS**: Ensure all tracking URLs use HTTPS in production

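The input-validation item can be sketched with the standard `uuid` module. The function name is an assumption, and the canonical-form comparison is a deliberately strict choice: `uuid.UUID` itself also accepts braces, `urn:uuid:` prefixes, and unhyphenated hex.

```python
import uuid


def is_valid_tracking_id(value) -> bool:
    """Accept only canonical, hyphenated UUID4 strings."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False  # not parseable as a UUID at all
    # Require version 4 and the exact canonical text form.
    return parsed.version == 4 and str(parsed) == value.lower()
```

Running this check before any database lookup also ensures that only a plain string ever reaches the query.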
## Configuration

Add to `backend/.env`:

```env
# Tracking Configuration
TRACKING_ENABLED=true
# Local development value; use an https:// URL in production
TRACKING_API_URL=http://localhost:5000
TRACKING_DATA_RETENTION_DAYS=90
```

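Reading these settings could look like the sketch below. The function name, the accepted truthy spellings, and the defaults used when a variable is missing are assumptions; only the three variable names come from the configuration above.

```python
import os
from typing import Mapping, Optional


def load_tracking_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Parse tracking settings from environment variables."""
    env = os.environ if env is None else env
    raw_enabled = env.get("TRACKING_ENABLED", "true")
    enabled = raw_enabled.strip().lower() in ("1", "true", "yes")
    # Strip a trailing slash so URLs can be built as f"{api_url}/api/...".
    api_url = env.get("TRACKING_API_URL", "http://localhost:5000").rstrip("/")
    retention_days = int(env.get("TRACKING_DATA_RETENTION_DAYS", "90"))
    if retention_days <= 0:
        raise ValueError("TRACKING_DATA_RETENTION_DAYS must be positive")
    return {
        "enabled": enabled,
        "api_url": api_url,
        "retention_days": retention_days,
    }
```

Passing a dict instead of relying on `os.environ` keeps the parser easy to unit test.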
## Monitoring and Metrics

### Key Metrics to Track

1. **Email Opens**
   - Overall open rate
   - Open rate by newsletter
   - Time to first open

2. **Link Clicks**
   - Overall click rate
   - Click rate by article
   - Click-through rate (CTR)

3. **Subscriber Engagement**
   - Active subscriber count
   - Inactive subscriber count
   - Dormant subscriber count

4. **System Performance**
   - Pixel response time
   - Redirect response time
   - Database query performance