408 lines
14 KiB
Markdown
408 lines
14 KiB
Markdown
# Email Tracking System Design
|
|
|
|
## Overview
|
|
|
|
The email tracking system enables Munich News Daily to measure subscriber engagement through email opens and link clicks. The system uses industry-standard techniques (tracking pixels and redirect URLs) while maintaining privacy compliance and performance.
|
|
|
|
## Architecture
|
|
|
|
### High-Level Components
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────┐
|
|
│ Newsletter System │
|
|
│ │
|
|
│ ┌──────────────┐ ┌──────────────┐ │
|
|
│ │ Sender │─────▶│ Tracking │ │
|
|
│ │ Service │ │ Generator │ │
|
|
│ └──────────────┘ └──────────────┘ │
|
|
│ │ │ │
|
|
│ │ ▼ │
|
|
│ │ ┌──────────────┐ │
|
|
│ │ │ MongoDB │ │
|
|
│ │ │ (tracking) │ │
|
|
│ │ └──────────────┘ │
|
|
│ ▼ │
|
|
│ ┌──────────────┐ │
|
|
│ │ Email │ │
|
|
│ │ Client │ │
|
|
│ └──────────────┘ │
|
|
└─────────────────────────────────────────────────────────────┘
|
|
│ ▲
|
|
│ │
|
|
▼ │
|
|
┌─────────────────────────────────────────────────────────────┐
|
|
│ Backend API Server │
|
|
│ │
|
|
│ ┌──────────────┐ ┌──────────────┐ │
|
|
│ │ Pixel │ │ Link │ │
|
|
│ │ Endpoint │ │ Redirect │ │
|
|
│ └──────────────┘ └──────────────┘ │
|
|
│ │ │ │
|
|
│ └──────────┬───────────┘ │
|
|
│ ▼ │
|
|
│ ┌──────────────┐ │
|
|
│ │ MongoDB │ │
|
|
│ │ (tracking) │ │
|
|
│ └──────────────┘ │
|
|
└─────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
### Technology Stack
|
|
|
|
- **Backend**: Flask (Python) - existing backend server
|
|
- **Database**: MongoDB - existing database with new collections
|
|
- **Email**: SMTP (existing sender service)
|
|
- **Tracking**: UUID-based unique identifiers
|
|
- **Image**: 1x1 transparent PNG (base64 encoded)
|
|
|
|
## Components and Interfaces
|
|
|
|
### 1. Tracking ID Generator
|
|
|
|
**Purpose**: Generate unique tracking identifiers for emails and links
|
|
|
|
**Module**: `backend/services/tracking_service.py`
|
|
|
|
**Functions**:
|
|
```python
|
|
def generate_tracking_id() -> str:
|
|
"""Generate a unique tracking ID using UUID4"""
|
|
return str(uuid.uuid4())
|
|
|
|
def create_newsletter_tracking(newsletter_id: str, subscriber_email: str) -> dict:
|
|
"""Create tracking record for a newsletter send"""
|
|
# Returns tracking document with IDs for pixel and links
|
|
```
|
|
|
|
### 2. Tracking Pixel Endpoint
|
|
|
|
**Purpose**: Serve 1x1 transparent PNG and log email opens
|
|
|
|
**Endpoint**: `GET /api/track/pixel/<tracking_id>`
|
|
|
|
**Flow**:
|
|
1. Receive request with tracking_id
|
|
2. Look up tracking record in database
|
|
3. Log open event (email, timestamp, user-agent)
|
|
4. Return 1x1 transparent PNG image
|
|
5. Handle multiple opens (update last_opened_at)
|
|
|
|
**Response**:
|
|
- Status: 200 OK
|
|
- Content-Type: image/png
|
|
- Body: 1x1 transparent PNG (43 bytes)
|
|
|
|
### 3. Link Tracking Endpoint
|
|
|
|
**Purpose**: Track link clicks and redirect to original URL
|
|
|
|
**Endpoint**: `GET /api/track/click/<tracking_id>`
|
|
|
|
**Flow**:
|
|
1. Receive request with tracking_id
|
|
2. Look up tracking record and original URL
|
|
3. Log click event (email, article_url, timestamp, user-agent)
|
|
4. Redirect to original article URL (302 redirect)
|
|
5. Handle errors gracefully (redirect to homepage if invalid)
|
|
|
|
**Response**:
|
|
- Status: 302 Found
|
|
- Location: Original article URL
|
|
- Performance: < 200ms redirect time
|
|
|
|
### 4. Newsletter Template Modifier
|
|
|
|
**Purpose**: Inject tracking pixel and replace article links
|
|
|
|
**Module**: `news_sender/tracking_integration.py`
|
|
|
|
**Functions**:
|
|
```python
|
|
def inject_tracking_pixel(html: str, tracking_id: str, api_url: str) -> str:
|
|
"""Inject tracking pixel before closing </body> tag"""
|
|
pixel_url = f"{api_url}/api/track/pixel/{tracking_id}"
|
|
pixel_html = f'<img src="{pixel_url}" width="1" height="1" alt="" />'
|
|
return html.replace('</body>', f'{pixel_html}</body>')
|
|
|
|
def replace_article_links(html: str, articles: list, tracking_map: dict, api_url: str) -> str:
|
|
"""Replace article links with tracking URLs"""
|
|
# For each article link, replace with tracking URL
|
|
```
|
|
|
|
### 5. Analytics Service
|
|
|
|
**Purpose**: Calculate engagement metrics and identify active users
|
|
|
|
**Module**: `backend/services/analytics_service.py`
|
|
|
|
**Functions**:
|
|
```python
|
|
def get_open_rate(newsletter_id: str) -> float:
|
|
"""Calculate percentage of subscribers who opened"""
|
|
|
|
def get_click_rate(article_url: str) -> float:
|
|
"""Calculate percentage of subscribers who clicked"""
|
|
|
|
def get_subscriber_activity_status(email: str) -> str:
|
|
"""Return 'active', 'inactive', or 'dormant'"""
|
|
|
|
def update_subscriber_activity_statuses():
|
|
"""Batch update all subscriber activity statuses"""
|
|
```
|
|
|
|
## Data Models
|
|
|
|
### Newsletter Sends Collection (`newsletter_sends`)
|
|
|
|
Tracks each newsletter sent to each subscriber.
|
|
|
|
```javascript
|
|
{
|
|
_id: ObjectId,
|
|
newsletter_id: String, // Unique ID for this newsletter batch (date-based)
|
|
subscriber_email: String, // Recipient email
|
|
tracking_id: String, // Unique tracking ID for this send (UUID)
|
|
sent_at: DateTime, // When email was sent
|
|
opened: Boolean, // Whether email was opened
|
|
first_opened_at: DateTime, // First open timestamp (null if not opened)
|
|
last_opened_at: DateTime, // Most recent open timestamp
|
|
open_count: Number, // Number of times opened
|
|
created_at: DateTime // Record creation time
|
|
}
|
|
```
|
|
|
|
**Indexes**:
|
|
- `tracking_id` (unique) - Fast lookup for pixel requests
|
|
- `newsletter_id` - Analytics queries
|
|
- `subscriber_email` - User activity queries
|
|
- `sent_at` - Time-based queries
|
|
|
|
### Link Clicks Collection (`link_clicks`)
|
|
|
|
Tracks individual link clicks.
|
|
|
|
```javascript
|
|
{
|
|
_id: ObjectId,
|
|
tracking_id: String, // Unique tracking ID for this link (UUID)
|
|
newsletter_id: String, // Which newsletter this link was in
|
|
subscriber_email: String, // Who clicked
|
|
article_url: String, // Original article URL
|
|
article_title: String, // Article title for reporting
|
|
clicked_at: DateTime, // When link was clicked
|
|
user_agent: String, // Browser/client info
|
|
created_at: DateTime // Record creation time
|
|
}
|
|
```
|
|
|
|
**Indexes**:
|
|
- `tracking_id` (unique) - Fast lookup for redirect requests
|
|
- `newsletter_id` - Analytics queries
|
|
- `article_url` - Article performance queries
|
|
- `subscriber_email` - User activity queries
|
|
|
|
### Subscriber Activity Collection (`subscriber_activity`)
|
|
|
|
Aggregated activity status for each subscriber.
|
|
|
|
```javascript
|
|
{
|
|
_id: ObjectId,
|
|
email: String, // Subscriber email (unique)
|
|
status: String, // 'active', 'inactive', or 'dormant'
|
|
last_opened_at: DateTime, // Most recent email open
|
|
last_clicked_at: DateTime, // Most recent link click
|
|
total_opens: Number, // Lifetime open count
|
|
total_clicks: Number, // Lifetime click count
|
|
newsletters_received: Number, // Total newsletters sent
|
|
newsletters_opened: Number, // Total newsletters opened
|
|
updated_at: DateTime // Last status update
|
|
}
|
|
```
|
|
|
|
**Indexes**:
|
|
- `email` (unique) - Fast lookup
|
|
- `status` - Filter by activity level
|
|
- `last_opened_at` - Time-based queries
|
|
|
|
## Error Handling
|
|
|
|
### Tracking Pixel Failures
|
|
|
|
- **Invalid tracking_id**: Return 1x1 transparent PNG anyway (don't break email rendering)
|
|
- **Database error**: Log error, return pixel (fail silently)
|
|
- **Multiple opens**: Update existing record, don't create duplicate
|
|
|
|
### Link Redirect Failures
|
|
|
|
- **Invalid tracking_id**: Redirect to website homepage
|
|
- **Database error**: Log error, redirect to homepage
|
|
- **Missing original URL**: Redirect to homepage
|
|
|
|
### Privacy Compliance
|
|
|
|
- **Data retention**: Anonymize tracking data after 90 days
|
|
- Remove email addresses
|
|
- Keep aggregated metrics
|
|
- **Opt-out**: Check subscriber preferences before tracking
|
|
- **GDPR deletion**: Provide endpoint to delete all tracking data for a user
|
|
|
|
## Testing Strategy
|
|
|
|
### Unit Tests
|
|
|
|
1. **Tracking ID Generation**
|
|
- Test UUID format
|
|
- Test uniqueness
|
|
|
|
2. **Pixel Endpoint**
|
|
- Test valid tracking_id returns PNG
|
|
- Test invalid tracking_id returns PNG
|
|
- Test database logging
|
|
|
|
3. **Link Redirect**
|
|
- Test valid tracking_id redirects correctly
|
|
- Test invalid tracking_id redirects to homepage
|
|
- Test click logging
|
|
|
|
4. **Analytics Calculations**
|
|
- Test open rate calculation
|
|
- Test click rate calculation
|
|
- Test activity status classification
|
|
|
|
### Integration Tests
|
|
|
|
1. **End-to-End Newsletter Flow**
|
|
- Send newsletter with tracking
|
|
- Simulate email open (pixel request)
|
|
- Simulate link click
|
|
- Verify database records
|
|
|
|
2. **Privacy Compliance**
|
|
- Test data anonymization
|
|
- Test user data deletion
|
|
- Test opt-out handling
|
|
|
|
### Performance Tests
|
|
|
|
1. **Redirect Speed**
|
|
- Measure redirect time (target: < 200ms)
|
|
- Test under load (100 concurrent requests)
|
|
|
|
2. **Pixel Serving**
|
|
- Test pixel response time
|
|
- Test caching headers
|
|
|
|
## API Endpoints
|
|
|
|
### Tracking Endpoints
|
|
|
|
```
|
|
GET /api/track/pixel/<tracking_id>
|
|
- Returns: 1x1 transparent PNG
|
|
- Logs: Email open event
|
|
|
|
GET /api/track/click/<tracking_id>
|
|
- Returns: 302 redirect to article URL
|
|
- Logs: Link click event
|
|
```
|
|
|
|
### Analytics Endpoints
|
|
|
|
```
|
|
GET /api/analytics/newsletter/<newsletter_id>
|
|
- Returns: Open rate, click rate, engagement metrics
|
|
|
|
GET /api/analytics/article/<article_id>
|
|
- Returns: Click count, click rate for specific article
|
|
|
|
GET /api/analytics/subscriber/<email>
|
|
- Returns: Activity status, engagement history
|
|
|
|
POST /api/analytics/update-activity
|
|
- Triggers: Batch update of subscriber activity statuses
|
|
- Returns: Update count
|
|
```
|
|
|
|
### Privacy Endpoints
|
|
|
|
```
|
|
DELETE /api/tracking/subscriber/<email>
|
|
- Deletes: All tracking data for subscriber
|
|
- Returns: Deletion confirmation
|
|
|
|
POST /api/tracking/anonymize
|
|
- Triggers: Anonymize tracking data older than 90 days
|
|
- Returns: Anonymization count
|
|
```
|
|
|
|
## Implementation Phases
|
|
|
|
### Phase 1: Core Tracking (MVP)
|
|
- Tracking ID generation
|
|
- Pixel endpoint
|
|
- Link redirect endpoint
|
|
- Database collections
|
|
- Newsletter template integration
|
|
|
|
### Phase 2: Analytics
|
|
- Open rate calculation
|
|
- Click rate calculation
|
|
- Activity status classification
|
|
- Analytics API endpoints
|
|
|
|
### Phase 3: Privacy & Compliance
|
|
- Data anonymization
|
|
- User data deletion
|
|
- Opt-out handling
|
|
- Privacy notices
|
|
|
|
### Phase 4: Optimization
|
|
- Caching for pixel endpoint
|
|
- Performance monitoring
|
|
- Batch processing for activity updates
|
|
|
|
## Security Considerations
|
|
|
|
1. **Rate Limiting**: Prevent abuse of tracking endpoints
|
|
2. **Input Validation**: Validate all tracking_ids (UUID format)
|
|
3. **SQL Injection**: Use parameterized queries (MongoDB safe by default)
|
|
4. **Privacy**: Don't expose subscriber emails in URLs
|
|
5. **HTTPS**: Ensure all tracking URLs use HTTPS in production
|
|
|
|
## Configuration
|
|
|
|
Add to `backend/.env`:
|
|
|
|
```env
|
|
# Tracking Configuration
|
|
TRACKING_ENABLED=true
|
|
TRACKING_API_URL=http://localhost:5000
|
|
TRACKING_DATA_RETENTION_DAYS=90
|
|
```
|
|
|
|
## Monitoring and Metrics
|
|
|
|
### Key Metrics to Track
|
|
|
|
1. **Email Opens**
|
|
- Overall open rate
|
|
- Open rate by newsletter
|
|
- Time to first open
|
|
|
|
2. **Link Clicks**
|
|
- Overall click rate
|
|
- Click rate by article
|
|
- Click-through rate (CTR)
|
|
|
|
3. **Subscriber Engagement**
|
|
- Active subscriber count
|
|
- Inactive subscriber count
|
|
- Dormant subscriber count
|
|
|
|
4. **System Performance**
|
|
- Pixel response time
|
|
- Redirect response time
|
|
- Database query performance
|