update

2025-11-11 14:09:21 +01:00
parent bcd0a10576
commit 1075a91eac
57 changed files with 5598 additions and 1366 deletions
--- a/news_sender/Dockerfile
+++ b/news_sender/Dockerfile
@@ -0,0 +1,24 @@
+FROM python:3.11-slim
+
+WORKDIR /app
+
+# Install dependencies
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy sender files
+COPY . .
+
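+# Copy backend files (needed for tracking and config). NOTE: COPY cannot read outside the build context, so these ../ paths are unreachable when the context is news_sender/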
+COPY ../backend/services /app/backend/services
+COPY ../backend/.env /app/.env
+
+# Make the scheduler executable
+RUN chmod +x scheduled_sender.py
+
+# Set timezone to Berlin
+ENV TZ=Europe/Berlin
+RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
+
+# Run the scheduled sender
+CMD ["python", "-u", "scheduled_sender.py"]
--- a/news_sender/README.md
+++ b/news_sender/README.md
@@ -1,303 +0,0 @@
-# News Sender Microservice
-
-Standalone service for sending Munich News Daily newsletters to subscribers.
-
-## Features
-
-- 📧 Sends beautiful HTML newsletters
-- 🤖 Uses AI-generated article summaries
-- 📊 Tracks sending statistics
-- 🧪 Test mode for development
-- 📝 Preview generation
-- 🔄 Fetches data from shared MongoDB
-
-## Installation
-
-```bash
-cd news_sender
-pip install -r requirements.txt
-```
-
-## Configuration
-
-The service uses the same `.env` file as the backend (`../backend/.env`):
-
-```env
-# MongoDB
-MONGODB_URI=mongodb://localhost:27017/
-
-# Email (Gmail example)
-SMTP_SERVER=smtp.gmail.com
-SMTP_PORT=587
-EMAIL_USER=your-email@gmail.com
-EMAIL_PASSWORD=your-app-password
-
-# Newsletter Settings (optional)
-NEWSLETTER_MAX_ARTICLES=10
-WEBSITE_URL=http://localhost:3000
-```
-
-**Gmail Setup:**
-1. Enable 2-factor authentication
-2. Generate an App Password: https://support.google.com/accounts/answer/185833
-3. Use the App Password (not your regular password)
-
-## Usage
-
-### 1. Preview Newsletter
-
-Generate HTML preview without sending:
-
-```bash
-python sender_service.py preview
-```
-
-This creates `newsletter_preview.html` - open it in your browser to see how the newsletter looks.
-
-### 2. Send Test Email
-
-Send to a single email address for testing:
-
-```bash
-python sender_service.py test your-email@example.com
-```
-
-### 3. Send to All Subscribers
-
-Send newsletter to all active subscribers:
-
-```bash
-# Send with default article count (10)
-python sender_service.py send
-
-# Send with custom article count
-python sender_service.py send 15
-```
-
-### 4. Use as Python Module
-
-```python
-from sender_service import send_newsletter, preview_newsletter
-
-# Send newsletter
-result = send_newsletter(max_articles=10)
-print(f"Sent to {result['sent_count']} subscribers")
-
-# Generate preview
-html = preview_newsletter(max_articles=5)
-```
-
-## How It Works
-
-```
-┌─────────────────────────────────────────────────────────┐
-│  1. Fetch Articles from MongoDB                         │
-│     - Get latest articles with AI summaries             │
-│     - Sort by creation date (newest first)              │
-└─────────────────────────────────────────────────────────┘
-                         ↓
-┌─────────────────────────────────────────────────────────┐
-│  2. Fetch Active Subscribers                            │
-│     - Get all subscribers with status='active'          │
-└─────────────────────────────────────────────────────────┘
-                         ↓
-┌─────────────────────────────────────────────────────────┐
-│  3. Render Newsletter HTML                              │
-│     - Load newsletter_template.html                     │
-│     - Populate with articles and metadata               │
-│     - Generate beautiful HTML email                     │
-└─────────────────────────────────────────────────────────┘
-                         ↓
-┌─────────────────────────────────────────────────────────┐
-│  4. Send Emails                                         │
-│     - Connect to SMTP server                            │
-│     - Send to each subscriber                           │
-│     - Track success/failure                             │
-└─────────────────────────────────────────────────────────┘
-                         ↓
-┌─────────────────────────────────────────────────────────┐
-│  5. Report Statistics                                   │
-│     - Total sent                                        │
-│     - Failed sends                                      │
-│     - Error details                                     │
-└─────────────────────────────────────────────────────────┘
-```
-
-## Output Example
-
-```
-======================================================================
-📧 Munich News Daily - Newsletter Sender
-======================================================================
-
-Fetching latest 10 articles with AI summaries...
-✓ Found 10 articles
-
-Fetching active subscribers...
-✓ Found 150 active subscriber(s)
-
-Rendering newsletter HTML...
-✓ Newsletter rendered
-
-Sending newsletter: 'Munich News Daily - November 10, 2024'
----------------------------------------------------------------------
-[1/150] Sending to user1@example.com... ✓
-[2/150] Sending to user2@example.com... ✓
-[3/150] Sending to user3@example.com... ✓
-...
-
-======================================================================
-📊 Sending Complete
-======================================================================
-✓ Successfully sent: 148
-✗ Failed: 2
-📰 Articles included: 10
-======================================================================
-```
-
-## Scheduling
-
-### Using Cron (Linux/Mac)
-
-Send newsletter daily at 8 AM:
-
-```bash
-# Edit crontab
-crontab -e
-
-# Add this line
-0 8 * * * cd /path/to/news_sender && /path/to/venv/bin/python sender_service.py send
-```
-
-### Using systemd Timer (Linux)
-
-Create `/etc/systemd/system/news-sender.service`:
-
-```ini
-[Unit]
-Description=Munich News Sender
-
-[Service]
-Type=oneshot
-WorkingDirectory=/path/to/news_sender
-ExecStart=/path/to/venv/bin/python sender_service.py send
-User=your-user
-```
-
-Create `/etc/systemd/system/news-sender.timer`:
-
-```ini
-[Unit]
-Description=Send Munich News Daily at 8 AM
-
-[Timer]
-OnCalendar=daily
-OnCalendar=*-*-* 08:00:00
-
-[Install]
-WantedBy=timers.target
-```
-
-Enable and start:
-
-```bash
-sudo systemctl enable news-sender.timer
-sudo systemctl start news-sender.timer
-```
-
-### Using Docker
-
-Create `Dockerfile`:
-
-```dockerfile
-FROM python:3.11-slim
-
-WORKDIR /app
-
-COPY requirements.txt .
-RUN pip install --no-cache-dir -r requirements.txt
-
-COPY sender_service.py newsletter_template.html ./
-
-CMD ["python", "sender_service.py", "send"]
-```
-
-Build and run:
-
-```bash
-docker build -t news-sender .
-docker run --env-file ../backend/.env news-sender
-```
-
-## Troubleshooting
-
-### "Email credentials not configured"
-- Check that `EMAIL_USER` and `EMAIL_PASSWORD` are set in `.env`
-- For Gmail, use an App Password, not your regular password
-
-### "No articles with summaries found"
-- Run the crawler first: `cd ../news_crawler && python crawler_service.py 10`
-- Make sure Ollama is enabled and working
-- Check MongoDB has articles with `summary` field
-
-### "No active subscribers found"
-- Add subscribers via the backend API
-- Check subscriber status is 'active' in MongoDB
-
-### SMTP Connection Errors
-- Verify SMTP server and port are correct
-- Check firewall isn't blocking SMTP port
-- For Gmail, ensure "Less secure app access" is enabled or use App Password
-
-### Emails Going to Spam
-- Set up SPF, DKIM, and DMARC records for your domain
-- Use a verified email address
-- Avoid spam trigger words in subject/content
-- Include unsubscribe link (already included in template)
-
-## Architecture
-
-This is a standalone microservice that:
-- Runs independently of the backend
-- Shares the same MongoDB database
-- Can be deployed separately
-- Can be scheduled independently
-- Has no dependencies on backend code
-
-## Integration with Other Services
-
-```
-┌──────────────┐     ┌──────────────┐     ┌──────────────┐
-│   Backend    │     │   Crawler    │     │    Sender    │
-│   (Flask)    │     │  (Scraper)   │     │   (Email)    │
-└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
-       │                    │                     │
-       │                    │                     │
-       └────────────────────┴─────────────────────┘
-                            │
-                    ┌───────▼────────┐
-                    │    MongoDB     │
-                    │  (Shared DB)   │
-                    └────────────────┘
-```
-
-## Next Steps
-
-1. **Test the newsletter:**
-   ```bash
-   python sender_service.py test your-email@example.com
-   ```
-
-2. **Schedule daily sending:**
-   - Set up cron job or systemd timer
-   - Choose appropriate time (e.g., 8 AM)
-
-3. **Monitor sending:**
-   - Check logs for errors
-   - Track open rates (requires email tracking service)
-   - Monitor spam complaints
-
-4. **Optimize:**
-   - Add email tracking pixels
-   - A/B test subject lines
-   - Personalize content per subscriber
--- a/news_sender/newsletter_template.html
+++ b/news_sender/newsletter_template.html
@@ -146,6 +146,14 @@
                                <a href="{{ unsubscribe_link }}" style="color: #999999; text-decoration: none;">Unsubscribe</a>
                            </p>
                            
+                            {% if tracking_enabled %}
+                            <!-- Privacy Notice -->
+                            <p style="margin: 20px 0 0 0; font-size: 11px; color: #666666; line-height: 1.4;">
+                                This email contains tracking to measure engagement and improve our content.<br>
+                                We respect your privacy and anonymize data after 90 days.
+                            </p>
+                            {% endif %}
+                            
                            <p style="margin: 20px 0 0 0; font-size: 11px; color: #666666;">
                                © {{ year }} Munich News Daily. All rights reserved.
                            </p>
--- a/news_sender/requirements.txt
+++ b/news_sender/requirements.txt
@@ -1,3 +1,6 @@
 pymongo==4.6.1
 python-dotenv==1.0.0
 Jinja2==3.1.2
+beautifulsoup4==4.12.2
+schedule==1.2.0
+pytz==2023.3
--- a/news_sender/scheduled_sender.py
+++ b/news_sender/scheduled_sender.py
@@ -0,0 +1,178 @@
+#!/usr/bin/env python3
+"""
+Scheduled newsletter sender that runs daily at 7 AM Berlin time.
+Waits for the crawler to finish before sending so the newsletter has fresh content.
+"""
+import schedule
+import time
+from datetime import datetime
+import pytz
+from pathlib import Path
+import sys
+
+# Add current directory to path
+sys.path.insert(0, str(Path(__file__).parent))
+
+from sender_service import send_newsletter, get_latest_articles, Config
+
+# Berlin timezone
+BERLIN_TZ = pytz.timezone('Europe/Berlin')
+
+# Maximum time to wait for crawler (in minutes)
+MAX_WAIT_TIME = 30
+
+def check_crawler_finished():
+    """
+    Check if crawler has finished by looking for recent articles
+    Returns: (bool, str) - (is_finished, message)
+    """
+    try:
+        # Check for articles crawled within the last 2 hours
+        articles = get_latest_articles(max_articles=1, hours=2)
+        
+        if articles:
+            # Check if the most recent article was crawled recently (within last 2 hours)
+            latest_article = articles[0]
+            crawled_at = latest_article.get('crawled_at')
+            
+            if crawled_at:
+                time_since_crawl = datetime.utcnow() - crawled_at
+                minutes_since = time_since_crawl.total_seconds() / 60
+                
+                if minutes_since < 120:  # Within last 2 hours
+                    return True, f"Crawler finished {int(minutes_since)} minutes ago"
+        
+        return False, "No recent articles found"
+        
+    except Exception as e:
+        return False, f"Error checking crawler status: {e}"
+
+
+def wait_for_crawler(max_wait_minutes=30):
+    """
+    Wait for crawler to finish before sending newsletter
+    
+    Args:
+        max_wait_minutes: Maximum time to wait in minutes
+        
+    Returns:
+        bool: True if crawler finished, False if timeout
+    """
+    berlin_time = datetime.now(BERLIN_TZ)
+    print(f"\n⏳ Waiting for crawler to finish...")
+    print(f"   Current time: {berlin_time.strftime('%H:%M:%S %Z')}")
+    print(f"   Max wait time: {max_wait_minutes} minutes")
+    
+    start_time = time.time()
+    check_interval = 30  # Check every 30 seconds
+    
+    while True:
+        elapsed_minutes = (time.time() - start_time) / 60
+        
+        # Check if crawler finished
+        is_finished, message = check_crawler_finished()
+        
+        if is_finished:
+            print(f"   ✓ {message}")
+            return True
+        
+        # Check if we've exceeded max wait time
+        if elapsed_minutes >= max_wait_minutes:
+            print(f"   ⚠ Timeout after {max_wait_minutes} minutes")
+            print(f"   Proceeding with available articles...")
+            return False
+        
+        # Show progress
+        remaining = max_wait_minutes - elapsed_minutes
+        print(f"   ⏳ Still waiting... ({remaining:.1f} minutes remaining) - {message}")
+        
+        # Wait before next check
+        time.sleep(check_interval)
+
+
+def run_sender():
+    """Run the newsletter sender with crawler coordination"""
+    berlin_time = datetime.now(BERLIN_TZ)
+    print(f"\n{'='*70}")
+    print(f"📧 Scheduled newsletter sender started")
+    print(f"   Time: {berlin_time.strftime('%Y-%m-%d %H:%M:%S %Z')}")
+    print(f"{'='*70}\n")
+    
+    try:
+        # Wait for crawler to finish (up to MAX_WAIT_TIME minutes)
+        crawler_finished = wait_for_crawler(max_wait_minutes=MAX_WAIT_TIME)
+        
+        if not crawler_finished:
+            print(f"\n⚠ Crawler may still be running, but proceeding anyway...")
+        
+        print(f"\n{'='*70}")
+        print(f"📧 Starting newsletter send...")
+        print(f"{'='*70}\n")
+        
+        # Send newsletter to all subscribers
+        result = send_newsletter(max_articles=Config.MAX_ARTICLES)
+        
+        if result['success']:
+            print(f"\n{'='*70}")
+            print(f"✅ Newsletter sent successfully!")
+            print(f"   Sent: {result['sent_count']}/{result['total_subscribers']}")
+            print(f"   Articles: {result['article_count']}")
+            print(f"   Failed: {result['failed_count']}")
+            print(f"{'='*70}\n")
+        else:
+            print(f"\n{'='*70}")
+            print(f"❌ Newsletter send failed: {result.get('error', 'Unknown error')}")
+            print(f"{'='*70}\n")
+        
+    except Exception as e:
+        print(f"\n{'='*70}")
+        print(f"❌ Scheduled sender error: {e}")
+        print(f"{'='*70}\n")
+        import traceback
+        traceback.print_exc()
+
+
+def main():
+    """Main scheduler loop"""
+    print("📧 Munich News Newsletter Scheduler")
+    print("="*70)
+    print("Schedule: Daily at 7:00 AM Berlin time")
+    print("Timezone: Europe/Berlin (CET/CEST)")
+    print("Coordination: Waits for crawler to finish (max 30 min)")
+    print("="*70)
+    
+    # Schedule at 7 AM Berlin time (explicit tz; without it, schedule uses the host's local time)
+    schedule.every().day.at("07:00", BERLIN_TZ).do(run_sender)
+    
+    # Show next run time
+    berlin_time = datetime.now(BERLIN_TZ)
+    print(f"\nCurrent time (Berlin): {berlin_time.strftime('%Y-%m-%d %H:%M:%S %Z')}")
+    
+    # Get next scheduled run
+    next_run = schedule.next_run()
+    if next_run:
+        # Convert to Berlin time for display
+        next_run_berlin = next_run.astimezone(BERLIN_TZ)
+        print(f"Next scheduled run: {next_run_berlin.strftime('%Y-%m-%d %H:%M:%S %Z')}")
+    
+    print("\n⏳ Scheduler is running... (Press Ctrl+C to stop)\n")
+    
+    # Optional: Run immediately on startup (comment out if you don't want this)
+    # print("🚀 Running initial send on startup...")
+    # run_sender()
+    
+    # Keep the scheduler running
+    while True:
+        schedule.run_pending()
+        time.sleep(60)  # Check every minute
+
+
+if __name__ == '__main__':
+    try:
+        main()
+    except KeyboardInterrupt:
+        print("\n\n👋 Scheduler stopped by user")
+    except Exception as e:
+        print(f"\n\n❌ Scheduler error: {e}")
+        import traceback
+        traceback.print_exc()
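
Review note on `wait_for_crawler`: the polling loop calls `time.time()` and `time.sleep()` directly, which makes the timeout path slow to exercise in tests. A minimal sketch of the same poll-until-timeout logic with the clock and sleep function passed in as parameters. The helper name `wait_until` and its parameters are hypothetical and not part of this commit; `check` is assumed to return `(is_finished, message)` like `check_crawler_finished()`.

```python
import time
from typing import Callable, Tuple


def wait_until(
    check: Callable[[], Tuple[bool, str]],
    max_wait_seconds: float,
    interval_seconds: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, str]:
    """Poll check() until it reports success or the time budget is used up.

    Hypothetical helper (not in the commit). Returns the last
    (is_finished, message) pair produced by check().
    """
    deadline = clock() + max_wait_seconds
    while True:
        finished, message = check()
        if finished:
            return True, message
        remaining = deadline - clock()
        if remaining <= 0:
            # Budget exhausted: report the timeout with the last message.
            return False, message
        # Never sleep past the deadline.
        sleep(min(interval_seconds, remaining))
```

With the defaults it behaves like the committed loop (check first, then sleep), but a test can supply a fake clock and a fake sleep so the 30-minute timeout is covered without waiting.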
--- a/news_sender/sender_service.py
+++ b/news_sender/sender_service.py
@@ -11,8 +11,17 @@ from pathlib import Path
 from jinja2 import Template
 from pymongo import MongoClient
 import os
+import sys
 from dotenv import load_dotenv

+# Add backend directory to path for importing tracking service
+backend_dir = Path(__file__).parent.parent / 'backend'
+sys.path.insert(0, str(backend_dir))
+
+# Import tracking modules
+from services import tracking_service
+from tracking_integration import inject_tracking_pixel, replace_article_links, generate_tracking_urls
+
 # Load environment variables from backend/.env
 backend_dir = Path(__file__).parent.parent / 'backend'
 env_path = backend_dir / '.env'
@@ -40,6 +49,11 @@ class Config:
    MAX_ARTICLES = int(os.getenv('NEWSLETTER_MAX_ARTICLES', '10'))
    HOURS_LOOKBACK = int(os.getenv('NEWSLETTER_HOURS_LOOKBACK', '24'))
    WEBSITE_URL = os.getenv('WEBSITE_URL', 'http://localhost:3000')
+    
+    # Tracking
+    TRACKING_ENABLED = os.getenv('TRACKING_ENABLED', 'true').lower() == 'true'
+    TRACKING_API_URL = os.getenv('TRACKING_API_URL', 'http://localhost:5001')
+    TRACKING_DATA_RETENTION_DAYS = int(os.getenv('TRACKING_DATA_RETENTION_DAYS', '90'))


 # MongoDB connection
@@ -117,15 +131,20 @@ def get_active_subscribers():
    return [doc['email'] for doc in cursor]


-def render_newsletter_html(articles):
+def render_newsletter_html(articles, tracking_enabled=False, pixel_tracking_id=None, 
+                          link_tracking_map=None, api_url=None):
    """
-    Render newsletter HTML from template
+    Render newsletter HTML from template with optional tracking integration
    
    Args:
        articles: List of article dictionaries
+        tracking_enabled: Whether to inject tracking pixel and replace links
+        pixel_tracking_id: Tracking ID for the email open pixel
+        link_tracking_map: Dictionary mapping original URLs to tracking IDs
+        api_url: Base URL for the tracking API
        
    Returns:
-        str: Rendered HTML content
+        str: Rendered HTML content with tracking injected if enabled
    """
    # Load template
    template_path = Path(__file__).parent / 'newsletter_template.html'
@@ -142,11 +161,23 @@ def render_newsletter_html(articles):
        'article_count': len(articles),
        'articles': articles,
        'unsubscribe_link': f'{Config.WEBSITE_URL}/unsubscribe',
-        'website_link': Config.WEBSITE_URL
+        'website_link': Config.WEBSITE_URL,
+        'tracking_enabled': tracking_enabled
    }
    
    # Render HTML
-    return template.render(**template_data)
+    html = template.render(**template_data)
+    
+    # Inject tracking if enabled
+    if tracking_enabled and pixel_tracking_id and api_url:
+        # Inject tracking pixel
+        html = inject_tracking_pixel(html, pixel_tracking_id, api_url)
+        
+        # Replace article links with tracking URLs
+        if link_tracking_map:
+            html = replace_article_links(html, link_tracking_map, api_url)
+    
+    return html


 def send_email(to_email, subject, html_content):
@@ -246,14 +277,14 @@ def send_newsletter(max_articles=None, test_email=None):
            'error': 'No active subscribers'
        }
    
-    # Render newsletter
-    print("\nRendering newsletter HTML...")
-    html_content = render_newsletter_html(articles)
-    print("✓ Newsletter rendered")
+    # Generate newsletter ID (date-based)
+    newsletter_id = f"newsletter-{datetime.now().strftime('%Y-%m-%d')}"
    
    # Send to subscribers
    subject = f"Munich News Daily - {datetime.now().strftime('%B %d, %Y')}"
    print(f"\nSending newsletter: '{subject}'")
+    print(f"Newsletter ID: {newsletter_id}")
+    print(f"Tracking enabled: {Config.TRACKING_ENABLED}")
    print("-" * 70)
    
    sent_count = 0
@@ -262,6 +293,34 @@ def send_newsletter(max_articles=None, test_email=None):
    
    for i, email in enumerate(subscribers, 1):
        print(f"[{i}/{len(subscribers)}] Sending to {email}...", end=' ')
+        
+        # Generate tracking data for this subscriber if tracking is enabled
+        if Config.TRACKING_ENABLED:
+            try:
+                tracking_data = generate_tracking_urls(
+                    articles=articles,
+                    newsletter_id=newsletter_id,
+                    subscriber_email=email,
+                    tracking_service=tracking_service
+                )
+                
+                # Render newsletter with tracking
+                html_content = render_newsletter_html(
+                    articles=articles,
+                    tracking_enabled=tracking_data.get('tracking_enabled', True),
+                    pixel_tracking_id=tracking_data['pixel_tracking_id'],
+                    link_tracking_map=tracking_data['link_tracking_map'],
+                    api_url=Config.TRACKING_API_URL
+                )
+            except Exception as e:
+                print(f"⚠ Tracking error: {e}, sending without tracking...", end=' ')
+                # Fallback: send without tracking
+                html_content = render_newsletter_html(articles)
+        else:
+            # Render newsletter without tracking
+            html_content = render_newsletter_html(articles)
+        
+        # Send email
        success, error = send_email(email, subject, html_content)
        
        if success:
@@ -310,12 +369,11 @@ def preview_newsletter(max_articles=None, hours=None):
        today_date = datetime.now().strftime('%B %d, %Y')
        return f"<h1>No articles from today found</h1><p>No articles published today ({today_date}). Run the crawler with Ollama enabled to get fresh content.</p>"
    
-    return render_newsletter_html(articles)
+    # Preview without tracking
+    return render_newsletter_html(articles, tracking_enabled=False)


 if __name__ == '__main__':
-    import sys
-    
    # Parse command line arguments
    if len(sys.argv) > 1:
        command = sys.argv[1]
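
Review note on the `Config` hunk: `TRACKING_ENABLED` is true only when the variable is exactly `true` (case-insensitive), so a value such as `1` or `yes` silently turns tracking off. A minimal sketch of a stricter parser, assuming a hypothetical helper `env_flag` that is not part of this commit; the accepted spellings are an illustrative choice.

```python
import os
from typing import Mapping, Optional

# Accepted spellings are an assumption for illustration only.
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def env_flag(name: str, default: bool,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean environment variable (hypothetical helper).

    Returns `default` when the variable is unset or blank, and raises
    ValueError on an unrecognised value instead of treating it as False.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"{name} must be a boolean flag, got {raw!r}")
```

In `Config` this could be used as `TRACKING_ENABLED = env_flag('TRACKING_ENABLED', True)`, which keeps the current default while failing loudly on typos.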
--- a/news_sender/tracking_integration.py
+++ b/news_sender/tracking_integration.py
@@ -0,0 +1,150 @@
+"""
+Tracking integration module for Munich News Daily newsletter system.
+Handles injection of tracking pixels and replacement of article links with tracking URLs.
+"""
+
+from typing import Dict, List
+from bs4 import BeautifulSoup
+
+
+def inject_tracking_pixel(html: str, tracking_id: str, api_url: str) -> str:
+    """
+    Inject tracking pixel into newsletter HTML before closing </body> tag.
+    
+    The tracking pixel is a 1x1 transparent image that loads when the email is opened,
+    allowing us to track email opens.
+    
+    Args:
+        html: Original newsletter HTML content
+        tracking_id: Unique tracking ID for this newsletter send (None if tracking disabled)
+        api_url: Base URL for the tracking API (e.g., http://localhost:5001)
+        
+    Returns:
+        str: HTML with tracking pixel injected (unchanged if tracking_id is None)
+        
+    Example:
+        >>> html = '<html><body><p>Content</p></body></html>'
+        >>> inject_tracking_pixel(html, 'abc-123', 'http://api.example.com')
+        '<html><body><p>Content</p><img src="http://api.example.com/api/track/pixel/abc-123" width="1" height="1" alt="" style="display:block;" /></body></html>'
+    """
+    # Skip tracking if no tracking_id provided (subscriber opted out)
+    if not tracking_id:
+        return html
+    
+    # Construct tracking pixel URL
+    pixel_url = f"{api_url}/api/track/pixel/{tracking_id}"
+    
+    # Create tracking pixel HTML
+    pixel_html = f'<img src="{pixel_url}" width="1" height="1" alt="" style="display:block;" />'
+    
+    # Inject pixel before closing </body> tag
+    if '</body>' in html:
+        html = html.replace('</body>', f'{pixel_html}</body>')
+    else:
+        # Fallback: append to end if no </body> tag found
+        html += pixel_html
+    
+    return html
+
+
+def replace_article_links(
+    html: str,
+    link_tracking_map: Dict[str, str],
+    api_url: str
+) -> str:
+    """
+    Replace article links in newsletter HTML with tracking URLs.
+    
+    Finds all article links in the HTML and replaces them with tracking redirect URLs
+    that log clicks before redirecting to the original article.
+    
+    Args:
+        html: Original newsletter HTML content
+        link_tracking_map: Dictionary mapping original URLs to tracking IDs (empty if tracking disabled)
+        api_url: Base URL for the tracking API (e.g., http://localhost:5001)
+        
+    Returns:
+        str: HTML with article links replaced by tracking URLs (unchanged if map is empty)
+        
+    Example:
+        >>> html = '<a href="https://example.com/article">Read</a>'
+        >>> mapping = {'https://example.com/article': 'track-123'}
+        >>> replace_article_links(html, mapping, 'http://api.example.com')
+        '<a href="http://api.example.com/api/track/click/track-123">Read</a>'
+    """
+    # Skip tracking if no tracking map provided (subscriber opted out)
+    if not link_tracking_map:
+        return html
+    
+    # Parse HTML with BeautifulSoup
+    soup = BeautifulSoup(html, 'html.parser')
+    
+    # Find all <a> tags with href attributes
+    for link in soup.find_all('a', href=True):
+        original_url = link['href']
+        
+        # Check if this URL should be tracked
+        if original_url in link_tracking_map:
+            tracking_id = link_tracking_map[original_url]
+            tracking_url = f"{api_url}/api/track/click/{tracking_id}"
+            
+            # Replace the href with tracking URL
+            link['href'] = tracking_url
+    
+    # Return modified HTML
+    return str(soup)
+
+
+def generate_tracking_urls(
+    articles: List[Dict],
+    newsletter_id: str,
+    subscriber_email: str,
+    tracking_service
+) -> Dict[str, object]:
+    """
+    Generate tracking records for all article links and return URL mapping.
+    
+    Creates tracking records in the database for each article link and returns
+    a mapping of original URLs to tracking IDs.
+    
+    Args:
+        articles: List of article dictionaries with 'link' and 'title' keys
+        newsletter_id: Unique identifier for the newsletter batch
+        subscriber_email: Email address of the recipient
+        tracking_service: Tracking service module with create_newsletter_tracking function
+        
+    Returns:
+        dict: Dictionary containing:
+            - pixel_tracking_id: ID for the tracking pixel
+            - link_tracking_map: Dict mapping original URLs to tracking IDs
+            - tracking_enabled: Whether tracking applies to this subscriber
+                (defaults to True when the tracking service omits the flag)
+            
+    Example:
+        >>> articles = [{'link': 'https://example.com/1', 'title': 'Article 1'}]
+        >>> generate_tracking_urls(articles, 'news-2024-01-01', 'user@example.com', tracking_service)
+        {
+            'pixel_tracking_id': 'uuid-for-pixel',
+            'link_tracking_map': {'https://example.com/1': 'uuid-for-link'},
+            'tracking_enabled': True
+        }
+    """
+    # Prepare article links for tracking
+    article_links = []
+    for article in articles:
+        if 'link' in article and article['link']:
+            article_links.append({
+                'url': article['link'],
+                'title': article.get('title', '')
+            })
+    
+    # Create tracking records using the tracking service
+    tracking_data = tracking_service.create_newsletter_tracking(
+        newsletter_id=newsletter_id,
+        subscriber_email=subscriber_email,
+        article_links=article_links
+    )
+    
+    return {
+        'pixel_tracking_id': tracking_data['pixel_tracking_id'],
+        'link_tracking_map': tracking_data['link_tracking_map'],
+        'tracking_enabled': tracking_data.get('tracking_enabled', True)
+    }
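
Review note on `inject_tracking_pixel`: `html.replace('</body>', ...)` matches only the exact lowercase tag and replaces every occurrence, so a template with `</BODY>` gets the pixel appended after the document, and HTML that contains the tag twice gets two pixels. A minimal standard-library sketch that inserts the pixel once, before the last closing body tag, matched case-insensitively. The function name `inject_pixel_before_last_body` is hypothetical and not part of this commit; the pixel markup mirrors the committed code.

```python
import re
from html import escape

# Matches </body>, </BODY>, and variants with whitespace before '>'.
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_pixel_before_last_body(html: str, pixel_url: str) -> str:
    """Insert a 1x1 tracking image before the last closing body tag.

    Hypothetical variant of inject_tracking_pixel (not in the commit).
    Falls back to appending the image when no closing body tag exists.
    """
    pixel = (
        f'<img src="{escape(pixel_url, quote=True)}" '
        'width="1" height="1" alt="" style="display:block;" />'
    )
    matches = list(_BODY_CLOSE.finditer(html))
    if not matches:
        return html + pixel
    last = matches[-1]
    return html[:last.start()] + pixel + html[last.start():]
```

Escaping the URL with `html.escape` is an extra precaution in this sketch; the committed function interpolates `api_url` and `tracking_id` as-is.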