Munich-news/.kiro/specs/ai-article-summarization/tasks.md
2025-11-10 19:13:33 +01:00

Implementation Plan

  • 1. Create Ollama client module

    • Create news_crawler/ollama_client.py with OllamaClient class
    • Implement summarize_article() method with prompt construction and API call
    • Implement is_available() method for health checks
    • Implement test_connection() method for diagnostics
    • Add timeout handling (30 seconds)
    • Add error handling for connection, timeout, and invalid responses
    • Requirements: 1.1, 1.2, 1.3, 1.4, 1.5, 4.1, 4.2, 4.3, 5.2
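
The client described in task 1 might be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes Ollama's `/api/generate` and `/api/tags` HTTP endpoints, uses only the standard library, and omits `test_connection()`. The prompt wording, the `example-model` placeholder, the `max_words` parameter, the result-dict shape, and the injectable `opener` argument are all assumptions made for the example.

```python
import json
import socket
import urllib.error
import urllib.request


class OllamaClient:
    """Illustrative sketch only; not the project's actual implementation."""

    def __init__(self, base_url="http://localhost:11434", model="example-model",
                 timeout=30, opener=urllib.request.urlopen):
        self.base_url = base_url.rstrip("/")
        self.model = model      # placeholder name; use a model you have pulled
        self.timeout = timeout  # seconds, per the 30-second requirement
        self._open = opener     # injectable so tests need no running server

    def _build_prompt(self, content, max_words):
        # Prompt wording is an assumption; tune it for the chosen model.
        return (f"Summarize the following news article in at most {max_words} "
                f"words. Reply with the summary only.\n\n{content.strip()}")

    @staticmethod
    def _failure(message):
        return {"success": False, "summary": None, "word_count": 0,
                "error": message}

    def summarize_article(self, content, max_words=150):
        """Return a result dict instead of raising on the handled error cases."""
        if not content or not content.strip():
            return self._failure("empty content")
        payload = json.dumps({
            "model": self.model,
            "prompt": self._build_prompt(content, max_words),
            "stream": False,  # ask for one JSON object rather than a stream
        }).encode("utf-8")
        request = urllib.request.Request(
            f"{self.base_url}/api/generate", data=payload,
            headers={"Content-Type": "application/json"})
        try:
            with self._open(request, timeout=self.timeout) as response:
                body = json.loads(response.read().decode("utf-8"))
        except socket.timeout:
            return self._failure("timeout")
        except urllib.error.URLError as exc:
            # A connect-phase timeout arrives wrapped in URLError.
            if isinstance(exc.reason, socket.timeout):
                return self._failure("timeout")
            return self._failure(f"connection error: {exc.reason}")
        except OSError as exc:
            return self._failure(f"connection error: {exc}")
        except ValueError:
            return self._failure("invalid response: body is not JSON")
        summary = body.get("response") if isinstance(body, dict) else None
        if not isinstance(summary, str) or not summary.strip():
            return self._failure("invalid response: no summary text")
        summary = summary.strip()
        return {"success": True, "summary": summary,
                "word_count": len(summary.split()), "error": None}

    def is_available(self):
        """Cheap health check: can the server's model list be fetched?"""
        try:
            with self._open(f"{self.base_url}/api/tags", timeout=5) as response:
                return response.status == 200
        except (OSError, ValueError):
            return False
```

Returning a result dict rather than raising keeps the caller's control flow simple: the crawler can check `result["success"]` and move on to the next article.
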
  • 2. Create configuration module for crawler

    • Create news_crawler/config.py with Config class
    • Load environment variables (OLLAMA_BASE_URL, OLLAMA_MODEL, OLLAMA_ENABLED, OLLAMA_API_KEY, OLLAMA_TIMEOUT)
    • Add validation for required configuration
    • Add default values for optional configuration
    • Requirements: 2.1, 2.2, 2.3, 2.4
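
One possible shape for the configuration module in task 2 is sketched below. The environment variable names come from the task list; everything else is an assumption for illustration: the default values (other than the 30-second timeout), the rule that a model name is required only when summarization is enabled, the accepted truthy strings, and the optional `env` argument used to make the class testable.

```python
import os

_TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings of "enabled"


class Config:
    """Illustrative sketch of a crawler config loaded from the environment."""

    def __init__(self, env=None):
        env = os.environ if env is None else env
        self.OLLAMA_ENABLED = (
            env.get("OLLAMA_ENABLED", "false").strip().lower() in _TRUE_VALUES)
        self.OLLAMA_BASE_URL = (
            env.get("OLLAMA_BASE_URL", "http://localhost:11434").strip().rstrip("/"))
        self.OLLAMA_MODEL = env.get("OLLAMA_MODEL", "").strip()
        self.OLLAMA_API_KEY = env.get("OLLAMA_API_KEY", "").strip()
        try:
            self.OLLAMA_TIMEOUT = int(env.get("OLLAMA_TIMEOUT", "30").strip())
        except ValueError:
            self.OLLAMA_TIMEOUT = None  # reported by validate()

    def validate(self):
        """Return a list of human-readable problems; empty means valid."""
        problems = []
        if self.OLLAMA_TIMEOUT is None or self.OLLAMA_TIMEOUT <= 0:
            problems.append("OLLAMA_TIMEOUT must be a positive integer")
        if self.OLLAMA_ENABLED:
            if not self.OLLAMA_BASE_URL.startswith(("http://", "https://")):
                problems.append("OLLAMA_BASE_URL must start with http:// or https://")
            if not self.OLLAMA_MODEL:
                problems.append("OLLAMA_MODEL is required when OLLAMA_ENABLED is true")
        return problems
```

Collecting problems in a list instead of raising on the first one lets a startup check print every misconfiguration at once.
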
  • 3. Integrate Ollama client into crawler service

    • Import OllamaClient in news_crawler/crawler_service.py
    • Initialize Ollama client at module level using Config
    • Modify crawl_rss_feed() to call summarization after content extraction
    • Add conditional logic to skip summarization if OLLAMA_ENABLED is false
    • Add error handling to continue processing if summarization fails
    • Add logging for summarization start, success, and failure
    • Add rate limiting delay after summarization
    • Requirements: 1.1, 1.2, 1.3, 1.4, 1.5, 2.3, 2.4, 4.1, 4.5, 5.1, 5.3, 6.1, 6.2, 6.3
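
The conditional, failure-tolerant call described in task 3 could look roughly like the helper below. It is a sketch under stated assumptions: the helper name `summarize_if_enabled`, the one-second default delay, the logger name, and the idea that the client returns a dict with `success`, `summary`, and `error` keys are all hypothetical, and the real `crawl_rss_feed()` may be structured differently.

```python
import logging
import time

logger = logging.getLogger("news_crawler")  # logger name is an assumption


def summarize_if_enabled(client, enabled, title, content,
                         delay_seconds=1.0, sleep=time.sleep):
    """Return the summary text, or None when skipped or failed."""
    if not enabled or client is None:
        logger.debug("Summarization disabled; skipping '%s'", title)
        return None
    if not content:
        return None

    logger.info("Summarizing '%s'", title)
    started = time.monotonic()
    try:
        result = client.summarize_article(content)
    except Exception as exc:  # a failed summary must not abort the crawl
        logger.warning("Summarization failed for '%s': %s", title, exc)
        result = None
    elapsed = time.monotonic() - started

    sleep(delay_seconds)  # rate limiting: pause after every attempt

    if result and result.get("success"):
        logger.info("Summarized '%s' in %.1fs", title, elapsed)
        return result["summary"]
    if result is not None:
        logger.warning("Summarization failed for '%s': %s",
                       title, result.get("error"))
    return None
```

Because the helper returns `None` in every non-success case, the calling loop can store the article without a summary and continue with the next feed entry.
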
  • 4. Update database schema and storage

    • Modify article document structure in crawl_rss_feed() to include:
      • summary field (AI-generated summary)
      • summary_word_count field
      • summarized_at field (timestamp)
    • Update MongoDB upsert logic to handle new fields
    • Add check to skip re-summarization if article already has summary
    • Requirements: 3.1, 3.2, 3.3, 3.4, 8.4
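
As a sketch of the storage change in task 4, the functions below build a MongoDB `$set` update that adds the three summary fields only when the stored article has no summary yet. The function names, the `link` key used in the usage comment, and the choice of a UTC timestamp are assumptions for illustration; the field names are the ones listed above.

```python
from datetime import datetime, timezone


def needs_summary(existing):
    """True when the stored article (or None) has no non-empty summary."""
    return not (existing and existing.get("summary"))


def build_article_update(article, existing, summary, now=None):
    """Build a $set update; summary fields are added only when still missing."""
    fields = dict(article)
    if summary and needs_summary(existing):
        fields["summary"] = summary
        fields["summary_word_count"] = len(summary.split())
        fields["summarized_at"] = now or datetime.now(timezone.utc)
    # $set touches only the listed fields, so an existing summary is preserved.
    return {"$set": fields}


# Hypothetical usage with PyMongo (collection and key names are assumptions):
#   existing = collection.find_one({"link": article["link"]})
#   update = build_article_update(article, existing, summary)
#   collection.update_one({"link": article["link"]}, update, upsert=True)
```

Calling `needs_summary()` before contacting Ollama also avoids paying for a summarization request whose result would be discarded.
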
  • 5. Update backend API to return summaries

    • Modify backend/routes/news_routes.py GET /api/news endpoint
    • Add summary, summary_word_count, summarized_at fields to response
    • Add has_summary boolean field to indicate if AI summarization was performed
    • Modify GET /api/news/ endpoint to include summary fields
    • Add fallback to content preview if no summary exists
    • Requirements: 7.1, 7.2, 7.3, 7.4, 7.5, 8.1, 8.2, 8.3
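
The response shaping in task 5 might be sketched as a pure function that the route handler calls per article. This is one interpretation, not the project's code: it assumes the fallback preview is returned in the `summary` field with `has_summary` set to false, that previews are cut at 200 characters, and that documents carry `title`, `link`, and `content` keys. The function name is hypothetical.

```python
def to_api_article(doc, preview_chars=200):
    """Convert a stored article document into an API response dict."""
    summary = doc.get("summary")
    has_summary = bool(summary)

    if has_summary:
        text = summary
    else:
        # Fallback: a truncated preview of the extracted content.
        content = (doc.get("content") or "").strip()
        text = content[:preview_chars].rstrip()
        if len(content) > preview_chars:
            text += "..."

    summarized_at = doc.get("summarized_at")
    if hasattr(summarized_at, "isoformat"):
        summarized_at = summarized_at.isoformat()  # make datetimes JSON-safe

    return {
        "title": doc.get("title"),
        "link": doc.get("link"),
        "summary": text,
        "summary_word_count": doc.get("summary_word_count") if has_summary else None,
        "summarized_at": summarized_at if has_summary else None,
        "has_summary": has_summary,
    }
```

Keeping `has_summary` separate from `summary` lets a client distinguish an AI-generated summary from a plain content preview without inspecting the text.
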
  • 6. Update database schema documentation

    • Update backend/DATABASE_SCHEMA.md with new summary fields
    • Add example document showing summary fields
    • Document the summarization workflow
    • Requirements: 3.1, 3.2, 3.3
  • 7. Add environment variable configuration

    • Update backend/env.template with Ollama configuration
    • Add comments explaining each Ollama setting
    • Document default values
    • Requirements: 2.1, 2.2
  • 8. Create test script for Ollama integration

    • Create news_crawler/test_ollama.py to test Ollama connection
    • Test summarization with sample article
    • Test error handling (timeout, connection failure)
    • Display configuration and connection status
    • Requirements: 1.1, 1.2, 1.3, 1.4, 2.1, 2.2, 4.1, 4.2
  • 9. Update crawler statistics and logging

    • Add summarization statistics to final report in crawl_all_feeds()
    • Track total articles summarized vs failed
    • Log average summarization time
    • Display progress indicators during summarization
    • Requirements: 5.4, 6.1, 6.2, 6.3, 6.4, 6.5
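
The counters behind task 9 can be kept in a small helper like the one sketched here. The class name, its fields, the report wording, and the decision to average over successful summaries only are assumptions; the actual report in `crawl_all_feeds()` may differ.

```python
from dataclasses import dataclass


@dataclass
class SummaryStats:
    """Illustrative accumulator for summarization outcomes."""
    summarized: int = 0
    failed: int = 0
    total_seconds: float = 0.0  # time spent on successful summaries only

    def record(self, success, seconds):
        """Register one summarization attempt and its duration."""
        if success:
            self.summarized += 1
            self.total_seconds += seconds
        else:
            self.failed += 1

    @property
    def average_seconds(self):
        """Mean duration of successful summaries, or 0.0 if there were none."""
        if self.summarized == 0:
            return 0.0
        return self.total_seconds / self.summarized

    def report(self):
        """One-line summary suitable for the crawler's final report."""
        return (f"Summaries: {self.summarized} summarized, {self.failed} failed, "
                f"average {self.average_seconds:.1f}s")
```

A caller would invoke `record()` once per article and print `report()` after all feeds have been processed.
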
  • 10. Create documentation for AI summarization

    • Create news_crawler/AI_SUMMARIZATION.md explaining the feature
    • Document configuration options
    • Provide troubleshooting guide
    • Add examples of usage
    • Requirements: 2.1, 2.2, 2.3, 2.4, 6.1, 6.2, 6.3
  • 11. Update main README with AI summarization info

    • Add section about AI summarization feature
    • Document Ollama setup requirements
    • Add configuration examples
    • Update API endpoint documentation
    • Requirements: 2.1, 2.2, 7.1, 7.2
  • 12. Test end-to-end workflow

    • Run crawler with Ollama enabled
    • Verify articles are summarized correctly
    • Check database contains all expected fields
    • Test API endpoints return summaries
    • Verify error handling when Ollama is disabled/unavailable
    • Requirements: 1.1, 1.2, 1.3, 1.4, 1.5, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5, 7.1, 7.2, 7.3, 7.4, 7.5, 8.1, 8.2, 8.3, 8.4, 8.5