Overview
AI Web Feeds development architecture and implementation
Development Overview
AI Web Feeds is a comprehensive system for managing AI/ML feed sources with database persistence, enrichment, and OPML generation.
What We Built
A production-ready system with the following capabilities:
1. Database Layer (aiwebfeeds.db)
Technology: SQLModel + SQLAlchemy + Alembic
Tables:
- feed_sources - Core feed metadata
- feed_items - Individual feed entries
- feed_fetch_logs - Fetch attempt tracking
- topics - Topic taxonomy
Features:
- Full CRUD operations
- Relationship management
- Migration support via Alembic
- JSON field support for flexible data
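As a rough illustration of how the four tables might relate, here is a minimal sketch using Python's standard-library sqlite3 module in place of the project's SQLModel models. Only the table names come from this page; every column name, the foreign keys, and the open_db helper are assumptions made for the example.

```python
import json
import sqlite3

# Hypothetical columns: only the four table names are taken from the docs.
SCHEMA = """
CREATE TABLE feed_sources (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    feed_url TEXT,
    extra TEXT DEFAULT '{}'  -- JSON stored as text for flexible data
);
CREATE TABLE feed_items (
    id INTEGER PRIMARY KEY,
    source_id TEXT NOT NULL REFERENCES feed_sources(id),
    title TEXT NOT NULL
);
CREATE TABLE feed_fetch_logs (
    id INTEGER PRIMARY KEY,
    source_id TEXT NOT NULL REFERENCES feed_sources(id),
    status_code INTEGER
);
CREATE TABLE topics (
    id TEXT PRIMARY KEY,
    label TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Create the illustrative schema and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    return conn


conn = open_db()
conn.execute(
    "INSERT INTO feed_sources (id, title, feed_url, extra) VALUES (?, ?, ?, ?)",
    ("example-blog", "Example Blog", "https://example.com/feed.xml",
     json.dumps({"tags": ["ml"]})),
)
conn.execute(
    "INSERT INTO feed_items (source_id, title) VALUES (?, ?)",
    ("example-blog", "First post"),
)
# Join items back to their source to show the relationship.
rows = conn.execute(
    "SELECT s.title, i.title FROM feed_items AS i "
    "JOIN feed_sources AS s ON s.id = i.source_id"
).fetchall()
```

In the real code base, the same shape would presumably be expressed as SQLModel classes in models.py, with Alembic handling schema changes.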
2. Feed Enrichment Pipeline (feeds.enriched.yaml)
Capabilities:
- Automatic feed URL discovery from site URLs
- Feed format detection (RSS/Atom/JSONFeed)
- Metadata validation and enrichment
- Quality scoring and curation tracking
Input: data/feeds.yaml (human-curated)
Output: data/feeds.enriched.yaml (fully enriched with automation data)
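The discovery and format-detection steps can be sketched with the standard library alone. This is an illustrative approximation, not the code in utils.py: the function names discover_feed_urls and detect_feed_format are hypothetical, the inputs are assumed to be already-fetched strings, and RSS 1.0 (RDF) documents are deliberately not handled.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_MIME_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/feed+json",
}


class FeedLinkParser(HTMLParser):
    """Collects href values of <link rel="alternate"> tags with a feed MIME type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feed_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_MIME_TYPES and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feed_urls.append(urljoin(self.base_url, attr["href"]))


def discover_feed_urls(html, base_url):
    """Return feed URLs advertised in the HTML of a site page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feed_urls


def detect_feed_format(body):
    """Return 'rss', 'atom', 'jsonfeed', or None for an already-fetched body."""
    text = body.strip()
    if text.startswith("{"):
        try:
            doc = json.loads(text)
        except ValueError:
            return None
        version = doc.get("version", "") if isinstance(doc, dict) else ""
        return "jsonfeed" if "jsonfeed.org/version" in str(version) else None
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return None
    if root.tag == "rss":
        return "rss"
    if root.tag == "{http://www.w3.org/2005/Atom}feed":
        return "atom"
    return None
```

A caller would fetch the site URL, pass the HTML to discover_feed_urls, then fetch each candidate and confirm its format with detect_feed_format before writing it to the enriched file.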
3. Schema Management (feeds.enriched.schema.json)
Features:
- Auto-generated JSON Schema for enriched feeds
- Comprehensive validation rules
- Extends base feeds.schema.json
- Supports all enrichment metadata
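One common way to extend a base JSON Schema is to combine it with additional constraints through allOf and $ref. The sketch below shows that pattern only; the build_enriched_schema function, the property names (feed_format, quality_score, verified), and the choice of draft 2020-12 are all assumptions, and the project's generated schema may be structured differently.

```python
import json


def build_enriched_schema(base_schema_ref="feeds.schema.json"):
    """Return a JSON Schema dict that layers enrichment fields on a base schema."""
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": "Enriched feed entry (illustrative)",
        "allOf": [
            # Everything the base schema requires still applies.
            {"$ref": base_schema_ref},
            # Hypothetical enrichment fields added on top.
            {
                "type": "object",
                "properties": {
                    "feed_format": {"enum": ["rss", "atom", "jsonfeed"]},
                    "quality_score": {"type": "number", "minimum": 0, "maximum": 1},
                    "verified": {"type": "boolean"},
                },
                "required": ["feed_format"],
            },
        ],
    }


schema_text = json.dumps(build_enriched_schema(), indent=2)
```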
4. OPML Generation
Formats:
- all.opml - Flat list of all feeds
- categorized.opml - Organized by source type
- Custom filtered - By topic, type, tag, verification status
Use Case: Import into feed readers (Feedly, Inoreader, NetNewsWire, etc.)
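To make the three output shapes concrete, here is a minimal sketch that builds OPML 2.0 with xml.etree.ElementTree. The build_opml function and the dictionary keys (title, feed_url, source_type, verified) are hypothetical stand-ins for whatever the library actually uses; the custom filter is shown as a plain list comprehension applied before generation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def _feed_outline(parent, feed):
    """Append one feed as an <outline> element under parent."""
    return ET.SubElement(parent, "outline", {
        "type": "rss",
        "text": feed["title"],
        "title": feed["title"],
        "xmlUrl": feed["feed_url"],
    })


def build_opml(feeds, title, categorized=False):
    """Return an OPML 2.0 document as a string, flat or grouped by source type."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    if categorized:
        groups = defaultdict(list)
        for feed in feeds:
            groups[feed["source_type"]].append(feed)
        for source_type in sorted(groups):
            # One folder-style outline per source type.
            folder = ET.SubElement(body, "outline", text=source_type)
            for feed in groups[source_type]:
                _feed_outline(folder, feed)
    else:
        for feed in feeds:
            _feed_outline(body, feed)
    return ET.tostring(opml, encoding="unicode")


feeds = [
    {"title": "Example Lab Blog", "feed_url": "https://example.com/lab.xml",
     "source_type": "blog", "verified": True},
    {"title": "Example Preprints", "feed_url": "https://example.org/papers.atom",
     "source_type": "preprint", "verified": False},
]
flat = build_opml(feeds, "All feeds")
by_type = build_opml(feeds, "Feeds by source type", categorized=True)
verified_only = build_opml([f for f in feeds if f["verified"]], "Verified feeds")
```

Nested outline elements are what most feed readers display as folders, which is why the categorized variant wraps each group in a parent outline.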
5. CLI Interface
Commands:
aiwebfeeds enrich all # Enrich feeds
aiwebfeeds opml all # Generate all.opml
aiwebfeeds opml categorized # Generate categorized.opml
aiwebfeeds opml filtered # Generate custom filtered OPML
aiwebfeeds stats show # Display statistics
Package Structure
ai-web-feeds (workspace root)
├── packages/ai_web_feeds/           # Core library
│   └── src/ai_web_feeds/
│       ├── models.py                # SQLModel tables + Pydantic models
│       ├── storage.py               # Database manager
│       ├── utils.py                 # Enrichment, OPML, schema utils
│       ├── config.py                # Configuration
│       └── logger.py                # Logging setup
│
└── apps/cli/                        # CLI application
    └── ai_web_feeds/cli/
        ├── __init__.py              # Main CLI app
        └── commands/
            ├── enrich.py            # Enrichment commands
            ├── opml.py              # OPML generation
            ├── stats.py             # Statistics
            ├── export.py            # Export (stub)
            └── validate.py          # Validation (stub)
Data Flow
feeds.yaml (human-curated)
↓
├─→ Feed Discovery (if discover: true)
├─→ Format Detection
├─→ Metadata Validation
└─→ Enrichment
↓
├─→ feeds.enriched.yaml (YAML export)
├─→ feeds.enriched.schema.json (JSON schema)
└─→ aiwebfeeds.db (SQLite database)
↓
├─→ all.opml (all feeds)
├─→ categorized.opml (by type)
└─→ filtered.opml (custom filters)
Next Steps
- Database Setup - Learn about the database layer
- CLI Usage - Using the command-line interface
- Python API - Using the Python API
- Contributing - How to contribute