Testing Guide

Comprehensive test suite with unit, integration, and E2E tests using pytest and uv

Test Infrastructure

AI Web Feeds ships with a test suite of 100+ tests covering unit, integration, and end-to-end scenarios. Tests run through uv for fast, reproducible execution and use pytest together with pytest-cov for coverage and Hypothesis for property-based testing.

Tests are organized to mirror the source code structure, making it easy to find and maintain tests.

Quick Start

# Quick test (recommended during development)
aiwebfeeds test quick

# All tests
aiwebfeeds test all

# With coverage
aiwebfeeds test coverage --open

# Unit tests only
aiwebfeeds test unit

# From tests directory
cd tests
uv run pytest -v

# From workspace root
uv run --directory tests pytest -v -m unit

# With coverage
uv run --directory tests pytest --cov=ai_web_feeds --cov-report=html

# Auto-rerun tests on file changes
aiwebfeeds test watch

Test Commands

Available Commands

Command	Description	Use Case
`test all`	Run all tests	Pre-commit, CI/CD
`test unit`	Unit tests only	Development
`test integration`	Integration tests	Feature testing
`test e2e`	E2E tests	Release validation
`test quick`	Fast unit tests	Rapid feedback
`test coverage`	With coverage report	Quality check
`test watch`	Auto-rerun on changes	TDD mode
`test file <path>`	Specific file	Focused testing
`test debug`	With debugger	Troubleshooting
`test markers`	List available markers	Discovery

Common Options

# Verbose output
aiwebfeeds test all --verbose

# Parallel execution
aiwebfeeds test all --parallel

# Coverage with HTML report
aiwebfeeds test coverage --open

# Skip slow tests
aiwebfeeds test unit --fast

# Filter by keyword
aiwebfeeds test file test_utils.py -k "twitter"

Test Structure

Tests are organized to mirror the source code:

tests/
├── packages/ai_web_feeds/
│   ├── unit/              # Fast, isolated tests
│   │   ├── test_models.py
│   │   ├── test_storage.py
│   │   ├── test_fetcher.py
│   │   ├── test_config.py
│   │   ├── test_utils.py
│   │   └── test_analytics.py
│   ├── integration/       # Multi-component tests
│   │   └── test_integration.py
│   └── e2e/              # Full workflow tests
│       └── test_workflows.py
└── apps/cli/
    ├── unit/
    │   └── test_commands.py
    └── integration/
        └── test_cli_integration.py

Test Categories

Unit Tests (`@pytest.mark.unit`)

Fast, isolated tests with no external dependencies:

Models: Data validation with property-based testing
Storage: Database CRUD operations
Fetcher: Feed fetching with mocking
Config: Configuration management
Utils: Utility functions (platform detection, URL generation)
Analytics: Analytics calculations

# Run unit tests
aiwebfeeds test unit

# Skip slow unit tests
aiwebfeeds test unit --fast

Integration Tests (`@pytest.mark.integration`)

Multi-component workflows:

Database + Fetcher integration
Complete fetch/parse/store workflow
Topic-feed relationships
CLI integration

# Run integration tests
aiwebfeeds test integration

E2E Tests (`@pytest.mark.e2e`)

Complete user workflows:

New user onboarding
Feed management
Bulk operations (100+ feeds)
Data export workflows
Performance testing (1000+ feeds)

# Run E2E tests
aiwebfeeds test e2e

Advanced Features

Property-Based Testing

Using Hypothesis for robust input testing:

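from hypothesis import given, strategies as st

# Assumed import path; adjust to wherever sanitize_text lives in your package.
from ai_web_feeds.utils import sanitize_text


@given(st.text())
def test_sanitize_text_property_based(text):
    # For any input string, sanitization must return a string.
    result = sanitize_text(text)
    assert isinstance(result, str)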

Test Markers

Available markers for filtering tests:

unit - Unit tests (fast, no external dependencies)
integration - Integration tests (multiple components)
e2e - End-to-end tests (full workflows)
slow - Slow running tests
network - Tests requiring network access
database - Tests requiring database

# List all markers
aiwebfeeds test markers

# Run specific markers
uv run --directory tests pytest -m "unit and not slow"
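
As an illustration of how these markers are applied, the sketch below tags two tests so that the marker expression above selects one and skips the other. The `slugify` helper and both test names are hypothetical and exist only for this example; they are not part of the project.

```python
import pytest


def slugify(title: str) -> str:
    """Hypothetical helper used only for this example."""
    return "-".join(title.lower().split())


@pytest.mark.unit
def test_slugify_basic():
    # Selected by: pytest -m unit
    assert slugify("Hello World") == "hello-world"


@pytest.mark.unit
@pytest.mark.slow
def test_slugify_many_inputs():
    # Selected by -m unit, but excluded by: pytest -m "unit and not slow"
    for n in range(10_000):
        assert slugify(f"Feed {n}") == f"feed-{n}"
```

Stacking decorators is enough to give a test several markers; no extra registration is needed beyond the `markers` list in the pytest configuration.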

Coverage Reporting

Generate coverage reports:

# HTML + terminal report
aiwebfeeds test coverage

# Open in browser
aiwebfeeds test coverage --open

# Coverage reports are saved to tests/reports/coverage/

Configuration

All pytest configuration is in tests/pyproject.toml:

[tool.pytest.ini_options]
testpaths = ["."]
markers = [
    "unit: Unit tests (fast, no external dependencies)",
    "integration: Integration tests (multiple components)",
    "e2e: End-to-end tests (full workflows)",
    "slow: Slow running tests",
    "network: Tests requiring network access",
    "database: Tests requiring database",
]

CI/CD Integration

For continuous integration:

# Comprehensive CI test
aiwebfeeds test all --coverage --parallel

# Or directly with pytest
uv run --directory tests pytest -v --cov=ai_web_feeds --cov-report=html

Debugging Tests

Debug Mode

Run tests with pdb debugger:

# Debug all tests
aiwebfeeds test debug

# Debug specific file
aiwebfeeds test debug packages/ai_web_feeds/unit/test_models.py

Verbose Output

# Very verbose
aiwebfeeds test all -vv

# Show local variables
uv run --directory tests pytest --showlocals

Web Integration Testing

Follow these steps to verify the AI & LLM integration is working correctly.

Prerequisites

Start Development Server

pnpm dev

Wait for the server to be ready at http://localhost:3000

Open Terminal

You'll need a terminal for running test commands.

All tests assume the development server is running on http://localhost:3000.

Test Discovery Endpoint

`/llms.txt`

Visit: http://localhost:3000/llms.txt

You should see a plain text file listing all documentation pages.

curl http://localhost:3000/llms.txt

# AI Web Feeds Documentation

> A collection of curated RSS/Atom feeds optimized for AI agents

## Documentation Pages

- [Getting Started](http://localhost:3000/docs.mdx): Overview...
- [PDF Export](http://localhost:3000/docs/features/pdf-export.mdx): Export...
...

Verify Headers

curl -I http://localhost:3000/llms.txt

Expected Headers:

Content-Type: text/plain; charset=utf-8
Cache-Control: public, max-age=3600, s-maxage=86400
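
If you would rather script this check than read curl output, a minimal sketch using only the Python standard library might look like the following. The function names are illustrative, and the expected values are simply the ones listed above.

```python
from urllib.request import Request, urlopen

# Expected values copied from this guide.
EXPECTED = {
    "Content-Type": "text/plain; charset=utf-8",
    "Cache-Control": "public, max-age=3600, s-maxage=86400",
}


def header_mismatches(actual: dict, expected: dict) -> list[str]:
    """Return a message for each expected header that is missing or different."""
    # HTTP header names are case-insensitive, so compare on lowercase keys.
    lowered = {k.lower(): v.strip() for k, v in actual.items()}
    problems = []
    for name, want in expected.items():
        got = lowered.get(name.lower())
        if got != want:
            problems.append(f"{name}: expected {want!r}, got {got!r}")
    return problems


def fetch_headers(url: str) -> dict:
    """Send a HEAD request (like curl -I) and return the response headers."""
    with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
        return dict(resp.headers.items())


# With the dev server running:
# print(header_mismatches(fetch_headers("http://localhost:3000/llms.txt"), EXPECTED))
```

An empty list means both headers matched exactly.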

Test Full Documentation

`/llms-full.txt`

Visit: http://localhost:3000/llms-full.txt

You should see all documentation in a structured format.

curl http://localhost:3000/llms-full.txt

================================================================================
AI WEB FEEDS - COMPLETE DOCUMENTATION
================================================================================

METADATA
--------------------------------------------------------------------------------
Generated: 2025-10-14T12:00:00.000Z
Total Pages: 5
Base URL: http://localhost:3000

Table of Contents:
  1. Getting Started - /docs
  2. PDF Export - /docs/features/pdf-export
  ...

================================================================================
PAGE 1 OF 5
================================================================================

TITLE: Getting Started
URL: http://localhost:3000/docs
...

Verify Custom Headers

curl -I http://localhost:3000/llms-full.txt | grep "X-"

Expected:

X-Content-Pages: 5
X-Generated-Date: 2025-10-14T12:00:00.000Z

Download and Inspect

# Download
curl http://localhost:3000/llms-full.txt -o docs.txt

# Check file size
wc -l docs.txt

# View header
head -50 docs.txt

# View table of contents
sed -n '/Table of Contents:/,/^===/p' docs.txt

# Count pages
grep -c "^PAGE [0-9]" docs.txt
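
The same inspection can be sketched in Python. This assumes the file follows the layout shown in the sample output above (a `Total Pages:` line, numbered table-of-contents entries, and `PAGE n OF m` markers); the function name is illustrative.

```python
import re


def summarize_full_docs(text: str) -> dict:
    """Extract the declared page count, the PAGE markers, and the TOC entries."""
    declared = re.search(r"^Total Pages: (\d+)$", text, flags=re.MULTILINE)
    markers = re.findall(r"^PAGE (\d+) OF (\d+)$", text, flags=re.MULTILINE)
    # TOC lines look like "  1. Getting Started - /docs".
    toc = re.findall(r"^[ \t]*\d+\. (.+) - (/\S*)$", text, flags=re.MULTILINE)
    return {
        "declared_pages": int(declared.group(1)) if declared else None,
        "page_markers": len(markers),
        "toc": toc,
    }


# With docs.txt downloaded as above:
# print(summarize_full_docs(open("docs.txt", encoding="utf-8").read()))
```

A mismatch between `declared_pages` and `page_markers` is a quick sign that the export was truncated or that a page failed to render.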

Test Markdown Extensions

`.mdx` Extension

Visit: http://localhost:3000/docs.mdx

You should see markdown content with Content-Type: text/markdown.

curl http://localhost:3000/docs.mdx

# Check content type
curl -I http://localhost:3000/docs.mdx | grep "Content-Type"

# Expected:
# Content-Type: text/markdown; charset=utf-8

`.md` Extension

# Test alternative extension
curl http://localhost:3000/docs.md

# Should return same content as .mdx

Test Nested Pages

# Test feature pages
curl http://localhost:3000/docs/features/pdf-export.mdx
curl http://localhost:3000/docs/features/ai-integration.mdx

# Test guide pages
curl http://localhost:3000/docs/guides/quick-reference.mdx
curl http://localhost:3000/docs/guides/testing.mdx

Test Content Negotiation

With Accept Header

# Request markdown via header
curl -H "Accept: text/markdown" http://localhost:3000/docs

Expected: Markdown content (same as /docs.mdx)

With Browser Accept Header

# Request HTML (default)
curl -H "Accept: text/html" http://localhost:3000/docs

Expected: HTML page with full layout

Verify Rewrite

# Check status and headers
curl -I -H "Accept: text/markdown" http://localhost:3000/docs

Expected:

Status: 200 OK
Content-Type: text/markdown
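
For a scripted version of this check, the sketch below builds the same request with the standard library and tests the response's media type. The helper names are illustrative; only the URL and the `Accept` value come from this guide.

```python
from urllib.request import Request, urlopen


def markdown_request(url: str) -> Request:
    """Build a GET request that asks for Markdown via the Accept header."""
    return Request(url, headers={"Accept": "text/markdown"})


def is_markdown(content_type: str) -> bool:
    """True when a Content-Type value names text/markdown, ignoring parameters."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/markdown"


def check_negotiation(url: str = "http://localhost:3000/docs") -> bool:
    """Requires the dev server; True if the response is 200 and Markdown."""
    with urlopen(markdown_request(url), timeout=10) as resp:
        return resp.status == 200 and is_markdown(resp.headers.get("Content-Type", ""))
```

Comparing only the media type keeps the check valid whether or not the server appends `charset=utf-8`.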

Test RSS Feeds

Sitewide Feeds

# Test RSS feed
curl http://localhost:3000/rss.xml | head -50

# Check content type
curl -I http://localhost:3000/rss.xml | grep "Content-Type"

Expected: Content-Type: application/rss+xml

# Test Atom feed
curl http://localhost:3000/atom.xml | head -50

# Check content type
curl -I http://localhost:3000/atom.xml | grep "Content-Type"

Expected: Content-Type: application/atom+xml

# Test JSON feed
curl http://localhost:3000/feed.json | jq

# Check content type
curl -I http://localhost:3000/feed.json | grep "Content-Type"

Expected: Content-Type: application/json

Documentation Feeds

# Test documentation RSS feed
curl http://localhost:3000/docs/rss.xml | head -50

# Test documentation Atom feed
curl http://localhost:3000/docs/atom.xml | head -50

# Test documentation JSON feed
curl http://localhost:3000/docs/feed.json | jq .items

Verify Feed Discovery

Check that feeds are discoverable in HTML:

# View HTML head
curl http://localhost:3000 | grep -i "alternate" | grep -i "rss\|atom\|json"

# Expected output includes:
# <link rel="alternate" type="application/rss+xml" ... />
# <link rel="alternate" type="application/atom+xml" ... />
# <link rel="alternate" type="application/json" ... />
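
The grep pipeline depends on how the HTML happens to be laid out on each line. As a sturdier alternative, here is a small sketch that parses the page and collects every `<link rel="alternate">` tag; the class and function names are illustrative.

```python
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collect (type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated tokens.
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "alternate" in rel:
            self.links.append((attributes.get("type") or "", attributes.get("href") or ""))


def discover_feeds(html: str) -> list[tuple[str, str]]:
    """Return the feed links advertised in an HTML document."""
    parser = AlternateLinkParser()
    parser.feed(html)
    return parser.links
```

Feed the function the body of `http://localhost:3000` and compare the returned types against the three expected above.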

Validate Feed Format

Use the W3C Feed Validator:

Open Validator

Visit https://validator.w3.org/feed/

Enter Feed URL

Use your local or deployed feed URL:

http://localhost:3000/rss.xml
http://localhost:3000/docs/rss.xml

Check Validation

Click "Check" and review results

Test Page Actions UI

Visual Test

Navigate to Docs

Open http://localhost:3000/docs in your browser

Locate Page Actions

Look for the section below the page title with:

"Copy Markdown" button
View options dropdown button (with chevron icon)

Test Copy Button

Click Copy Button

Click the "Copy Markdown" button

Observe Behavior

Button should show loading state briefly
Button should show checkmark when done
No errors in console

Verify Clipboard

Paste clipboard content into a text editor

Expected: Markdown source of the page

Test View Options

Click the view options dropdown button

Verify Options

Check that these options appear:

Open in GitHub
Open in Scira AI
Open in Perplexity
Open in ChatGPT

Test Link

Click "Open in GitHub"

Expected: Opens correct GitHub file path

Update GitHub URLs in app/docs/[[...slug]]/page.tsx to match your repository path.

Test Error Handling

Non-Existent Page

curl http://localhost:3000/docs/non-existent.mdx

Expected: 404 error

Invalid Path

curl http://localhost:3000/llms.mdx

Expected: A clean error response (such as a 404), not a server error or stack trace

Production Build Test

Build the Site

pnpm build

Expected: Build completes successfully without errors

Validate Links

pnpm lint:links

Expected: No broken links found

Start Production Server

pnpm start

Test Static Generation

# Check generated files
ls -la .next/server/app/llms.mdx/

# Verify static generation
find .next/server/app -name "*.html"

Test All Endpoints

# Test discovery
curl http://localhost:3000/llms.txt

# Test full docs
curl http://localhost:3000/llms-full.txt

# Test markdown pages
curl http://localhost:3000/docs.mdx
curl http://localhost:3000/docs/features/pdf-export.mdx

Performance Testing

Check Caching Headers

curl -I http://localhost:3000/llms.txt | grep -i cache

Expected:

Cache-Control: public, max-age=3600, s-maxage=86400

curl -I http://localhost:3000/llms-full.txt | grep -i cache

Expected:

Cache-Control: public, max-age=0, must-revalidate

curl -I http://localhost:3000/docs.mdx | grep -i cache

Expected:

Cache-Control: public, max-age=31536000, immutable
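
To compare these policies programmatically instead of by eye, a `Cache-Control` value can be split into its directives. The parser below is a minimal sketch that covers the simple comma-separated forms shown above, not every case the HTTP specification allows (quoted arguments, for instance, are not handled).

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header into a {directive: value-or-True} mapping."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        name = name.strip().lower()
        arg = arg.strip()
        # Numeric arguments (max-age, s-maxage) become ints; bare flags become True.
        directives[name] = int(arg) if arg.isdigit() else (arg or True)
    return directives
```

For example, the `/llms.txt` policy parses to a browser lifetime of 3600 seconds and a shared-cache lifetime of 86400 seconds, while the `/docs.mdx` policy carries the `immutable` flag.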

Measure Response Time

# Time discovery endpoint
time curl -s http://localhost:3000/llms.txt > /dev/null

# Time full docs
time curl -s http://localhost:3000/llms-full.txt > /dev/null

# Time markdown page
time curl -s http://localhost:3000/docs.mdx > /dev/null
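
A single `time curl` run can be noisy, especially on the first request after a rebuild. The sketch below repeats a call several times and reports the spread; `time_calls` and `fetch` are illustrative names, not project utilities.

```python
import statistics
import time
from urllib.request import urlopen


def time_calls(fn, repeats: int = 5) -> dict:
    """Call fn() `repeats` times and report min/median/max wall time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {"min": min(samples), "median": statistics.median(samples), "max": max(samples)}


def fetch(url: str) -> int:
    """Download a URL and return the body size in bytes (requires the dev server)."""
    with urlopen(url, timeout=30) as resp:
        return len(resp.read())


# With the dev server running:
# print(time_calls(lambda: fetch("http://localhost:3000/llms-full.txt")))
```

The median is usually the most useful figure here, since the maximum tends to reflect a cold first request.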

Integration Test (AI Agent Simulation)

Simulate Discovery Flow

# Step 1: Discover documentation
curl http://localhost:3000/llms.txt

# Step 2: Get specific page
curl http://localhost:3000/docs.mdx

# Step 3: Use content negotiation
curl -H "Accept: text/markdown" http://localhost:3000/docs

Expected: All three methods should work seamlessly
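
An agent performing step 1 then has to turn the discovery file into a list of pages to fetch. A minimal sketch of that parsing step follows, assuming entries use the `- [Title](URL): description` form shown in the sample `/llms.txt` output earlier; the names are illustrative.

```python
import re

# One entry per line: "- [Title](URL): description"
LINK = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)", flags=re.MULTILINE)


def parse_llms_txt(text: str) -> list[dict]:
    """Return the title and URL of every page listed in an llms.txt document."""
    return [m.groupdict() for m in LINK.finditer(text)]
```

Each returned URL can then be requested directly, which corresponds to step 2 of the flow.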

Simulate RAG System

# Get all documentation for embedding
curl http://localhost:3000/llms-full.txt > docs.txt

# Verify file size is reasonable
wc -l docs.txt
du -h docs.txt

# Check structure
head -100 docs.txt
tail -50 docs.txt
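
Before embedding, a RAG pipeline would typically split the download into one record per page. The sketch below does that under the assumption that every page starts with a rule of `=` characters, a `PAGE n OF m` line, and another rule, followed by `TITLE:` and `URL:` lines as in the sample output earlier. Adjust the patterns if the real file differs.

```python
import re

# Assumed page header: a rule of "=", "PAGE n OF m", and another rule.
PAGE_HEADER = re.compile(r"^=+\nPAGE (\d+) OF \d+\n=+\n", flags=re.MULTILINE)


def split_pages(text: str) -> list[dict]:
    """Split llms-full.txt into one record per page with number, title, URL, body."""
    headers = list(PAGE_HEADER.finditer(text))
    pages = []
    for i, header in enumerate(headers):
        # A page body runs from the end of its header to the start of the next one.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[header.end():end].strip()
        title = re.search(r"^TITLE: (.+)$", body, flags=re.MULTILINE)
        url = re.search(r"^URL: (\S+)$", body, flags=re.MULTILINE)
        pages.append({
            "number": int(header.group(1)),
            "title": title.group(1) if title else None,
            "url": url.group(1) if url else None,
            "body": body,
        })
    return pages
```

Keeping the title and URL alongside each body lets retrieved chunks be cited back to their source page.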

Browser DevTools Test

Network Tab

Open DevTools

Press F12 or Cmd+Option+I (Mac)

Navigate to Network Tab

Click the "Network" tab

Visit Page

Go to http://localhost:3000/docs

Click Copy Button

Click "Copy Markdown" button

Observe Request

Check for:

Request to /docs.mdx
Status: 200 OK
Type: text/markdown

Console Tab

Open Console

Click the "Console" tab in DevTools

Check for Errors

Expected: No errors in console

Test Copy Function

Click "Copy Markdown" button

Expected: No errors, success feedback shown

Checklist

Run through this checklist to verify everything works:

Troubleshooting

Endpoint Not Found

Clear the .next cache and rebuild:

rm -rf .next/
pnpm dev

Markdown Not Returning

Check source.config.ts:

includeProcessedMarkdown: true, // Must be present

Copy Button Not Working

Check browser console for errors. Verify:

Button component imported correctly
markdownUrl prop provided
Fetch API available

GitHub Link Incorrect

Update in app/docs/[[...slug]]/page.tsx:

githubUrl={`https://github.com/wyattowalsh/ai-web-feeds/blob/main/apps/web/content/docs/${page.file.path}`}

Headers Missing

Verify in route files:

return new Response(content, {
  headers: {
    "Content-Type": "text/plain; charset=utf-8",
    "X-Content-Pages": pages.length.toString(),
    "X-Generated-Date": new Date().toISOString(),
  },
});

RSS Feed Not Found

Check route file exists:

ls -la app/rss.xml/route.ts
ls -la app/docs/rss.xml/route.ts

If missing, recreate or check build output.

RSS Feed Empty

Verify source.getPages() returns pages:

// In lib/rss.ts
const pages = source.getPages();
console.log("Pages found:", pages.length);

Invalid RSS/Atom XML

Ensure special characters (&, <, >) in feed text are escaped as XML entities
Validate with W3C Feed Validator
Check for proper UTF-8 encoding
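
To see why escaping matters, the sketch below renders one feed item with the standard library's XML escaping and then checks that the result parses. It is written in Python purely as an illustration; the site's own feed code lives in its route files, and `build_item` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_item(title: str, link: str) -> str:
    """Render one RSS <item>, escaping &, <, and > in the text content."""
    return f"<item><title>{escape(title)}</title><link>{escape(link)}</link></item>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the document parses as XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

An unescaped ampersand in a title or query string is enough to make the whole feed unparseable, so running `is_well_formed` on the feed body is a quick local check before trying the W3C validator.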

Next Steps

Once all tests pass:

Customize - Update GitHub URLs, add more AI tools
Deploy - Push to production
Monitor - Check analytics and usage
Iterate - Gather feedback and improve

Quick Reference - Commands and endpoints
AI Integration - Complete AI guide
llms-full.txt Format - Format specification
