Twitter/X and arXiv Integration
Generate RSS feeds from Twitter/X and arXiv for AI research tracking
Overview
AI Web Feeds provides native integrations for Twitter/X and arXiv, enabling you to track AI researchers, discussions, and papers through RSS feeds.
Twitter/X Integration
Supported Feed Types
Get tweets from a specific user.
- id: "karpathy-twitter"
  site: "https://twitter.com/karpathy"
  title: "Andrej Karpathy on Twitter"
  topics: ["ai", "ml", "research"]
  source_type: "twitter"
  mediums: ["text"]
  platform_config:
    platform: "twitter"
    twitter:
      username: "karpathy"
      nitter_instance: "nitter.net" # Optional, defaults to nitter.net

Generated Feed URL: https://nitter.net/karpathy/rss
Get tweets from a Twitter list.
- id: "ai-researchers-list"
  site: "https://twitter.com/i/lists/1234567890"
  title: "AI Researchers List"
  topics: ["ai", "research"]
  source_type: "twitter"
  platform_config:
    platform: "twitter"
    twitter:
      list_id: "1234567890"

Generated Feed URL: https://nitter.net/i/lists/1234567890/rss
Get tweets matching a search query.
- id: "twitter-llm-search"
  site: "https://twitter.com/search"
  title: "Twitter Search - LLM discussions"
  topics: ["llm", "community"]
  source_type: "twitter"
  platform_config:
    platform: "twitter"
    twitter:
      search_query: "LLM OR large language model"

Generated Feed URL: https://nitter.net/search/rss?q=LLM+OR+large+language+model
Configuration Schema
The platform_config.twitter object supports:
| Field | Type | Description |
|---|---|---|
| username | string | Twitter username (without @) |
| list_id | string | Twitter list ID |
| search_query | string | Twitter search query |
| nitter_instance | string | Nitter instance URL (default: nitter.net) |
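
The mapping from these fields to the generated feed URLs shown earlier can be sketched as follows. This is a minimal illustration, not the project's implementation: `build_nitter_feed_url` is a hypothetical name, the precedence between fields is an assumption, and `nitter_instance` is treated as a bare hostname as in the examples.

```python
from urllib.parse import quote_plus

DEFAULT_NITTER_INSTANCE = "nitter.net"  # default named in the schema


def build_nitter_feed_url(twitter_cfg: dict) -> str:
    """Hypothetical helper: map a platform_config.twitter dict to a Nitter RSS URL."""
    instance = twitter_cfg.get("nitter_instance", DEFAULT_NITTER_INSTANCE)
    base = f"https://{instance}"

    # Assumption: username wins over list_id, which wins over search_query.
    if "username" in twitter_cfg:
        return f"{base}/{twitter_cfg['username'].lstrip('@')}/rss"
    if "list_id" in twitter_cfg:
        return f"{base}/i/lists/{twitter_cfg['list_id']}/rss"
    if "search_query" in twitter_cfg:
        # quote_plus encodes spaces as "+", matching the search example.
        return f"{base}/search/rss?q={quote_plus(twitter_cfg['search_query'])}"
    raise ValueError("expected one of: username, list_id, search_query")


print(build_nitter_feed_url({"username": "karpathy"}))
print(build_nitter_feed_url({"search_query": "LLM OR large language model"}))
```

Passing a different `nitter_instance` only changes the host portion of the URL; the path layout stays the same.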
Alternative Nitter Instances
For reliability, you can use different Nitter instances:
- nitter.net (default)
- nitter.privacy.com.de
- nitter.1d4.us
- nitter.kavin.rocks
arXiv Integration
Supported Feed Types
RSS feeds for specific arXiv categories.
- id: "arxiv-cs-lg"
  site: "https://arxiv.org/list/cs.LG/recent"
  title: "arXiv - Computer Science - Machine Learning"
  topics: ["research", "papers", "ml"]
  source_type: "arxiv"
  mediums: ["text"]
  platform_config:
    platform: "arxiv"
    arxiv:
      category: "cs.LG"

Generated Feed URL: http://export.arxiv.org/rss/cs.LG
Advanced search capabilities.
- id: "arxiv-transformer-search"
  site: "https://arxiv.org"
  title: "arXiv - Transformer papers"
  topics: ["research", "nlp"]
  source_type: "arxiv"
  platform_config:
    platform: "arxiv"
    arxiv:
      search_query: "all:transformer AND all:attention"
      max_results: 100

Generated Feed URL: http://export.arxiv.org/api/query?search_query=all:transformer+AND+all:attention&max_results=100&sortBy=submittedDate&sortOrder=descending
Configuration Schema
The platform_config.arxiv object supports:
| Field | Type | Description |
|---|---|---|
| category | string | arXiv category (e.g., cs.LG, stat.ML) |
| author | string | Author name for author-specific feeds |
| search_query | string | Advanced search query |
| max_results | integer | Maximum number of results (default: 50) |
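
As a rough sketch of how these fields could translate into the two generated feed URLs shown earlier, the snippet below rebuilds both. `build_arxiv_feed_url` is a hypothetical name, and checking `search_query` before `category` is an assumption. The `author` field is left out because this page does not show the URL it produces.

```python
from urllib.parse import urlencode

ARXIV_EXPORT_BASE = "http://export.arxiv.org"
DEFAULT_MAX_RESULTS = 50  # default listed in the schema


def build_arxiv_feed_url(arxiv_cfg: dict) -> str:
    """Hypothetical helper: map a platform_config.arxiv dict to a feed URL."""
    # Assumption: an explicit search_query takes priority over category.
    if "search_query" in arxiv_cfg:
        params = {
            "search_query": arxiv_cfg["search_query"],
            "max_results": arxiv_cfg.get("max_results", DEFAULT_MAX_RESULTS),
            "sortBy": "submittedDate",
            "sortOrder": "descending",
        }
        # safe=":" leaves field prefixes such as "all:" unescaped;
        # spaces are still encoded as "+".
        return f"{ARXIV_EXPORT_BASE}/api/query?{urlencode(params, safe=':')}"
    if "category" in arxiv_cfg:
        return f"{ARXIV_EXPORT_BASE}/rss/{arxiv_cfg['category']}"
    raise ValueError("expected one of: category, search_query")


print(build_arxiv_feed_url({"category": "cs.LG"}))
```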
Popular arXiv Categories for AI/ML
- cs.LG - Machine Learning
- cs.AI - Artificial Intelligence
- cs.CL - Computation and Language (NLP)
- cs.CV - Computer Vision and Pattern Recognition
- cs.NE - Neural and Evolutionary Computing
- stat.ML - Machine Learning (Statistics)
- cs.RO - Robotics
- cs.IR - Information Retrieval
arXiv Search Syntax
When using search_query, you can use arXiv's advanced search:
- au:author_name - Author search
- ti:title_words - Title search
- abs:abstract_words - Abstract search
- cat:category - Subject category search
- all:keywords - Search all fields
- Use AND, OR, ANDNOT for boolean queries
Example: all:transformer AND cat:cs.LG
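
A query string like the example above can be assembled from (prefix, term) pairs. The helper below is a hypothetical convenience, not part of AI Web Feeds. It only handles single-word terms joined by one operator, and makes no attempt at phrase quoting or grouping.

```python
VALID_PREFIXES = {"au", "ti", "abs", "cat", "all"}
VALID_OPERATORS = {"AND", "OR", "ANDNOT"}


def build_arxiv_query(clauses, operator="AND"):
    """Join (prefix, term) pairs into an arXiv search_query string."""
    if operator not in VALID_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    parts = []
    for prefix, term in clauses:
        if prefix not in VALID_PREFIXES:
            raise ValueError(f"unsupported field prefix: {prefix}")
        parts.append(f"{prefix}:{term}")
    return f" {operator} ".join(parts)


print(build_arxiv_query([("all", "transformer"), ("cat", "cs.LG")]))
```

The resulting string can be used directly as the `search_query` value in `platform_config.arxiv`.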
Implementation Details
Platform Detection
The system automatically detects Twitter/X and arXiv URLs:
Twitter/X domains:
- twitter.com, www.twitter.com
- x.com, www.x.com
arXiv domains:
- arxiv.org, www.arxiv.org
- export.arxiv.org
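
Detection by hostname against the domain lists above might look like the sketch below. `detect_platform` and the returned labels are illustrative assumptions. The actual detection code may differ, for example in how it treats other subdomains.

```python
from typing import Optional
from urllib.parse import urlparse

TWITTER_DOMAINS = {"twitter.com", "www.twitter.com", "x.com", "www.x.com"}
ARXIV_DOMAINS = {"arxiv.org", "www.arxiv.org", "export.arxiv.org"}


def detect_platform(url: str) -> Optional[str]:
    """Return "twitter", "arxiv", or None based on the URL's hostname."""
    # urlparse lowercases the hostname; a URL without a scheme yields None.
    host = urlparse(url).hostname or ""
    if host in TWITTER_DOMAINS:
        return "twitter"
    if host in ARXIV_DOMAINS:
        return "arxiv"
    return None


print(detect_platform("https://x.com/karpathy"))
print(detect_platform("https://arxiv.org/list/cs.LG/recent"))
```

Matching on the exact hostname, rather than a substring of the URL, avoids false positives such as `https://example.com/twitter.com`.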
Feed URL Generation
Platform-specific generators:
- generate_twitter_feed_url(url, platform_config) - Generates Nitter RSS URLs
- generate_arxiv_feed_url(url, platform_config) - Generates arXiv RSS/API URLs
These are automatically called during feed discovery.
Testing
Run the integration tests:
# All Twitter/arXiv tests
aiwebfeeds test file test_utils.py -k "twitter or arxiv"
# Specific test class
aiwebfeeds test file test_utils.py -k "TestTwitterIntegration"
aiwebfeeds test file test_utils.py -k "TestArxivIntegration"

Usage Examples
Adding a Twitter Feed
Add to data/feeds.yaml:
- id: "your-twitter-feed"
  site: "https://twitter.com/username"
  title: "Feed Title"
  topics: ["ai"]
  source_type: "twitter"
  platform_config:
    platform: "twitter"

Adding an arXiv Feed
Add to data/feeds.yaml:
- id: "your-arxiv-feed"
  site: "https://arxiv.org/list/cs.LG/recent"
  title: "Feed Title"
  topics: ["research", "ml"]
  source_type: "arxiv"
  platform_config:
    platform: "arxiv"

Limitations
Twitter/X
- Relies on Nitter instances which may have rate limits or availability issues
- Nitter instances may be blocked or shut down
- Consider using multiple Nitter instances for redundancy
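
One way to approach redundancy by hand is to try instances in order and keep the first that responds. The sketch below is a hypothetical illustration, not an existing feature. The reachability check is passed in as a function so the selection logic stays independent of any particular HTTP client.

```python
from typing import Callable, Iterable, Optional

# Instances listed under "Alternative Nitter Instances"; availability is not guaranteed.
NITTER_INSTANCES = [
    "nitter.net",
    "nitter.privacy.com.de",
    "nitter.1d4.us",
    "nitter.kavin.rocks",
]


def first_available_feed(
    username: str,
    instances: Iterable[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first user feed URL for which is_reachable(url) is true."""
    for instance in instances:
        url = f"https://{instance}/{username}/rss"
        if is_reachable(url):
            return url
    return None  # every instance failed the probe


# Stand-in probe for demonstration; a real one would issue an HTTP request.
def only_1d4(url: str) -> bool:
    return "nitter.1d4.us" in url


print(first_available_feed("karpathy", NITTER_INSTANCES, only_1d4))
```

The chosen instance can then be written to `nitter_instance` in the feed's `platform_config.twitter` block.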
arXiv
- RSS feeds update once per day (overnight)
- API queries limited to 100 results maximum
- API has rate limiting (3 seconds between requests recommended)
- Author searches may return false positives for common names
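
The recommended 3-second spacing between API requests can be enforced with a small throttle. This is a generic sketch with hypothetical names, not code from AI Web Feeds. The clock and sleep functions are injectable so the behavior can be tested without waiting.

```python
import time


class MinIntervalThrottle:
    """Ensure at least min_interval seconds pass between successive calls."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was released

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


# Usage: call throttle.wait() immediately before each arXiv API request.
throttle = MinIntervalThrottle(min_interval=3.0)
```

The first call returns immediately. Later calls sleep only for whatever part of the interval has not yet elapsed.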
Best Practices
- Twitter/X: Monitor your chosen Nitter instance for availability
- arXiv: Use specific categories rather than broad searches for better signal
- Both: Set an appropriate max_results to avoid overwhelming feeds
- Both: Use topic_weights to indicate relevance when a feed covers multiple topics
Future Enhancements
Potential improvements:
- Automatic Nitter instance failover
- arXiv paper metadata enrichment
- Twitter thread reconstruction
- arXiv citation tracking
- Integration with arXiv vanity for better author disambiguation