#7

Social Sentinel

January 7, 2026

Python · FastAPI · PostgreSQL · Groq · React · WebSockets

Tech/AI news aggregator and real-time social media dashboard. Pulls from 60+ RSS sources with sentiment analysis, AI-generated briefings, live pulse charts, and trend tracking.

What is it?

A real-time AI news intelligence dashboard that aggregates 60+ tech/AI feeds (48 RSS sources plus 19 Reddit subreddits), Twitter/X (without an API key), and podcast/YouTube channels. Groq runs per-tweet sentiment analysis, Gemini writes executive briefings, and new items are pushed live to the browser over WebSockets.

Twitter without an API key

No Twitter API key is used. The twikit library authenticates via stored cookies (cookies.json) and sends GraphQL requests directly to x.com/i/api/graphql/<queryId>/SearchTimeline — the same endpoint the Twitter web app uses.

The scraper patches twikit at runtime because Twitter's frontend changes regularly: it regex-extracts the current asset hash from the homepage (ondemand.s.*a.js), extracts key byte indices from that JS file, and resolves the current SearchTimeline GraphQL query ID from main.js. When twikit's parser fails, a custom GraphQL parser takes over.
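The query ID lookup can be pictured with a small sketch. The regex below assumes each operation appears in the bundle as a queryId:"...",operationName:"..." pair; that shape, the find_query_id helper, and the sample string are illustrative assumptions, not the project's actual code or real bundle content.

```python
import re
from typing import Optional

# Assumed shape of an operation entry in the bundled JS; the real bundle may differ.
OPERATION_RE = re.compile(
    r'queryId:"(?P<query_id>[^"]+)",operationName:"(?P<name>[^"]+)"'
)


def find_query_id(js_source: str, operation: str = "SearchTimeline") -> Optional[str]:
    """Return the query ID paired with `operation`, or None if it is absent."""
    for match in OPERATION_RE.finditer(js_source):
        if match.group("name") == operation:
            return match.group("query_id")
    return None


# Made-up sample text standing in for a fragment of main.js.
sample = 'e.exports={queryId:"abc123",operationName:"SearchTimeline",operationType:"query"}'
query_id = find_query_id(sample)
```

Returning None rather than raising lets the caller decide whether to retry, fall back to a cached ID, or skip the run.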

Rate limit handling: random 10-20 s delays between requests, a cap of 60 requests per hour, and a 15-minute backoff after an HTTP 429 response.
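A minimal sketch of how those three rules could fit together, assuming a sliding one-hour window for the request cap. The RequestPacer class and its parameter names are hypothetical; the clock and sleep functions are injectable only so the logic can be exercised without real waiting.

```python
import random
import time
from collections import deque


class RequestPacer:
    """Hypothetical pacer: random delay, hourly cap, fixed backoff on 429."""

    def __init__(self, max_per_hour=60, min_delay=10.0, max_delay=20.0,
                 backoff=15 * 60, clock=time.monotonic, sleep=time.sleep):
        self.max_per_hour = max_per_hour
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.backoff = backoff
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests in the last hour

    def wait_turn(self):
        """Block until the next request is allowed, then record it."""
        now = self._clock()
        # Forget requests that have left the one-hour window.
        while self._sent and now - self._sent[0] >= 3600:
            self._sent.popleft()
        # At the cap: wait until the oldest request leaves the window.
        if len(self._sent) >= self.max_per_hour:
            self._sleep(3600 - (now - self._sent[0]))
            self._sent.popleft()
        # Random jitter between consecutive requests.
        self._sleep(random.uniform(self.min_delay, self.max_delay))
        self._sent.append(self._clock())

    def on_status(self, status_code):
        """Pause for the full backoff period after an HTTP 429."""
        if status_code == 429:
            self._sleep(self.backoff)
```

A caller would invoke wait_turn() before each search request and on_status(response_status) after it.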

The SQLite → PostgreSQL migration

The project started with SQLite (social_sentinel.db is still in the repo). As data volume grew and Cloud Run's ephemeral filesystem made persistence unreliable, the system migrated to PostgreSQL on Cloud SQL.

Migration artifacts: scripts/migrate_to_pg.py (3-table migration: articles, tweets, media), scripts/init_postgres_schema.py, scripts/install_postgres_local.sh for local dev, and SETUP_SQL.md documenting the whole process. The SQLite file is kept as a snapshot.
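The core of a table-by-table copy might look like the sketch below. It is not the contents of scripts/migrate_to_pg.py: copy_table, the batch size, and the assumption that source and target tables share column names are all illustrative. It works with any two DB-API connections; the placeholder argument would be "%s" for a PostgreSQL driver such as psycopg2 and "?" for sqlite3.

```python
import sqlite3

TABLES = ("articles", "tweets", "media")  # table names from the write-up


def copy_table(src, dst, table, placeholder="%s", batch_size=500):
    """Copy every row of `table` from src to dst and return the row count."""
    if table not in TABLES:
        # Table names are interpolated into SQL, so only allow known ones.
        raise ValueError(f"unexpected table: {table}")
    src_cur = src.cursor()
    src_cur.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in src_cur.description]
    insert_sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), ", ".join([placeholder] * len(columns))
    )
    dst_cur = dst.cursor()
    copied = 0
    while True:
        rows = src_cur.fetchmany(batch_size)  # bounded memory per batch
        if not rows:
            break
        dst_cur.executemany(insert_sql, rows)
        copied += len(rows)
    dst.commit()
    return copied
```

Copying in batches keeps memory flat regardless of table size, and committing once per table means a failed run leaves that table untouched on the target.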

Data collection and API key rotation

48 RSS feeds and 19 Reddit subreddits are fetched with feedparser. Articles are deduplicated by hashing the article link. Cron triggers 4-8 collection runs per day, logged in logs/cron_collect_*.log.
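Deduplication by URL hash can be sketched as follows. The choice of SHA-256, the whitespace stripping, and the in-memory seen set are assumptions; in the real system the set of known hashes would presumably live in the database. Entries are treated as mappings with a "link" key, which is how feedparser entries can be accessed.

```python
import hashlib


def url_key(link: str) -> str:
    """Stable fixed-length key for an article link (SHA-256 is an assumed choice)."""
    return hashlib.sha256(link.strip().encode("utf-8")).hexdigest()


def new_entries(entries, seen):
    """Return entries whose link hash is not in `seen`, updating `seen` in place."""
    fresh = []
    for entry in entries:
        key = url_key(entry["link"])
        if key in seen:
            continue  # already collected in this or an earlier run
        seen.add(key)
        fresh.append(entry)
    return fresh
```

A fixed-length hash also makes a convenient unique-index column, which is one common reason to hash the link instead of comparing raw URLs.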

On the AI side, Groq (llama-3.3-70b-versatile) does per-tweet sentiment analysis: positive/negative/neutral with a confidence score. Gemini 2.5 Flash generates executive briefings from headline batches, rotating across 4 API keys to stay within per-key rate limits.
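The write-up does not say how the keys are rotated; one simple possibility is plain round-robin, sketched here. KeyRotator and the placeholder key strings are hypothetical, and a lock is included on the assumption that several requests might ask for a key concurrently.

```python
import itertools
import threading


class KeyRotator:
    """Hypothetical round-robin rotation over a fixed list of API keys."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one API key is required")
        self._cycle = itertools.cycle(list(keys))
        self._lock = threading.Lock()

    def next_key(self) -> str:
        """Return the next key in order, wrapping around after the last one."""
        with self._lock:
            return next(self._cycle)


# Placeholder values; real keys would come from environment variables or a secret store.
rotator = KeyRotator(["key-1", "key-2", "key-3", "key-4"])
```

With four keys, each one handles a quarter of the briefing requests, so the effective limit is roughly four times the per-key limit.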

New articles fan out immediately to all connected browser sessions over FastAPI's WebSocket support: push-only, no polling.
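A fan-out of this kind is often built around a small connection registry. The sketch below is an assumed design, not the project's code: ConnectionManager is a hypothetical name, and it accepts any object with an async send_text method, which FastAPI's WebSocket provides, so it can be tried without a running server.

```python
import asyncio
import json


class ConnectionManager:
    """Hypothetical registry that pushes each new article to every open socket."""

    def __init__(self):
        self._connections = set()

    def connect(self, ws):
        self._connections.add(ws)

    def disconnect(self, ws):
        self._connections.discard(ws)

    async def broadcast(self, article: dict) -> int:
        """Send `article` as JSON to all connections; return how many succeeded."""
        message = json.dumps(article)
        targets = list(self._connections)
        results = await asyncio.gather(
            *(ws.send_text(message) for ws in targets),
            return_exceptions=True,  # one dead socket must not block the rest
        )
        delivered = 0
        for ws, result in zip(targets, results):
            if isinstance(result, Exception):
                self.disconnect(ws)  # drop sockets that failed to receive
            else:
                delivered += 1
        return delivered
```

In a FastAPI endpoint, the handler would call await websocket.accept(), register the socket with connect(), and call disconnect() when the client goes away; the collector would call broadcast() whenever a new article is stored.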

Deployment

Cloud Run, 8Gi RAM (PostgreSQL driver + twikit + ML libs are memory-hungry). Frontend on Netlify at social-sentinel.netlify.app with custom domain social.cbproforge.com. The 8Gi allocation is notably higher than other projects — Twitter scraping with in-process parsing and keeping LLM context in memory adds up.

Key takeaways

  • Twitter scraping without API: cookie auth, GraphQL endpoint discovery, runtime patching for frontend changes
  • SQLite to PostgreSQL migration: when to upgrade, migration scripts, Cloud Run persistence constraints
  • Multiple Gemini API key rotation for rate limit management
  • feedparser for RSS/Atom: handling malformed feeds, deduplication by URL hash
  • WebSocket fan-out in FastAPI: connection management, push on new data