birdclaw
v0.4.1
Local Twitter memory in SQLite for archives, DMs, likes, bookmarks, and moderation
birdclaw 🪶 — Local Twitter memory in SQLite: archives, DMs, likes, bookmarks
birdclaw is a local-first Twitter workspace: archive import, cached live reads, focused triage, and reply flows in one local web app + CLI. Built by @steipete.
Status: WIP. Real and usable. Not done. Expect schema churn, transport gaps, and rough edges while the core settles.
What It Does
- keeps your Twitter data in local SQLite
- stores media and avatar cache under ~/.birdclaw
- imports archives when you have them
- still works when you do not
- gives you a clean local UI for home, mentions, DMs, inbox, and blocks
- exposes scriptable JSON for agents and automation
What Works Today
Local data + storage
- one shared SQLite DB for multiple accounts, with canonical tweets/profiles and account-scoped timeline/collection edges
- FTS5 search over tweets and DMs
- archive autodiscovery on macOS
- archive import for tweets, likes, profiles, and full DMs
- archive import for bookmark exports when present
- live likes and bookmarks sync through xurl or bird
- Git-friendly text backups with yearly tweet shards and yearly DM shards that keep conversation_id on each row
- profile hydration from live Twitter metadata
- profile-change history, affiliation badge edges, and extracted bio entities for local identity lookups
- local avatar cache
- local media cache root under ~/.birdclaw
Web UI
- Home timeline
- Mentions queue
- Likes and Bookmarks review lanes
- DMs workspace with two-column layout
- Inbox for mixed mention + DM triage
- Blocks for local blocklist maintenance
- constrained timeline lane instead of full-width dashboard UI
- tweet expansion with URLs, inline images, quoted tweets, replies, and profile hover cards
- sender bio and influence context in the DM detail header
- system / light / dark theme switcher with animated transition
Triage + filtering
- replied / unreplied filters for timelines
- DM filters by participant, followers, and derived influence score
- AI-ranked inbox for mentions + DMs
- OpenAI scoring hook for low-signal filtering
- cached live mentions export in xurl-compatible JSON
- liked/bookmarked tweet filters for archive and live-synced collections
- live profile-reply inspection for borderline AI/slop triage
- one-shot blocklist import from a file for batch moderation passes
Actions
- post tweets
- reply to tweets
- reply to DMs
- add / remove local blocks
- import batch blocklists in one call
- add / remove local mutes
- sync remote blocks through xurl when available
- fall back to the Twitter web cookie session when OAuth2 block writes are rejected
Safety
- local-first by default
- tests disable live writes
- CI disables live writes
- app has no auth layer because it is a local-only tool
Still In Progress
- broader resumable live sync beyond the targeted paths already wired
- fuller media fetch pipeline
- richer multi-account UX
- more complete transport coverage
- more archive edge-case handling
If you need polished product-grade sync parity today, this is not there yet.
Screens
- Home: read and reply without fighting the main Twitter timeline
- Mentions: work the reply queue with clean filters
- Likes / Bookmarks: revisit saved posts from archive or live sync
- DMs: triage by sender context, follower count, and influence
- Inbox: let heuristics / OpenAI float likely-important items
- Blocks: maintain a local-first account-scoped blocklist
Storage
Default root:
~/.birdclaw
Important paths:
- DB: ~/.birdclaw/birdclaw.sqlite
- media cache: ~/.birdclaw/media
- avatar cache: ~/.birdclaw/media/thumbs/avatars
- Playwright test home: .playwright-home
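As a rough sketch of how the paths listed above hang off a single root, the helper below derives them from one directory string. The name `resolvePaths` and the returned field names are illustrative assumptions, not birdclaw's actual API; only the directory layout is taken from the list above.

```typescript
// Hypothetical helper: derives the documented paths from one root directory.
// Field names are illustrative; only the layout comes from the README.
interface BirdclawPaths {
  root: string;
  db: string;
  mediaCache: string;
  avatarCache: string;
}

function resolvePaths(root: string): BirdclawPaths {
  // Drop trailing slashes so the joins below never produce "//".
  const base = root.replace(/\/+$/, "");
  return {
    root: base,
    db: `${base}/birdclaw.sqlite`,
    mediaCache: `${base}/media`,
    avatarCache: `${base}/media/thumbs/avatars`,
  };
}
```

Calling `resolvePaths("/path/to/custom/root")` would give the same layout under a custom root, which is the effect a root override is expected to have.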
Override the root:
export BIRDCLAW_HOME=/path/to/custom/root
Requirements
- Node 25.8.1 or Node 26.x
- pnpm
- macOS recommended for Spotlight archive discovery
- xurl optional for live reads / writes
- bird optional for cookie-backed likes, bookmarks, mentions, DMs, and write fallback
- OpenAI API key optional for inbox scoring
Install
Homebrew:
brew install steipete/tap/birdclaw
From source:
fnm use
pnpm install
Run
pnpm dev
Open:
http://localhost:3000
Quick Start
Initialize local state:
birdclaw init
birdclaw auth status --json
birdclaw db stats --json
Find and import an archive:
birdclaw archive find --json
birdclaw import archive --json
birdclaw import archive ~/Downloads/twitter-archive-2025.zip --json
birdclaw import hydrate-profiles --json
Back up the local SQLite store as canonical JSONL text:
birdclaw backup sync --repo ~/Projects/backup-birdclaw --remote https://github.com/steipete/backup-birdclaw.git --json
Merge the backup into the current BIRDCLAW_HOME:
birdclaw backup import ~/Projects/backup-birdclaw --json
Start the app:
birdclaw serve
First moderation pass:
pnpm cli mentions export --mode xurl --refresh --all --max-pages 9 --limit 100
pnpm cli profiles replies @borderline_handle --limit 12 --json
pnpm cli blocks import ~/triage/blocklist.txt --account acct_primary --json
CLI Highlights
Search local tweets
pnpm cli search tweets "local-first" --json
pnpm cli search tweets "sync engine" --limit 20 --json
pnpm cli search tweets --since 2020-01-01 --until 2021-01-01 --originals-only --hide-low-quality --limit 500 --json
pnpm cli search tweets --liked --limit 20 --json
pnpm cli search tweets --bookmarked --limit 20 --json
Sync likes, bookmarks, and home timeline
auto tries xurl first, then falls back to bird. Use bird directly when the API path is unavailable for the account/token you have locally.
pnpm cli sync likes --mode auto --limit 100 --refresh --json
pnpm cli sync bookmarks --mode auto --limit 100 --refresh --json
pnpm cli sync bookmarks --mode bird --all --max-pages 5 --limit 100 --refresh --json
pnpm cli sync timeline --limit 100 --refresh --json
pnpm cli sync mention-threads --limit 30 --delay-ms 1500 --timeout-ms 15000 --json
Export mentions for agents
Default birdclaw mode returns normalized items with text, plainText, markdown, author metadata, and canonical URLs:
pnpm cli mentions export "agent" --unreplied --limit 10
Cached live modes return xurl-compatible data/includes/meta, but stay in the local SQLite cache so repeat reads do not keep spending live calls:
pnpm cli mentions export --mode bird --limit 20
pnpm cli mentions export --mode bird --refresh --limit 20
pnpm cli mentions export --mode xurl --limit 5
pnpm cli mentions export --mode xurl --refresh --limit 5
pnpm cli mentions export --mode xurl --refresh --all --max-pages 9 --limit 100
pnpm cli mentions export "courtesy" --mode xurl --limit 5
Home config lives in ~/.birdclaw/config.json. Example:
{
  "actions": {
    "transport": "auto"
  },
  "mentions": {
    "dataSource": "bird",
    "birdCommand": "/Users/steipete/Projects/bird/bird"
  }
}
Notes:
- --refresh forces a live fetch
- --cache-ttl <seconds> tunes freshness
- --all walks every retrievable mentions page; --max-pages caps that scan
- in paged xurl mode, --limit is the per-page size
- mentions.dataSource controls live mention reads only
- actions.transport controls live block/mute writes only
- actions.transport accepts auto, bird, or xurl
- bird mode uses your local bird CLI and caches its mentions output into birdclaw's canonical store
- filters still work in xurl mode; filtered payloads are rebuilt from the local canonical store after sync
- sync likes, sync bookmarks, sync timeline, and sync mention-threads store live results in the canonical local store; per-account home/mention/like/bookmark membership is kept as edges so shared tweets do not clobber account ownership
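The last note describes canonical tweets plus per-account membership edges. A minimal in-memory sketch of that idea follows. The type names, method names, and key format are hypothetical; the real store is SQLite and its schema is not shown in this README.

```typescript
// Hypothetical in-memory model; the real store is SQLite and may differ.
type EdgeKind = "home" | "mention" | "like" | "bookmark";

interface Tweet {
  id: string;
  text: string;
}

class CanonicalStore {
  // One canonical row per tweet id, shared by every account.
  private tweets = new Map<string, Tweet>();
  // Membership edges: (account, kind) -> set of tweet ids.
  private edges = new Map<string, Set<string>>();

  upsert(accountId: string, kind: EdgeKind, tweet: Tweet): void {
    // Refresh the shared tweet content.
    this.tweets.set(tweet.id, tweet);
    // Record membership for this account only; other accounts' edges are untouched.
    const key = JSON.stringify([accountId, kind]);
    const members = this.edges.get(key) ?? new Set<string>();
    members.add(tweet.id);
    this.edges.set(key, members);
  }

  list(accountId: string, kind: EdgeKind): Tweet[] {
    const members = this.edges.get(JSON.stringify([accountId, kind]));
    const result: Tweet[] = [];
    members?.forEach((id) => {
      const tweet = this.tweets.get(id);
      if (tweet) result.push(tweet);
    });
    return result;
  }
}
```

With this shape, a second account bookmarking a tweet that the first account liked updates the shared row but leaves the first account's like edge in place.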
Research bookmarks and threads
birdclaw research turns bookmarked tweets into a markdown brief with local thread expansion, live ancestor lookup when needed, and extracted links/handles:
birdclaw research "codex" --limit 20 --thread-depth 10 --json
birdclaw research --account acct_primary --out ~/research/codex.md
Search and triage DMs
pnpm cli search dms "prototype" --json
pnpm cli search dms "layout" --min-followers 1000 --min-influence-score 120 --sort influence --json
pnpm cli search dms "blacksmith" --context 4 --resolve-profiles --expand-urls --no-xurl-fallback --json
pnpm cli whois "blacksmith guy" --context 4 --no-xurl-fallback --json
pnpm cli whois "github guy" --current-affiliation github --exclude-domain-only --no-xurl-fallback
pnpm cli whois "blacksmith" --tweets --context 4 --no-xurl-fallback --json
pnpm cli dms sync --limit 50 --refresh --json
pnpm cli dms list --refresh --limit 10 --json
pnpm cli dms list --unreplied --min-followers 500 --min-influence-score 90 --sort influence --json
--resolve-profiles fills archive-imported numeric DM profiles through the local
cache first, then bird, then xurl unless --no-xurl-fallback is set.
Resolved profiles keep bio, location, profile URL, verification type, structured
URL entities, raw profile JSON, and any X affiliation badge metadata Birdclaw can
see. When a highlighted-label badge only gives a synthetic label plus handle,
Birdclaw tries to hydrate that org handle through bird and rewrites the edge to
the real local organization profile id. Profile changes are snapshotted over
time, and bios are indexed for @handles, domains, and company phrases.
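To make the bio indexing concrete, here is a rough sketch of pulling @handles and domains out of a bio string. The regular expressions and the `extractBioEntities` name are assumptions for illustration; Birdclaw's real extractor, including its company-phrase detection, is not documented here.

```typescript
// Hypothetical extractor; patterns are illustrative, not Birdclaw's own.
interface BioEntities {
  handles: string[];
  domains: string[];
}

function extractBioEntities(bio: string): BioEntities {
  const handles = new Set<string>();
  const domains = new Set<string>();
  // @handle: 1-15 word characters, not preceded by a word character (skips emails).
  const handlePattern = /(^|[^A-Za-z0-9_])@([A-Za-z0-9_]{1,15})/g;
  // Bare or URL-style domains such as "github.com" or "https://example.com/about".
  const domainPattern = /(?:https?:\/\/)?((?:[a-z0-9-]+\.)+[a-z]{2,})(?=[\/\s,;)]|$)/gi;
  let match: RegExpExecArray | null;
  while ((match = handlePattern.exec(bio)) !== null) {
    handles.add(match[2].toLowerCase());
  }
  while ((match = domainPattern.exec(bio)) !== null) {
    domains.add(match[1].toLowerCase());
  }
  return { handles: Array.from(handles), domains: Array.from(domains) };
}
```

Entities are lowercased and de-duplicated so the same handle or domain written in different casing indexes to one entry.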
whois uses that profile context plus DM context and cached URL expansion to
return typed evidence such as profile_bio, profile_url, profile_bio_url,
affiliation, bio_handle, bio_domain, bio_company, profile_history,
dm_context, and expanded_url. It keeps a derived identity_search_index
for fast local profile-evidence lookups, ranks current affiliation and bio
identity evidence above plain domains, and groups human output into likely
affiliated, ecosystem, link-only, and DM-context buckets. Use
--affiliation, --current-affiliation, and --exclude-domain-only when you
want "GitHub people" rather than anyone with a github.com link.
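The ranking idea above can be sketched as a tier lookup over the evidence types. The type names come from the paragraph above; the tier numbers, the choice of which types count as "domain only", and the function names are invented for illustration and may not match how whois actually scores results.

```typescript
// Evidence type names are from the README; tiers below are assumptions.
type EvidenceType =
  | "affiliation" | "bio_handle" | "bio_company" | "bio_domain"
  | "profile_bio" | "profile_url" | "profile_bio_url"
  | "profile_history" | "dm_context" | "expanded_url";

interface Candidate {
  handle: string;
  evidence: EvidenceType[];
}

// Lower tier = stronger signal (assumed ordering).
const TIER: Record<EvidenceType, number> = {
  affiliation: 0,
  bio_handle: 1,
  bio_company: 1,
  profile_bio: 2,
  profile_history: 2,
  dm_context: 3,
  bio_domain: 4,
  profile_url: 4,
  profile_bio_url: 4,
  expanded_url: 4,
};

// Assumed set of link-only evidence types.
const DOMAIN_ONLY: EvidenceType[] = ["bio_domain", "profile_url", "profile_bio_url", "expanded_url"];

function bestTier(candidate: Candidate): number {
  return Math.min(...candidate.evidence.map((e) => TIER[e]));
}

function rankCandidates(candidates: Candidate[], excludeDomainOnly = false): Candidate[] {
  return candidates
    // Keep a candidate if filtering is off or it has at least one non-link evidence item.
    .filter((c) => !excludeDomainOnly || c.evidence.some((e) => DOMAIN_ONLY.indexOf(e) === -1))
    .sort((a, b) => bestTier(a) - bestTier(b));
}
```

Under these assumed tiers, someone with an affiliation badge sorts ahead of someone who only has a matching link, and the exclude option drops the link-only profile entirely.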
AI inbox
pnpm cli inbox --json
pnpm cli inbox --kind dms --limit 10 --json
pnpm cli inbox --score --hide-low-signal --limit 8 --json
Blocklist
pnpm cli blocks list --account acct_primary --json
pnpm cli blocks sync --account acct_primary --json
pnpm cli blocks import ~/triage/blocklist.txt --account acct_primary --json
pnpm cli blocks add @amelia --account acct_primary --json
pnpm cli blocks record @amelia --account acct_primary --json
pnpm cli blocks remove @amelia --account acct_primary --json
pnpm cli ban @amelia --account acct_primary --transport auto --json
pnpm cli unban @amelia --account acct_primary --transport bird --json
Notes:
- ban/unban accept --transport auto|bird|xurl
- auto tries bird first, then falls back to xurl, then x-web cookie-backed block/unblock when both fail
- forced xurl writes still verify through bird status before sqlite changes
- Twitter still rejects pure OAuth2 block writes, so auto is the safe default for block/unblock
- blocks import accepts newline-delimited blocklists with comments and markdown bullets
- blocks sync is for slow/manual remote reconciliation; not for a hot cron loop
- blocks record stores a known-good remote block locally without issuing another live write
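The blocklist format accepted by blocks import (newline-delimited, with comments and markdown bullets) can be approximated with a small parser. This is a sketch under assumptions: the `parseBlocklist` name, the treatment of status URLs (taking the first path segment as the handle), and the de-duplication are guesses, not the tool's documented rules.

```typescript
// Hypothetical parser; the real `blocks import` rules may differ.
function parseBlocklist(text: string): string[] {
  const handles: string[] = [];
  const seen = new Set<string>();
  for (const rawLine of text.split(/\r?\n/)) {
    // Trim whitespace and drop a leading markdown bullet ("- " or "* ").
    const line = rawLine.trim().replace(/^[-*]\s+/, "");
    if (line === "" || line.startsWith("#")) continue; // blank line or comment
    // Only the first token names the account; the rest is a free-form note.
    const token = line.split(/\s+/)[0] ?? "";
    let handle: string | undefined;
    const urlMatch = token.match(
      /^https?:\/\/(?:www\.)?(?:x|twitter)\.com\/([A-Za-z0-9_]{1,15})(?:[\/?#]|$)/,
    );
    if (urlMatch) {
      handle = urlMatch[1]; // assumption: the URL's first path segment is the account
    } else if (/^@?[A-Za-z0-9_]{1,15}$/.test(token)) {
      handle = token.replace(/^@/, "");
    }
    if (handle === undefined) continue; // unrecognized line: skip it
    const key = handle.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      handles.push(handle);
    }
  }
  return handles;
}
```

Lines that match none of the recognized shapes are skipped rather than rejected, which keeps a hand-edited triage file forgiving.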
Example blocklist file:
# crypto / AI slop
@jpctan
@SystemDaddyAi
- @Pepe202579 memecoin bait
https://x.com/someone/status/2030857479001960633?s=20
Profile reply scan
pnpm cli profiles replies @jpctan --limit 12 --json
Notes:
- for the "unsure if AI" case
- scans recent authored tweets, excludes retweets, keeps replies
- useful for spotting repeated generic praise, abstraction soup, or cross-thread templated cadence
Typical tell:
- same upbeat, generic reply shape across unrelated threads in a short time window
Mutes
pnpm cli mutes list --account acct_primary --json
pnpm cli mute @amelia --account acct_primary --transport xurl --json
pnpm cli mutes record @amelia --account acct_primary --json
pnpm cli unmute @amelia --account acct_primary --transport auto --json
Notes:
- mute/unmute accept --transport auto|bird|xurl
- target profile resolution prefers bird user --json before any xurl /2/users lookup
- auto tries bird first, then falls back to xurl when bird fails
- forced xurl writes still verify through bird status before sqlite changes
- mutes record stores a known-good remote mute locally without issuing another live write
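The auto behaviour in the notes above amounts to an ordered fallback chain: try each transport in turn and stop at the first success. The sketch below uses synchronous stand-ins and hypothetical names (`Transport`, `runWithFallback`); the real bird and xurl integrations are external CLI calls and would be asynchronous.

```typescript
// Hypothetical transport shape; real transports shell out to bird or xurl.
interface Transport {
  name: string;
  run: () => void; // throws on failure
}

interface FallbackResult {
  usedTransport: string;
  failures: string[];
}

function runWithFallback(transports: Transport[]): FallbackResult {
  const failures: string[] = [];
  for (const transport of transports) {
    try {
      transport.run();
      // First success wins; earlier failures are kept for diagnostics.
      return { usedTransport: transport.name, failures };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${transport.name}: ${reason}`);
    }
  }
  throw new Error(`all transports failed: ${failures.join("; ")}`);
}
```

For a mute in auto mode the array would list bird before xurl; a forced transport is the same call with a single-element array.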
Test env hardening
- Playwright strips inherited --localstorage-file from NODE_OPTIONS before starting Vite
- this avoids cross-repo test warnings when another repo injected that flag
Compose / reply
pnpm cli compose post "Ship local software."
pnpm cli compose reply tweet_004 "On it."
pnpm cli compose dm dm_003 "Send it over."
Text Backup
birdclaw backup export writes deterministic JSONL shards that can rebuild the local SQLite index without committing SQLite WAL/SHM files, FTS shadow tables, or transient live caches.
Layout:
manifest.json
data/accounts.jsonl
data/profiles.jsonl
data/profile_affiliations.jsonl
data/profile_snapshots.jsonl
data/profile_bio_entities.jsonl
data/tweets/YYYY.jsonl
data/tweets/unknown.jsonl
data/timeline_edges/home.jsonl
data/timeline_edges/mention.jsonl
data/collections/likes.jsonl
data/collections/bookmarks.jsonl
data/dms/conversations.jsonl
data/dms/YYYY.jsonl
data/moderation/blocks.jsonl
data/moderation/mutes.jsonl
Tweets are sharded by year for human browsing and yearly analysis. Collection-only tweets whose real creation date is unknown go into data/tweets/unknown.jsonl instead of pretending they belong to 1970. Timeline membership is stored in data/timeline_edges; likes and bookmarks are stored as account-scoped collection edges in data/collections. DMs are sharded by year with conversation_id in each row; this keeps Git fast while preserving conversation membership.
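The yearly sharding rule for tweets can be sketched as a function from a row to its shard path. The row shape, the `createdAt` field name, the use of the UTC year, and sorting by id are assumptions made for the example; only the shard paths come from the layout above.

```typescript
// Hypothetical row shape; `createdAt` is an assumed field name.
interface TweetRow {
  id: string;
  createdAt: string | null; // ISO 8601 timestamp, or null when unknown
}

function tweetShardPath(row: TweetRow): string {
  const time = row.createdAt === null ? NaN : Date.parse(row.createdAt);
  // Unknown or unparseable dates go to the explicit "unknown" shard.
  if (Number.isNaN(time)) return "data/tweets/unknown.jsonl";
  // Assumption: shards follow the UTC calendar year.
  return `data/tweets/${new Date(time).getUTCFullYear()}.jsonl`;
}

function groupByShard(rows: TweetRow[]): Map<string, TweetRow[]> {
  const shards = new Map<string, TweetRow[]>();
  // Sort by id (string order) so each shard's row order is deterministic.
  const sorted = rows.slice().sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  for (const row of sorted) {
    const path = tweetShardPath(row);
    const bucket = shards.get(path) ?? [];
    bucket.push(row);
    shards.set(path, bucket);
  }
  return shards;
}
```

Sorting before grouping is what makes repeated exports of the same data produce identical files, which is the property a Git-backed backup needs.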
Use backup sync when the target is a private Git repo. It pulls first, merge-imports the remote backup into local SQLite, exports the local union back into text shards, commits, and pushes.
pnpm cli backup sync --repo ~/Projects/backup-birdclaw --remote https://github.com/steipete/backup-birdclaw.git --json
pnpm cli backup validate ~/Projects/backup-birdclaw --json
Configure stale-aware backup reads in ~/.birdclaw/config.json:
{
  "backup": {
    "repoPath": "/Users/steipete/Projects/backup-birdclaw",
    "remote": "https://github.com/steipete/backup-birdclaw.git",
    "autoSync": true,
    "staleAfterSeconds": 900
  }
}
Read paths such as CLI search, inbox, API status/query, and web startup pull + merge from Git only when the last backup check is stale. Data-changing commands run a full backup sync afterward when this config is enabled. Set BIRDCLAW_BACKUP_AUTO_SYNC=0 to disable that behavior for one process.
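The stale check described above reduces to a small decision function. In this sketch, `autoSync`, `staleAfterSeconds`, and BIRDCLAW_BACKUP_AUTO_SYNC are taken from this section; the function name, the idea of a stored last-check timestamp, and the inclusive boundary are assumptions.

```typescript
// Config keys mirror the README; everything else is a hypothetical sketch.
interface BackupConfig {
  autoSync: boolean;
  staleAfterSeconds: number;
}

function shouldPullBackup(
  config: BackupConfig,
  lastCheckedAtMs: number | null,
  nowMs: number,
  env: Record<string, string | undefined> = {},
): boolean {
  // Per-process opt-out wins over the config file.
  if (env["BIRDCLAW_BACKUP_AUTO_SYNC"] === "0") return false;
  if (!config.autoSync) return false;
  // Never checked before: treat as stale.
  if (lastCheckedAtMs === null) return true;
  // Assumption: a check exactly at the threshold counts as stale.
  return nowMs - lastCheckedAtMs >= config.staleAfterSeconds * 1000;
}
```

With `staleAfterSeconds` set to 900, a read that happens 10 minutes after the last check skips the Git pull, while one 15 minutes later triggers it.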
Scheduled Bookmark Sync
birdclaw jobs sync-bookmarks refreshes live bookmarks and appends one JSONL audit entry per run. Each entry includes host, timestamps, duration, before/after bookmark counts, source transport, fetched count, backup sync result, and any error.
birdclaw --json jobs sync-bookmarks --mode auto --limit 100 --max-pages 5 --refresh
tail -n 5 ~/.birdclaw/audit/bookmarks-sync.jsonl | jq .
After a successful bookmark refresh, the job runs the normal backup auto-sync path. If ~/.birdclaw/config.json has backup.autoSync enabled, the changed local data is merged into the configured Git backup repo, committed, and pushed. The audit entry records that backup result so scheduled runs are inspectable later.
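For agents that read the audit log, one way to model an entry is sketched below. The field names are guesses based on the prose description of each entry (host, timestamps, duration, counts, transport, fetched count, backup result, error); the real JSONL keys may differ, so check an actual line from the log before relying on them.

```typescript
// Field names are guesses from the prose; the real audit schema may differ.
interface BookmarkSyncAudit {
  host: string;
  startedAt: string;
  finishedAt: string;
  durationMs: number;
  bookmarksBefore: number;
  bookmarksAfter: number;
  transport: string;
  fetched: number;
  backup: string | null;
  error: string | null;
}

function toAuditLine(entry: BookmarkSyncAudit): string {
  // JSONL: exactly one JSON object per line, terminated by a newline.
  return JSON.stringify(entry) + "\n";
}

function parseAuditLog(contents: string): BookmarkSyncAudit[] {
  return contents
    .split("\n")
    .filter((line) => line.trim() !== "") // ignore the trailing empty line
    .map((line) => JSON.parse(line) as BookmarkSyncAudit);
}
```

Because each run appends one self-contained line, a reader can process the file incrementally and never needs to rewrite earlier entries.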
On macOS, install the 3-hour LaunchAgent after choosing the Birdclaw executable path for that machine:
birdclaw --json jobs install-bookmarks-launchd --program /opt/homebrew/bin/birdclaw
If the machine uses bird with browser cookies that are not available to launchd, write an export-only env file with mode 0600 and install with --env-file ~/.config/bird/env.sh. Birdclaw sources that file inside the scheduled process without storing the secrets in the plist.
The LaunchAgent writes ~/Library/LaunchAgents/com.steipete.birdclaw.bookmarks-sync.plist, runs at load, then every 10,800 seconds. It writes the audit log to ~/.birdclaw/audit/bookmarks-sync.jsonl and stdout/stderr to ~/.birdclaw/logs/bookmarks-sync.*.log. A lock file prevents overlapping runs and records an already-running skip when needed. The default job fetches up to 5 pages every 3 hours; pass --all if you want every retrievable page each run.
Useful checks:
launchctl print gui/$(id -u)/com.steipete.birdclaw.bookmarks-sync
launchctl kickstart -k gui/$(id -u)/com.steipete.birdclaw.bookmarks-sync
tail -n 1 ~/.birdclaw/audit/bookmarks-sync.jsonl | jq .
Typical Workflow
- import your archive if you have one
- hydrate imported profiles from live Twitter metadata
- use Home for reading
- use Mentions for reply triage
- when one account feels borderline, inspect profiles replies
- collect keepers into a blocklist file and run blocks import
- use DMs for high-context conversation work
- use Inbox when you want AI help cutting noise
- use CLI exports when agents need stable JSON
Live Transport
Current preference:
- xurl first
- bird fallback for surfaces where cookie-backed reads work better
Without xurl or bird, birdclaw still works in local/archive mode.
Check transport:
pnpm cli auth status --json
Architecture
- SQLite is the canonical local truth
- archive import and live transport should converge on the same model
- CLI and web UI share the same normalized core
- AI ranking is layered on top of local data, not the source of truth
Testing
pnpm check
pnpm test
pnpm coverage
pnpm build
pnpm e2e
Current bar:
- branch coverage above 80%
- Playwright coverage for core UI flows
CI
GitHub Actions runs:
- pnpm check
- pnpm coverage
- pnpm build
- pnpm e2e
Workflow: ci.yml
