movicom

v0.3.0

Give an AI agent a body on a phone — drive a real Android over adb. Reads the screen as a menu designed for the model (cheap), not screenshots. Light enough for a 9B local model.

movicom

Give an AI agent a body on a phone. A tiny, dependency-free Node.js driver that lets any LLM see and use a real Android device over adb — reading the screen as a menu designed for the model (cheap) instead of screenshots (expensive), and acting by name instead of pixel coordinates. Light enough that a small 9B local model can drive it.

```sh
movicom web search "world cup 2026 first match"
movicom ui see
# → {"where":"chrome",
#    "text":["The 2026 World Cup opens Thu June 11, 2026 — Mexico vs South Africa,
#             Estadio Azteca, Mexico City"],
#    "actions":["Images","Maps","News"], "fields":["Search"],
#    "can_scroll":true, "page":"1/8",
#    "hint":"tap an action: ui tap \"Images\"  |  more actions: ui more"}
```

No screenshot. No API. No browser extension. movicom read the answer off the phone as text — the way a person would.

Building an agent on movicom? Read AGENTS.md (the operating manual for LLMs) and HOWTO.md (set up adb + a phone in minutes).

Named after Movicom, the pioneering Argentine cellular company — a local telecom ghost reborn as an agent's hands.


Why

Most "let an agent use a phone" setups send a screenshot to a vision model on every step — slow, and it burns real tokens (and dollars) per image.

movicom reads Android's own UI tree (uiautomator dump) and hands the agent a menu: where it is, the actions it can tap, the fields it can fill, the visible text, and a hint of what to do next. The model picks from a menu like a human browses a UI; it never reasons about pixels.
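To make the idea concrete, here is a minimal sketch of reducing a uiautomator XML dump to a menu. It is not movicom's actual parser: the `attr` and `toMenu` names and the classification rules (an `EditText` is a field, any other clickable node is an action, the rest is text) are assumptions for illustration. It also reports the raw package name as `where`, whereas movicom shows a friendlier app name, and it omits the `hint`, paging and XML entity decoding.

```javascript
// Hypothetical sketch, not movicom's real parser.
// Reads one attribute from a <node ...> tag; returns "" when it is absent.
function attr(tag, name) {
  const m = tag.match(new RegExp(`(?:^|\\s)${name}="([^"]*)"`));
  return m ? m[1] : "";
}

// Turns a uiautomator dump (XML string) into {where, actions, fields, text}.
function toMenu(xml) {
  const menu = { where: "", actions: [], fields: [], text: [] };
  for (const tag of xml.match(/<node\b[^>]*>/g) || []) {
    if (!menu.where) menu.where = attr(tag, "package");
    // Prefer the visible text, fall back to the accessibility description.
    const label = attr(tag, "text") || attr(tag, "content-desc");
    if (!label) continue; // unlabeled layout containers carry no meaning
    if (attr(tag, "class").endsWith("EditText")) menu.fields.push(label);
    else if (attr(tag, "clickable") === "true") menu.actions.push(label);
    else menu.text.push(label);
  }
  return menu;
}
```

Only labels reach the model in this sketch; coordinates stay behind in the driver, which is what lets the agent act by name.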

| Approach | Cost per screen | Agent reasons about |
|---|---|---|
| Screenshot → vision | ~1,000–1,500 tok + $ per image | pixels |
| movicom menu | ~200–400 tok, no image $ | labels & structure |

This is the AI Interface / AI Experience (AII/AIX): the output isn't a raw dump, it's an interface designed for an AI to use. Cluttered pages are paginated (page:"1/8", ui more) so a noisy screen stays cheap — a Google results page went from ~1,350 tokens to ~270 with no loss of reach. Screenshots remain an explicit fallback (shot(), plus camera shot to take a real photo) for the rare screen with no text.
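The pagination described above can be sketched as follows. This is an illustration only: the `paginate` helper, its page size of 10 and the `has_more` flag are assumptions, not movicom's real values or field names. Only the `page:"1/8"` format comes from the README.

```javascript
// Hypothetical sketch of paginating a long action list so each screen stays cheap.
// pageSize = 10 is an assumed value, not movicom's actual setting.
function paginate(actions, pageNumber, pageSize = 10) {
  const total = Math.max(1, Math.ceil(actions.length / pageSize));
  // Clamp out-of-range requests to the nearest valid page.
  const current = Math.min(Math.max(1, pageNumber), total);
  const start = (current - 1) * pageSize;
  return {
    actions: actions.slice(start, start + pageSize),
    page: `${current}/${total}`,
    has_more: current < total,
  };
}
```

With 75 actions and the assumed page size, the first call yields ten labels and `page: "1/8"`; the model asks for the next page only when it needs it.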

Why a phone, not an API?

Because most of what a person does on a phone has no API — your Instagram feed, a Rappi order, a logged-in dashboard, an app whose API got killed or gated. The screen is always there; it's the one surface that can't be walled off without walling off the user. movicom drives it as you, on your device, with your accounts — for your own work. (When a clean API exists — e.g. weather — use it; movicom is for the 99% that doesn't.)

Design

```
  AGENT  (the LLM)            decides WHAT and WHY — by name, never coordinates
    │  intentions
  movicom  (this file)        the "optic nerve": dump XML → minified meaning;
    │  adb commands            resolve names → coords; owns the see→act loop
  DEVICE  (Android over adb)  emulator or a real phone — swappable, same code
```

Everything is adb. No app to install on the phone, no agent process running on the device, no root required for the core. The device is swappable: an emulator and a real phone are the same to the agent.
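As a rough illustration of the "resolve names → coords" step, the sketch below turns a label into the arguments for `adb shell input tap`. The `centerOf` and `tapArgs` helpers and the `{label, bounds}` node shape are hypothetical; the `bounds` string format `[x1,y1][x2,y2]` is the one uiautomator writes in its dump. movicom's real resolver may work differently.

```javascript
// Hypothetical sketch: resolve a label to tap coordinates.
// Parses a uiautomator bounds string and returns the center point.
function centerOf(bounds) {
  const m = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(bounds);
  if (!m) throw new Error(`unexpected bounds: ${bounds}`);
  const [x1, y1, x2, y2] = m.slice(1).map(Number);
  return { x: Math.round((x1 + x2) / 2), y: Math.round((y1 + y2) / 2) };
}

// Builds the argv for `adb shell input tap <x> <y>`, or null if no node matches.
// `nodes` is an assumed shape: [{ label, bounds }].
function tapArgs(nodes, label) {
  const node = nodes.find((n) => n.label === label);
  if (!node) return null;
  const { x, y } = centerOf(node.bounds);
  return ["shell", "input", "tap", String(x), String(y)];
}
```

The agent only ever sends the label; the driver keeps the bounds and issues the adb command.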

Install

Requires adb (Android Platform Tools) and Node.js 14+. No npm dependencies.

```sh
npm install -g movicom
adb version && movicom doctor
```

New here? Read HOWTO.md — a step-by-step setup guide: install adb + movicom (Mac & Windows), prepare the Android phone (developer mode, USB or wireless debugging), recommended phone settings, and connect/verify. Covers real devices and the emulator.

Usage

Grammar: movicom <noun> <verb> [arg|json]. Every command prints one JSON value.

```sh
movicom doctor                       # where am I? device + foreground app
movicom web search "best ramen near me"   # reach the web (don't fumble the omnibox)
movicom app open gmail               # launch an app by name
movicom ui see                       # read the screen as a menu
movicom ui tap "Compose"             # act by NAME (movicom holds the coords)
movicom ui fill '{"Subject":"Hi","Compose email":"the body"}'
movicom ui more                      # next page of actions on a busy screen
movicom camera shot '{"pull":true}'  # take a real photo, copy it to the computer
```
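Because every command prints one JSON value, an agent harness can shell out to the CLI and parse the result. The sketch below assumes `movicom` is on the PATH; the `movicom()` wrapper and its injectable `run` parameter are hypothetical helpers for illustration, not an API shipped by the package.

```javascript
const { execFileSync } = require("child_process");

// Default runner: executes the real CLI (needs movicom on PATH and a connected device).
function runCli(args) {
  return execFileSync("movicom", args, { encoding: "utf8" });
}

// Hypothetical wrapper: builds argv from the documented grammar and parses
// the single JSON value the command prints. Objects are sent as a JSON argument.
function movicom(words, arg, run = runCli) {
  const args = [...words];
  if (arg !== undefined) {
    args.push(typeof arg === "string" ? arg : JSON.stringify(arg));
  }
  return JSON.parse(run(args));
}
```

For example, `movicom(["ui", "tap"], "Compose")` would run `movicom ui tap "Compose"`, and `movicom(["ui", "fill"], { Subject: "Hi" })` would pass the form as a JSON string. Passing a custom `run` lets the wrapper be exercised without a device.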

Verbs

| Verb | What it does |
|---|---|
| doctor / devices | Device + foreground app / list adb devices. Start here. |
| web open <url> · web go <domain> · web search <query> | Reach the internet deterministically via an intent, with no address-bar fumbling. |
| app list · app open <name> · app intent '{...}' | List / launch apps; fire a raw intent. |
| ui see [page#] | Read the screen as a menu: {where, actions[], fields[], text[], can_scroll, page, hint}. |
| ui more | Next page of actions (busy screens are paginated to stay cheap). |
| ui tap "<label>" | Tap the element matching label (resolves across all pages). |
| ui type "<text>" · ui fill '{field: value}' | Type into the focused field / fill a multi-field form (focuses each field first). |
| ui key <BACK\|HOME\|ENTER\|…> · ui scroll <dir> · ui back · ui home | Keys, swipes, navigation. |
| kbd off / kbd on | Disable/enable the soft keyboard; this stops layout shift so forms fill reliably. |
| contacts list\|find\|add · notif list | System lane: talk to the OS, not the glass. |
| camera shot '{"pull":true}' | Take a real photo; pull copies it back so a multimodal model can SEE it. |
| ui shot [file] | Low-res screenshot: explicit fallback for text-less screens. |
| workflow add\|run\|list\|del | Save & replay named command sequences (shareable macros). |

Every action (ui tap/type/key/scroll/fill) returns {<result>, screen:{...}} — the fresh menu after the action — so the model doesn't need a separate ui see.
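A see→act loop built on that return shape might look like the sketch below. It is an assumption-laden illustration: `drive`, `run` and `choose` are hypothetical names, `run` is expected to return the already-parsed JSON of a movicom command, and `choose` stands in for the model picking a label (or `null` to stop). Only the `screen` key of an action result is relied on.

```javascript
// Hypothetical agent loop: one `ui see` up front, then reuse the fresh
// screen that every action returns instead of calling `ui see` again.
function drive(run, choose, maxSteps = 10) {
  let screen = run(["ui", "see"]);
  for (let step = 0; step < maxSteps; step++) {
    const label = choose(screen); // the model picks an action by name
    if (label === null) break;    // the model decided it is done
    const result = run(["ui", "tap", label]);
    screen = result.screen;       // menu after the tap, no extra round trip
  }
  return screen;
}
```

The `maxSteps` cap is a safety choice of this sketch so a confused model cannot tap forever.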

Configuring the phone

See HOWTO.md for the full setup: install adb + movicom (Mac & Windows), enable developer mode, and connect a phone over USB, wireless (Android 11+, no cable), or the emulator. Quick check:

```sh
adb devices      # your device should be listed
movicom doctor   # device + current foreground app
```

Status

Early but real, and dogfooded hard. Proven on an Android emulator + real Android: reading screens as a menu, filling multi-field forms, sending an email through the Gmail app, taking a photo, reading live web answers, writing a contact — all verified against ground truth (the MediaStore / content provider / a received email), not just the screen. It's UI-driven, so it can break when an app redesigns — that's the trade for reaching apps that have no API. Built in the open.

Contributions welcome. Found a screen movicom mis-reads? Run movicom ui see --raw on it and open an issue with the XML — the parser learns from real screens.

License

MIT © Andy Cufari. See LICENSE.