QUE Mobile SDK

AI-powered Android automation framework for React Native/Expo with SENSE → THINK → ACT loop.

Overview

QUE Mobile SDK enables AI-powered mobile app control through natural language. The system captures screen state via Android Accessibility Services, processes it through an LLM (Gemini), and executes actions programmatically.

Key Features

  • 🤖 AI-Powered Automation: Control Android apps with natural language commands
  • 🔄 SENSE → THINK → ACT Loop: Autonomous multi-step task execution
  • ⚛️ React Native/Expo: Clean TypeScript API with React hooks
  • 🎯 19+ Action Types: Tap, type, swipe, voice, file operations, and more
  • 🗣️ Voice Integration: Built-in text-to-speech and speech-to-text
  • 📁 File System: Sandboxed storage for task tracking and results
  • ⏰ Trigger System: Scheduled and notification-based automated task execution
  • 🐛 Debug Mode: Visual feedback and overlay for development
  • 🔌 Expo Plugin: Zero-config setup for managed workflow

Installation

npm install que-mobile-sdk

Or with yarn:

yarn add que-mobile-sdk

Peer Dependencies

Ensure you have these installed:

npm install expo react react-native

Quick Start

1. Configure Expo Plugin

Add the plugin to your app.json:

{
  "expo": {
    "plugins": [
      "que-mobile-sdk"
    ]
  }
}

This automatically configures the required Android permissions and the Accessibility Service.

2. Rebuild Native Code

npx expo prebuild --clean
npx expo run:android

3. Enable Accessibility Service

On your Android device:

  1. Go to Settings → Accessibility
  2. Find QUE Accessibility Service
  3. Enable it

4. Use in Your App

import { useAgent } from 'que-mobile-sdk';
import { View, Button, Text } from 'react-native';

function App() {
  const { execute, isRunning, result } = useAgent({
    apiKey: 'YOUR_GEMINI_API_KEY',
    maxSteps: 100,
  });

  const handleTask = async () => {
    await execute("Open Instagram and like the first post");
  };

  return (
    <View>
      <Button 
        title={isRunning ? "Running..." : "Execute Task"} 
        onPress={handleTask}
        disabled={isRunning}
      />
      {result && <Text>{result.message}</Text>}
    </View>
  );
}

API Reference

Hooks

useAgent(config: AgentConfig)

Main hook for executing AI-powered tasks.

Parameters:

  • config.apiKey (string, required): Gemini API key
  • config.maxSteps (number, optional): Maximum steps per task (default: 100)
  • config.maxFailures (number, optional): Maximum consecutive failures (default: 3)
  • config.debugMode (boolean, optional): Enable debug visualization (default: false)
  • config.onStep (function, optional): Callback for each step
  • config.onComplete (function, optional): Callback on task completion

Returns:

  • execute(task: string): Execute a task
  • isRunning (boolean): Whether the agent is currently running
  • result (AgentResult | null): Task result
  • error (string | null): Error message if failed
  • stop(): Stop execution
  • history (AgentHistory[]): Execution history

Example:

const { execute, isRunning, result, error, stop } = useAgent({
  apiKey: process.env.GEMINI_API_KEY,
  maxSteps: 50,
  debugMode: true,
  onStep: (step) => console.log(`Step ${step.stepNumber}:`, step.action),
});

// Execute task
await execute("Search for 'React Native' on Google");

// Stop if needed
if (isRunning) {
  stop();
}

useVoice()

Hook for voice interactions.

Returns:

  • speak(text: string): Convert text to speech
  • startListening(): Start speech recognition
  • stopListening(): Stop speech recognition
  • isListening (boolean): Whether currently listening
  • isSpeaking (boolean): Whether currently speaking

Example:

const { speak, startListening, isListening } = useVoice();

// Speak
await speak("Hello, how can I help you?");

// Listen
const userInput = await startListening();
console.log("User said:", userInput);

useFileSystem()

Hook for file operations.

Returns:

  • writeFile(name: string, content: string): Write file
  • readFile(name: string): Read file
  • listFiles(): List all files

Example:

const { writeFile, readFile, listFiles } = useFileSystem();

// Write
await writeFile("notes.md", "# My Notes\n\nTask completed!");

// Read
const content = await readFile("notes.md");

// List
const files = await listFiles();

Components

<AgentProvider>

Context provider for sharing agent configuration.

<AgentProvider config={{ apiKey: 'YOUR_KEY' }}>
  <App />
</AgentProvider>
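
A plausible composition, assuming components such as <AgentButton> (described below) read their configuration from the nearest provider rather than requiring an apiKey per call (the exact resolution rules aren't documented here):

import { AgentProvider, AgentButton } from 'que-mobile-sdk';

export default function Root() {
  return (
    <AgentProvider config={{ apiKey: 'YOUR_KEY', maxSteps: 50 }}>
      {/* assumed: AgentButton inherits apiKey/maxSteps from the provider */}
      <AgentButton task="Open Settings" onComplete={(result) => console.log(result)} />
    </AgentProvider>
  );
}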

<AgentButton>

Quick action button component.

<AgentButton 
  task="Open Settings" 
  onComplete={(result) => console.log(result)}
/>

<DebugOverlay>

Visual debug overlay (only renders when debugMode: true).

<DebugOverlay 
  step={5}
  maxSteps={100}
  lastAction="tap_element"
  elementCount={42}
  reasoning="Tapping on the search button..."
/>

Action Types

The SDK supports 19+ action types:

| Action | Description | Example |
|--------|-------------|---------|
| tap_element | Tap on UI element | { type: 'tap_element', elementId: 5 } |
| long_press_element | Long press element | { type: 'long_press_element', elementId: 5 } |
| type | Type text | { type: 'type', text: 'Hello' } |
| swipe_down | Swipe down | { type: 'swipe_down', amount: 500 } |
| swipe_up | Swipe up | { type: 'swipe_up', amount: 500 } |
| back | Press back button | { type: 'back' } |
| home | Press home button | { type: 'home' } |
| switch_app | Open app switcher | { type: 'switch_app' } |
| wait | Wait 1 second | { type: 'wait' } |
| open_app | Open app by name | { type: 'open_app', appName: 'Instagram' } |
| search_google | Search on Google | { type: 'search_google', query: 'React Native' } |
| speak | Text-to-speech | { type: 'speak', message: 'Hello' } |
| ask | Ask user via voice | { type: 'ask', question: 'What is your name?' } |
| write_file | Write file | { type: 'write_file', fileName: 'notes.md', content: '...' } |
| append_file | Append to file | { type: 'append_file', fileName: 'notes.md', content: '...' } |
| read_file | Read file | { type: 'read_file', fileName: 'notes.md' } |
| done | Complete task | { type: 'done', success: true, text: 'Task completed!' } |

See API Documentation for complete action reference.
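
The action objects above follow a discriminated-union shape. As a rough illustration, they could be typed in TypeScript like this (a hypothetical sketch derived from the table; the SDK's actual exported types may differ):

// Hypothetical union over the actions tabulated above. Field names come
// from the table's Example column; the union itself is an assumption.
type Action =
  | { type: 'tap_element'; elementId: number }
  | { type: 'long_press_element'; elementId: number }
  | { type: 'type'; text: string }
  | { type: 'swipe_down' | 'swipe_up'; amount: number }
  | { type: 'back' | 'home' | 'switch_app' | 'wait' }
  | { type: 'open_app'; appName: string }
  | { type: 'search_google'; query: string }
  | { type: 'speak'; message: string }
  | { type: 'ask'; question: string }
  | { type: 'write_file' | 'append_file'; fileName: string; content: string }
  | { type: 'read_file'; fileName: string }
  | { type: 'done'; success: boolean; text: string };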

Configuration Options

AgentConfig

interface AgentConfig {
  // Required
  apiKey: string;                    // Gemini API key
  
  // Optional
  maxSteps?: number;                 // Max steps per task (default: 100)
  maxFailures?: number;              // Max consecutive failures (default: 3)
  debugMode?: boolean;               // Enable debug mode (default: false)
  model?: string;                    // Gemini model (default: 'gemini-2.0-flash-exp')
  
  // Callbacks
  onStep?: (step: AgentStep) => void;
  onComplete?: (result: AgentResult) => void;
  onError?: (error: Error) => void;
}
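
The AgentStep and AgentResult shapes passed to the callbacks aren't spelled out in this README. Based on the fields referenced elsewhere (step.stepNumber and step.action in the useAgent example, result.message in Quick Start), a plausible sketch, with everything beyond those fields an assumption:

// Inferred callback payloads -- hypothetical, not the SDK's published types.
interface AgentStep {
  stepNumber: number;   // used as step.stepNumber in the onStep example
  action: Action;       // the executed action (see the union sketched above)
  reasoning?: string;   // assumed: model rationale, as displayed by DebugOverlay
}

interface AgentResult {
  success: boolean;     // assumed: mirrors the 'done' action's success flag
  message: string;      // used as result.message in the Quick Start example
}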

Debug Mode

Enable visual feedback during development:

const { execute } = useAgent({
  apiKey: 'YOUR_KEY',
  debugMode: true,  // Enable debug mode
});

Debug mode provides:

  • 🟢 Green boxes around clickable elements
  • 🟡 Yellow boxes around text elements
  • 🔴 Red circles at tap locations
  • ⚡ White border flash on screen changes
  • 📊 Debug overlay with step info

See Debug Mode Guide for details.
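
If you render the overlay yourself, one way to feed it is the onStep callback. The wiring below is a sketch (the step fields mirror the assumptions above, and elementCount is a placeholder, since its source isn't documented here):

import { useState } from 'react';
import { View, Button } from 'react-native';
import { useAgent, DebugOverlay } from 'que-mobile-sdk';

function DebuggedTask() {
  const [lastStep, setLastStep] = useState<any>(null);

  const { execute, isRunning } = useAgent({
    apiKey: 'YOUR_KEY',
    debugMode: true,
    onStep: (step) => setLastStep(step), // keep the latest step for the overlay
  });

  return (
    <View>
      <Button
        title="Run"
        onPress={() => execute('Open Settings app')}
        disabled={isRunning}
      />
      {lastStep && (
        <DebugOverlay
          step={lastStep.stepNumber}
          maxSteps={100}
          lastAction={String(lastStep.action?.type)}
          elementCount={0}  // placeholder: not exposed via onStep (assumption)
          reasoning={lastStep.reasoning ?? ''}
        />
      )}
    </View>
  );
}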

Usage Guides

Basic Task Automation

import { useAgent } from 'que-mobile-sdk';
import { Button } from 'react-native';

function SimpleTask() {
  const { execute, isRunning, result } = useAgent({
    apiKey: process.env.GEMINI_API_KEY,
  });

  return (
    <Button
      title="Open Settings"
      onPress={() => execute("Open Settings app")}
      disabled={isRunning}
    />
  );
}

Voice Integration

import { useAgent, useVoice } from 'que-mobile-sdk';
import { Button } from 'react-native';

function VoiceCommand() {
  const { execute } = useAgent({ apiKey: 'YOUR_KEY' });
  const { startListening, speak } = useVoice();

  const handleVoiceCommand = async () => {
    await speak("What would you like me to do?");
    const command = await startListening();
    await execute(command);
  };

  return (
    <Button title="Voice Command" onPress={handleVoiceCommand} />
  );
}

File Operations

import { useAgent } from 'que-mobile-sdk';
import { Button } from 'react-native';

function FileTask() {
  const { execute } = useAgent({ apiKey: 'YOUR_KEY' });

  const task = `
    1. Create a file called shopping.md
    2. Write "Milk, Eggs, Bread" to it
    3. Read the file back to confirm
  `;

  return (
    <Button title="File Task" onPress={() => execute(task)} />
  );
}

Multi-Step Tasks

const task = `
  1. Open Instagram
  2. Search for "reactnative"
  3. Like the first 3 posts
  4. Return to home screen
`;

await execute(task);

Automated Triggers

import { useTriggers, TriggerManager } from 'que-mobile-sdk';
import { Button } from 'react-native';

function ScheduledTask() {
  const { createTrigger } = useTriggers();

  const handleCreateTrigger = async () => {
    await createTrigger({
      id: TriggerManager.generateTriggerId(),
      type: 'schedule',
      enabled: true,
      priority: 10,
      task: 'Open Calendar and speak today\'s events',
      agentConfig: { apiKey: 'YOUR_KEY', maxSteps: 50 },
      schedule: {
        time: '08:00',
        daysOfWeek: [1, 2, 3, 4, 5], // Weekdays
      },
    });
  };

  return (
    <Button title="Schedule Morning Briefing" onPress={handleCreateTrigger} />
  );
}

See Usage Guides and Trigger System Guide for more examples.
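
The feature list also mentions notification-based triggers. A speculative variant under the same createTrigger call is sketched below; the 'notification' type and the filter field are assumptions, so check the Trigger System Guide for the real shape:

// Speculative notification trigger -- the 'notification' type and the
// 'notificationFilter' field are guesses, not confirmed by this README.
await createTrigger({
  id: TriggerManager.generateTriggerId(),
  type: 'notification',
  enabled: true,
  priority: 5,
  task: 'Read the newest message aloud',
  agentConfig: { apiKey: 'YOUR_KEY', maxSteps: 20 },
  notificationFilter: { packageName: 'com.whatsapp' }, // hypothetical field
});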

Troubleshooting

Accessibility Service Not Enabled

Problem: Agent fails with "Accessibility Service not enabled"

Solution:

  1. Go to Settings → Accessibility
  2. Find QUE Accessibility Service
  3. Enable it
  4. Restart your app

API Key Invalid

Problem: LLM requests fail with "Invalid API key"

Solution:

  • Get a Gemini API key from Google AI Studio
  • Ensure the key is correctly set in AgentConfig
  • Don't commit API keys to version control
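
To keep the key out of source control during development, one common Expo pattern is a build-time public env var (anything bundled into the app is still extractable, so for production you would typically proxy LLM calls through your own server):

import { useAgent } from 'que-mobile-sdk';

function App() {
  // Expo inlines EXPO_PUBLIC_* variables at build time (Expo SDK 49+);
  // the variable name here is an example, not something the SDK requires.
  const { execute } = useAgent({
    apiKey: process.env.EXPO_PUBLIC_GEMINI_API_KEY ?? '',
  });
  return null; // render your UI here
}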

Actions Not Executing

Problem: Agent sees elements but actions fail

Solution:

  • Enable debug mode to visualize what the agent sees
  • Check that Accessibility Service has all required permissions
  • Ensure UI elements are actually clickable (not disabled)
  • Try increasing wait times between actions

Max Steps Exceeded

Problem: Task stops with "Max steps exceeded"

Solution:

  • Increase maxSteps in config
  • Break complex tasks into smaller subtasks
  • Check if the task is actually achievable

Element Not Found

Problem: Agent reports "Element not found"

Solution:

  • Enable debug mode to see element IDs
  • Ensure the screen has loaded completely
  • Add wait actions before interacting with elements
  • Check if the element is scrolled off-screen

Performance Issues

Problem: Agent is slow or times out

Solution:

  • Reduce maxSteps for simpler tasks
  • Use a faster Gemini model if available
  • Ensure a good network connection
  • Close other apps to free memory

Debug Mode Not Working

Problem: Debug overlay not showing

Solution:

  • Ensure debugMode: true in config
  • Check that <DebugOverlay> component is rendered
  • Verify Accessibility Service is enabled
  • Restart the app

Examples

Check out the example app for complete working examples:

  • Simple Task: Basic task execution
  • Voice Command: Voice-controlled automation
  • File Operations: File system integration
  • Debug Mode: Visual debugging
  • Multi-Step: Complex multi-step tasks

Run the example:

cd example
npm install
npx expo prebuild
npx expo run:android

Architecture

┌─────────────────────────────────────────────────┐
│         React Native Application Layer          │
│  (useAgent, useVoice, useFileSystem hooks)      │
└─────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────┐
│           TypeScript Core Layer                  │
│  Agent, MemoryManager, Perception, LLM Client   │
└─────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────┐
│         Native Bridge Layer (Kotlin)             │
│     AccessibilityModule (NativeModule)          │
└─────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────┐
│      Android Accessibility Service               │
│   (UI Hierarchy, Touch, Gestures, System)       │
└─────────────────────────────────────────────────┘

SENSE → THINK → ACT Loop

  1. SENSE: Capture screen state via Accessibility Service
  2. THINK: Process with Gemini LLM to decide next action
  3. ACT: Execute action and observe result
  4. Repeat until task complete or max steps reached
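
In code, the loop is conceptually something like this (a minimal sketch of the control flow, not the SDK's internals; senseScreen, decideNextAction, and executeAction are hypothetical names, and Action is the union sketched earlier):

// Conceptual SENSE -> THINK -> ACT loop. All helper names are hypothetical.
declare function senseScreen(): Promise<unknown>;                                   // SENSE: UI tree via Accessibility Service
declare function decideNextAction(task: string, screen: unknown): Promise<Action>;  // THINK: Gemini picks an action
declare function executeAction(action: Action): Promise<void>;                      // ACT: perform it natively

async function runAgent(task: string, maxSteps = 100): Promise<void> {
  for (let step = 0; step < maxSteps; step++) {
    const screen = await senseScreen();
    const action = await decideNextAction(task, screen);
    await executeAction(action);
    if (action.type === 'done') return; // the model signals completion
  }
  throw new Error('Max steps exceeded'); // matches the troubleshooting entry above
}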

Requirements

  • Platform: Android only (iOS support planned)
  • React Native: >= 0.72.0
  • Expo: >= 50.0.0
  • Android: >= API 24 (Android 7.0)
  • Node: >= 18.0.0

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

License

MIT © Sarath Babu

Made with ❤️ by the QUE team