react-native-agentic-ai
v0.8.2
Build autonomous AI agents for React Native and Expo apps. Provides AI-native UI traversal, tool calling, and structured reasoning.
Agentic AI for React Native
Add an autonomous AI agent to any React Native app — no rewrite needed. Wrap your app with `<AIAgent>` and get: natural language UI control, real-time voice conversations, and a built-in knowledge base. Fully customizable, production-grade security, performant, and lightweight. Plus: an MCP bridge that lets any AI connect to and test your app.
Two names, one package — pick whichever you prefer:
```bash
npm install @mobileai/react-native
# — or —
npm install react-native-agentic-ai
```

🤖 AI Agent — Autonomous UI Control
🧪 AI-Powered Testing — Test Your App in English, Not Code
Google Antigravity running 5 checks on the emulator and finding 5 real bugs — zero test code, zero selectors, just English.
⭐ If this helped you, star this repo — it helps others find it!
🧠 How It Works — Structure-First Agentic AI
What if your AI could understand your app the way a real user does — not by looking at pixels, but by reading the actual UI structure?
That's what this SDK does. It reads your app's live UI natively — every button, label, input, and screen — in real time. The AI understands your app's structure, not a screenshot of it.
No OCR. No image pipelines. No selectors. No annotations. No view wrappers.
The result: an AI that truly understands your app — and can act on it autonomously.
| | This SDK | Screenshot-based AI | Build It Yourself |
|---|---|---|---|
| Setup | <AIAgent> — one wrapper | Vision model + custom pipeline | Months of custom code |
| How it reads UI | Native structure — real time | Screenshot → OCR | Custom integration |
| AI agent loop | ✅ Built-in multi-step | ❌ Build from scratch | ❌ Build from scratch |
| Voice mode | ✅ Real-time bidirectional | ❌ | ❌ |
| Custom business logic | ✅ useAction hook | Custom code | Custom code |
| MCP bridge (any AI connects) | ✅ One command | ❌ | ❌ |
| Knowledge base | ✅ Built-in retrieval | ❌ | ❌ |
✨ What's Inside
Ship to Production
🤖 Autonomous AI Agent — Natural Language UI Automation
Your users describe what they want in natural language. The SDK reads the live screen, plans a sequence of actions, and executes them end-to-end — tapping buttons, filling forms, navigating screens — all autonomously. Powered by Gemini. OpenAI is also supported as a text mode alternative.
- Zero-config — wrap your app with `<AIAgent>`, done. No annotations, no selectors
- Multi-step reasoning — navigates across screens to complete complex tasks
- Custom actions — expose any business logic (checkout, API calls, mutations) via `useAction`
- Knowledge base — AI queries your FAQs, policies, product data on demand
- Human-in-the-loop — native `Alert.alert` confirmation before critical actions
🎤 Real-time Voice AI Agent — Bidirectional Audio with Gemini Live API
Full bidirectional voice AI powered by the Gemini Live API (Gemini only). Users speak naturally; the agent responds with voice AND controls your app simultaneously.
- Sub-second latency — real-time audio via WebSockets, not turn-based
- Full UI control — same tap, type, navigate, custom actions as text mode — all by voice
- Screen-aware — auto-detects screen changes and updates its context instantly
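Voice mode is enabled by default via the `enableVoice` prop. A minimal voice-ready setup might look like this (a sketch, assuming the optional `react-native-audio-api` dependency from the Installation section is installed and mic permissions are configured):

```tsx
<AIAgent
  apiKey="YOUR_GEMINI_API_KEY"
  enableVoice={true} // default — set false to hide the voice tab
  navRef={navRef}
>
  <App />
</AIAgent>
```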
💡 Speech-to-text in text mode: Install `expo-speech-recognition` and a mic button appears in the chat bar — letting users dictate messages instead of typing. This is separate from voice mode.
Supercharge Your Dev Workflow
🔌 MCP Bridge — Connect Any AI to Your App
Your app becomes MCP-compatible with one prop. Any AI that speaks the Model Context Protocol — editors, autonomous agents, CI/CD pipelines, custom scripts — can remotely read and control your app.
The MCP bridge uses the same AgentRuntime that powers the in-app AI agent. If the agent can do it via chat, an external AI can do it via MCP.
MCP-only mode — just want testing? No chat popup needed:
```tsx
<AIAgent
  showChatBar={false}
  mcpServerUrl="ws://localhost:3101"
  apiKey="YOUR_KEY"
  navRef={navRef}
>
  <App />
</AIAgent>
```

🧪 AI-Powered Testing via MCP
The most powerful use case: test your app without writing test code. Connect your AI (Antigravity, Claude Desktop, or any MCP client) to the emulator and describe what to check — in English. No selectors to maintain, no flaky tests, self-healing by design.
Skip the test framework. Just ask:
Ad-hoc — ask your AI anything about the running app:
"Is the Laptop Stand price consistent between the home screen and the product detail page?"
YAML Test Plans — commit reusable checks to your repo:
```yaml
# tests/smoke.yaml
checks:
  - id: price-sync
    check: "Read the Laptop Stand price on home, tap it, compare with detail page"
  - id: profile-email
    check: "Go to Profile tab. Is the email displayed under the user's name?"
```

Then tell your AI: "Read tests/smoke.yaml and run each check on the emulator"
Real Results — 5 bugs found autonomously:
| # | What was checked | Bug found | AI steps |
|---|---|---|---|
| 1 | Price consistency (list → detail) | Laptop Stand: $45.99 vs $49.99 | 2 |
| 2 | Profile completeness | Email missing — only name shown | 2 |
| 3 | Settings navigation | Help Center missing from Support section | 2 |
| 4 | Description vs specifications | "breathable mesh" vs "Leather Upper" | 3 |
| 5 | Cross-screen price sync | Yoga Mat: $39.99 vs $34.99 | 4 |
📦 Installation
Two names, one package — pick whichever you prefer:
```bash
npm install @mobileai/react-native
# — or —
npm install react-native-agentic-ai
```

No native modules required by default. Works with the Expo managed workflow out of the box — no eject needed.
Optional Dependencies
react-native-view-shot — enables the `capture_screenshot` tool:

```bash
npx expo install react-native-view-shot
```

expo-speech-recognition — speech-to-text in the chat bar:

```bash
npx expo install expo-speech-recognition
```

Automatically detected. No extra config needed — a mic icon appears in the text chat bar, letting users speak their message instead of typing. This is separate from voice mode.

react-native-audio-api — required for real-time voice mode:

```bash
npm install react-native-audio-api
```

Expo Managed — add to app.json:
```json
{
  "expo": {
    "android": { "permissions": ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"] },
    "ios": { "infoPlist": { "NSMicrophoneUsageDescription": "Required for voice chat with AI assistant" } }
  }
}
```

Then rebuild: `npx expo prebuild && npx expo run:android` (or `run:ios`)
Expo Bare / React Native CLI — add RECORD_AUDIO + MODIFY_AUDIO_SETTINGS to AndroidManifest.xml and NSMicrophoneUsageDescription to Info.plist, then rebuild.
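For bare projects, those entries look like this (a sketch — merge into your existing files rather than replacing them):

```xml
<!-- android/app/src/main/AndroidManifest.xml -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
```

```xml
<!-- ios/<YourApp>/Info.plist -->
<key>NSMicrophoneUsageDescription</key>
<string>Required for voice chat with AI assistant</string>
```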
Hardware echo cancellation (AEC) is automatically enabled — no extra setup.
🚀 Quick Start
1. Enable Screen Mapping (optional, recommended)
Add one line to your metro.config.js — the AI gets a map of every screen in your app, auto-generated on each dev start:
```js
// metro.config.js
require('@mobileai/react-native/generate-map').autoGenerate(__dirname);
```

Or generate it manually anytime:

```bash
npx @mobileai/react-native generate-map
```

Without this, the AI can only see the currently mounted screen — it has no idea what other screens exist or how to reach them. Example: "Write a review for the Laptop Stand" — the AI sees the Home screen but doesn't know a `WriteReview` screen exists 3 levels deep. With a map, it sees every screen in your app and knows exactly how to get there: Home → Products → Detail → Reviews → WriteReview.
2. Wrap Your App
React Navigation
```tsx
import { AIAgent } from '@mobileai/react-native'; // or 'react-native-agentic-ai'
import { NavigationContainer, useNavigationContainerRef } from '@react-navigation/native';
import screenMap from './ai-screen-map.json'; // auto-generated by step 1

export default function App() {
  const navRef = useNavigationContainerRef();
  return (
    <AIAgent
      // ⚠️ Prototyping ONLY — don't ship API keys in production
      apiKey="YOUR_API_KEY"
      // ✅ Production: route through your secure backend proxy
      // proxyUrl="https://api.yourdomain.com/ai-proxy"
      // proxyHeaders={{ Authorization: `Bearer ${userToken}` }}
      navRef={navRef}
      screenMap={screenMap} // optional but recommended
    >
      <NavigationContainer ref={navRef}>
        {/* Your existing screens — zero changes needed */}
      </NavigationContainer>
    </AIAgent>
  );
}
```

Expo Router
In your root layout (app/_layout.tsx):
```tsx
import { AIAgent } from '@mobileai/react-native'; // or 'react-native-agentic-ai'
import { Slot, useNavigationContainerRef } from 'expo-router';
import screenMap from './ai-screen-map.json'; // auto-generated by step 1

export default function RootLayout() {
  const navRef = useNavigationContainerRef();
  return (
    <AIAgent
      apiKey={process.env.AI_API_KEY!}
      navRef={navRef}
      screenMap={screenMap}
    >
      <Slot />
    </AIAgent>
  );
}
```

Choose Your Provider
The examples above use Gemini (default). To use OpenAI for text mode, add the provider prop. Voice mode is not supported with OpenAI.
```tsx
<AIAgent
  provider="openai"
  apiKey="YOUR_OPENAI_API_KEY"
  // model="gpt-4.1-mini" ← default, or use any OpenAI model
  navRef={navRef}
>
  {/* Same app, different brain */}
</AIAgent>
```

A floating chat bar appears automatically. Ask the AI to navigate, tap buttons, fill forms, or answer questions.
Knowledge-Only Mode — AI Assistant Without UI Automation
Set enableUIControl={false} for a lightweight FAQ / support assistant. Single LLM call, ~70% fewer tokens:
```tsx
<AIAgent enableUIControl={false} knowledgeBase={KNOWLEDGE} />
```

| | Full Agent (default) | Knowledge-Only |
|---|---|---|
| UI analysis | ✅ Full structure read | ❌ Skipped |
| Tokens per request | ~500-2000 | ~200 |
| Agent loop | Up to 25 steps | Single call |
| Tools available | 7 | 2 (done, query_knowledge) |
🗺️ Screen Mapping — Navigation Intelligence
By default, the AI navigates by reading what's on screen and tapping visible elements. Screen mapping gives the AI a complete map of every screen and how they connect — via static analysis of your source code (AST). No API key needed, runs in ~2 seconds.
Setup (one line)
Add to your metro.config.js — the screen map auto-generates every time Metro starts:
```js
// metro.config.js
require('@mobileai/react-native/generate-map').autoGenerate(__dirname);
// ... rest of your Metro config
```

Then pass the generated map to `<AIAgent>`:

```tsx
import screenMap from './ai-screen-map.json';

<AIAgent screenMap={screenMap} navRef={navRef}>
  <App />
</AIAgent>
```

That's it. Works with both Expo Router and React Navigation — auto-detected.
What It Gives the AI
| Without Screen Map | With Screen Map |
|---|---|
| AI sees only the current screen | AI knows every screen in your app |
| Must explore to find features | Plans the full navigation path upfront |
| Deep screens may be unreachable | Knows each screen's navigatesTo links |
| No knowledge of dynamic routes | Understands item/[id], category/[id] patterns |
Disable Without Removing
```tsx
<AIAgent screenMap={screenMap} useScreenMap={false} />
```

Manual generation:

```bash
npx @mobileai/react-native generate-map
```

Watch mode — auto-regenerates on file changes:

```bash
npx @mobileai/react-native generate-map --watch
```

npm scripts — auto-run before start/build:

```json
{
  "scripts": {
    "generate-map": "npx @mobileai/react-native generate-map",
    "prestart": "npm run generate-map",
    "prebuild": "npm run generate-map"
  }
}
```

| Flag | Description |
|------|-------------|
| --watch, -w | Watch for file changes and auto-regenerate |
| --dir=./path | Custom project directory |

💡 The generated `ai-screen-map.json` is committed to your repo — no runtime cost.
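Since the map is plain JSON, you can inspect and diff it in code review. The schema sketched below is purely illustrative — open your own generated `ai-screen-map.json` to see the actual format:

```json
{
  "screens": [
    { "name": "Home", "navigatesTo": ["Products", "Profile"] },
    { "name": "Products", "navigatesTo": ["product/[id]"] },
    { "name": "product/[id]", "navigatesTo": ["Reviews"] }
  ]
}
```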
🧠 Knowledge Base
Give the AI domain knowledge it can query on demand — policies, FAQs, product details. Uses a query_knowledge tool to fetch only relevant entries (no token waste).
Static Array
```tsx
import type { KnowledgeEntry } from '@mobileai/react-native'; // or 'react-native-agentic-ai'

const KNOWLEDGE: KnowledgeEntry[] = [
  {
    id: 'shipping',
    title: 'Shipping Policy',
    content: 'Free shipping on orders over $75. Standard: 5-7 days. Express: 2-3 days.',
    tags: ['shipping', 'delivery'],
  },
  {
    id: 'returns',
    title: 'Return Policy',
    content: '30-day returns on all items. Refunds in 5-7 business days.',
    tags: ['return', 'refund'],
    screens: ['product/[id]', 'order-history'], // only surface on these screens
  },
];

<AIAgent knowledgeBase={KNOWLEDGE} />
```

Custom Retriever — Bring Your Own Search
```tsx
<AIAgent
  knowledgeBase={{
    retrieve: async (query: string, screenName?: string) => {
      const results = await fetch(`/api/knowledge?q=${query}&screen=${screenName}`);
      return results.json();
    },
  }}
/>
```

🔌 MCP Bridge Setup — Connect AI Editors to Your App
Architecture
```
┌──────────────────┐                   ┌──────────────────┐    WebSocket    ┌──────────────────┐
│  Antigravity     │  Streamable HTTP  │                  │                 │                  │
│  Claude Desktop  │ ◄───────────────► │  @mobileai/      │ ◄─────────────► │  Your React      │
│  or any MCP      │    (port 3100)    │  mcp-server      │   (port 3101)   │  Native App      │
│  compatible AI   │    + Legacy SSE   │                  │                 │                  │
└──────────────────┘                   └──────────────────┘                 └──────────────────┘
```

Setup in 3 Steps
1. Start the MCP bridge — no install needed:
```bash
npx @mobileai/mcp-server
```

2. Connect your React Native app:
```tsx
<AIAgent
  apiKey="YOUR_API_KEY"
  mcpServerUrl="ws://localhost:3101"
/>
```

3. Connect your AI:
Add to ~/.gemini/antigravity/mcp_config.json:
```json
{
  "mcpServers": {
    "mobile-app": {
      "command": "npx",
      "args": ["@mobileai/mcp-server"]
    }
  }
}
```

Click Refresh in MCP Store. You'll see mobile-app with 2 tools: execute_task and get_app_status.
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
```json
{
  "mcpServers": {
    "mobile-app": {
      "url": "http://localhost:3100/mcp/sse"
    }
  }
}
```

- Streamable HTTP: http://localhost:3100/mcp
- Legacy SSE: http://localhost:3100/mcp/sse
MCP Tools
| Tool | Description |
|------|-------------|
| execute_task(command) | Send a natural language command to the app |
| get_app_status() | Check if the React Native app is connected |
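Beyond editors, any script can drive the app through the bridge — useful for CI/CD pipelines. Below is a sketch using the official `@modelcontextprotocol/sdk` TypeScript client; it assumes the bridge (`npx @mobileai/mcp-server`) and your app are already running, and the exact tool-result shape is not specified here:

```ts
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';

const client = new Client({ name: 'smoke-test', version: '1.0.0' });
await client.connect(
  new StreamableHTTPClientTransport(new URL('http://localhost:3100/mcp')),
);

// Check the app is connected, then send a natural-language command
const status = await client.callTool({ name: 'get_app_status', arguments: {} });
console.log(status);

const result = await client.callTool({
  name: 'execute_task',
  arguments: { command: 'Open the Profile tab and read the displayed email' },
});
console.log(result);
```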
Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| MCP_PORT | 3100 | HTTP port for MCP clients |
| WS_PORT | 3101 | WebSocket port for the React Native app |
🔌 API Reference
<AIAgent> Props
| Prop | Type | Default | Description |
|------|------|---------|-------------|
| apiKey | string | — | API key for your provider (prototyping only). |
| provider | 'gemini' \| 'openai' | 'gemini' | LLM provider for text mode. |
| proxyUrl | string | — | Backend proxy URL (production). |
| proxyHeaders | Record<string, string> | — | Auth headers for proxy. |
| voiceProxyUrl | string | — | Dedicated proxy for Voice Mode WebSockets. |
| voiceProxyHeaders | Record<string, string> | — | Auth headers for voice proxy. |
| model | string | Provider default | Model name (e.g. gemini-2.5-flash, gpt-4.1-mini). |
| navRef | NavigationContainerRef | — | Navigation ref for auto-navigation. |
| maxSteps | number | 25 | Max agent steps per task. |
| maxTokenBudget | number | — | Max total tokens before auto-stopping the agent loop. |
| maxCostUSD | number | — | Max estimated cost (USD) before auto-stopping. |
| showChatBar | boolean | true | Show the floating chat bar. |
| enableVoice | boolean | true | Enable voice mode tab. |
| enableUIControl | boolean | true | When false, AI becomes knowledge-only. |
| screenMap | ScreenMap | — | Pre-generated screen map from generate-map CLI. |
| useScreenMap | boolean | true | Set false to disable screen map without removing the prop. |
| instructions | { system?, getScreenInstructions? } | — | Custom system prompt + per-screen instructions. |
| customTools | Record<string, ToolDefinition \| null> | — | Override or remove built-in tools. |
| knowledgeBase | KnowledgeEntry[] \| KnowledgeRetriever | — | Domain knowledge the AI can query. |
| knowledgeMaxTokens | number | 2000 | Max tokens for knowledge results. |
| mcpServerUrl | string | — | WebSocket URL for MCP bridge. |
| accentColor | string | — | Accent color for the chat bar. |
| theme | ChatBarTheme | — | Full chat bar color customization. |
| onResult | (result) => void | — | Called when agent finishes. |
| onBeforeStep | (stepCount) => void | — | Called before each step. |
| onAfterStep | (history) => void | — | Called after each step. |
| onTokenUsage | (usage) => void | — | Token usage per step. |
| onAskUser | (question) => Promise<string> | — | Handle ask_user inline — agent waits for your response. |
| stepDelay | number | — | Delay between steps (ms). |
| router | { push, replace, back } | — | Expo Router instance. |
| pathname | string | — | Current pathname (Expo Router). |
| debug | boolean | false | Enable SDK debug logging. |
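The budget props (`maxTokenBudget`, `maxCostUSD`) pair naturally with `onTokenUsage` for cost monitoring. A minimal sketch — the per-token rates and the shape of the `usage` callback argument are placeholders, not real provider pricing or a documented type:

```typescript
// PLACEHOLDER rates in USD per 1M tokens — substitute your provider's actual pricing.
const RATES = { inputPerM: 0.30, outputPerM: 2.50 };

// Hypothetical helper: estimate spend from a usage object.
function estimateCostUSD(usage: { inputTokens: number; outputTokens: number }): number {
  return (
    (usage.inputTokens / 1_000_000) * RATES.inputPerM +
    (usage.outputTokens / 1_000_000) * RATES.outputPerM
  );
}

// Usage (assumed shape of the onTokenUsage argument):
// <AIAgent
//   maxTokenBudget={50_000}
//   maxCostUSD={0.25}
//   onTokenUsage={(usage) => console.log(`~$${estimateCostUSD(usage).toFixed(4)} this step`)}
// />
```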
🎨 Customization
```tsx
// Quick — one color:
<AIAgent accentColor="#6C5CE7" />

// Full theme:
<AIAgent
  accentColor="#6C5CE7"
  theme={{
    backgroundColor: 'rgba(44, 30, 104, 0.95)',
    inputBackgroundColor: 'rgba(255, 255, 255, 0.12)',
    textColor: '#ffffff',
    successColor: 'rgba(40, 167, 69, 0.3)',
    errorColor: 'rgba(220, 53, 69, 0.3)',
  }}
/>
```

useAction — Custom AI-Callable Business Logic
```tsx
import { Alert } from 'react-native';
import { useAction } from '@mobileai/react-native'; // or 'react-native-agentic-ai'

function CartScreen() {
  const { cart, clearCart, getTotal } = useCart();
  useAction('checkout', 'Place the order and checkout', {}, async () => {
    if (cart.length === 0) return { success: false, message: 'Cart is empty' };
    // Human-in-the-loop: AI pauses until user taps Confirm
    return new Promise((resolve) => {
      Alert.alert('Confirm Order', `Place order for $${getTotal()}?`, [
        { text: 'Cancel', onPress: () => resolve({ success: false, message: 'User denied.' }) },
        { text: 'Confirm', onPress: () => { clearCart(); resolve({ success: true, message: 'Order placed!' }); } },
      ]);
    });
  });
}
```

useAI — Headless / Custom Chat UI
```tsx
import { FlatList, Text, TextInput, View } from 'react-native';
import { useAI } from '@mobileai/react-native'; // or 'react-native-agentic-ai'

function CustomChat() {
  const { send, isLoading, status, messages } = useAI();
  return (
    <View style={{ flex: 1 }}>
      <FlatList data={messages} renderItem={({ item }) => <Text>{item.content}</Text>} />
      {isLoading && <Text>{status}</Text>}
      <TextInput onSubmitEditing={(e) => send(e.nativeEvent.text)} placeholder="Ask the AI..." />
    </View>
  );
}
```

Chat history persists across navigation. Override settings per-screen:

```tsx
const { send } = useAI({
  enableUIControl: false,
  onResult: (result) => router.push('/(tabs)/chat'),
});
```

🔒 Security & Production
Backend Proxy — Keep API Keys Secure
```tsx
<AIAgent
  proxyUrl="https://myapp.vercel.app/api/gemini"
  proxyHeaders={{ Authorization: `Bearer ${userToken}` }}
  voiceProxyUrl="https://voice-server.render.com" // only if text proxy is serverless
  navRef={navRef}
>
  <App />
</AIAgent>
```

`voiceProxyUrl` falls back to `proxyUrl` if not set. Only needed when your text API is on a serverless platform that can't hold WebSocket connections.
Next.js route handler (text mode):

```ts
// app/api/gemini/route.ts
import { NextResponse } from 'next/server';

export async function POST(req: Request) {
  const body = await req.json();
  const response = await fetch('https://generativelanguage.googleapis.com/...', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-goog-api-key': process.env.GEMINI_API_KEY! },
    body: JSON.stringify(body),
  });
  return NextResponse.json(await response.json());
}
```

Express proxy (text + voice WebSockets):

```js
const express = require('express');
const { createProxyMiddleware } = require('http-proxy-middleware');

const app = express();
const geminiProxy = createProxyMiddleware({
  target: 'https://generativelanguage.googleapis.com',
  changeOrigin: true,
  ws: true,
  pathRewrite: (path) => `${path}${path.includes('?') ? '&' : '?'}key=${process.env.GEMINI_API_KEY}`,
});
app.use('/v1beta/models', geminiProxy);

const server = app.listen(3000);
server.on('upgrade', geminiProxy.upgrade);
```

Element Gating — Hide Elements from AI
```tsx
<Pressable aiIgnore={true}><Text>Admin Panel</Text></Pressable>
```

Content Masking — Sanitize Before LLM Sees It

```tsx
<AIAgent transformScreenContent={(c) => c.replace(/\b\d{13,16}\b/g, '****-****-****-****')} />
```

Screen-Specific Instructions

```tsx
<AIAgent instructions={{
  system: 'You are a food delivery assistant.',
  getScreenInstructions: (screen) => screen === 'Cart' ? 'Confirm total before checkout.' : undefined,
}} />
```

Lifecycle Hooks
| Hook | When |
|------|------|
| onBeforeStep | Before each agent step |
| onAfterStep | After each step (with full history) |
| onBeforeTask | Before task execution |
| onAfterTask | After task completes |
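A sketch wiring all four hooks for logging; the callback argument shapes for `onBeforeTask` and `onAfterTask` are assumptions, since only the step hooks are typed in the props table:

```tsx
<AIAgent
  onBeforeTask={() => console.log('task starting')}
  onBeforeStep={(stepCount) => console.log(`step ${stepCount}`)}
  onAfterStep={(history) => console.log(`${history.length} steps so far`)}
  onAfterTask={() => console.log('task done')}
  navRef={navRef}
>
  <App />
</AIAgent>
```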
🛠️ Built-in Tools
| Tool | What it does |
|------|-------------|
| tap(index) | Tap any interactive element — buttons, switches, checkboxes, custom components |
| long_press(index) | Long-press an element to trigger context menus |
| type(index, text) | Type into a text input |
| scroll(direction, amount?) | Scroll content — auto-detects edge, rejects PagerView |
| slider(index, value) | Drag a slider to a specific value |
| picker(index, value) | Select a value from a dropdown/picker |
| date_picker(index, date) | Set a date on a date picker |
| navigate(screen) | Navigate to any screen |
| wait(seconds) | Wait for loading states before acting |
| capture_screenshot(reason) | Capture the screen as an image (requires react-native-view-shot) |
| done(text) | Finish the task with a response |
| ask_user(question) | Ask the user for clarification |
| query_knowledge(question) | Search the knowledge base |
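Built-in tools can be removed or extended through the `customTools` prop from the API reference (`Record<string, ToolDefinition | null>`, where `null` removes a tool). The exact `ToolDefinition` fields below are assumptions for illustration, not the documented type:

```tsx
<AIAgent
  customTools={{
    // null disables a built-in tool entirely
    capture_screenshot: null,
    // Hypothetical custom tool — field names are assumptions
    apply_coupon: {
      description: 'Apply a coupon code to the current cart',
      parameters: { code: 'string' },
      handler: async ({ code }) => ({ success: true, message: `Applied ${code}` }),
    },
  }}
  navRef={navRef}
>
  <App />
</AIAgent>
```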
📋 Requirements
- React Native 0.72+
- Expo SDK 49+ (or bare React Native)
- Gemini API key (free tier available), or
- OpenAI API key
Gemini is the default provider and powers all modes (text + voice). OpenAI is available as a text-mode alternative via `provider="openai"`. Voice mode uses `gemini-2.5-flash-native-audio-preview` (Gemini only).
📄 License
MIT © Mohamed Salah
👋 Let's connect — LinkedIn
