react-native-ai-core

v0.5.8

On-device LLM inference for React Native — Gemma 4, Gemma 3, Phi-4, Mistral, Llama 3 and more via Google AI Edge LiteRT-LM, ML Kit AICore and MediaPipe

⚠️ Development in Progress — This library is under active development. APIs may change between minor versions. Not recommended for production use yet.

On-device LLM inference for React Native using Google's AI Edge SDKs. Inference runs entirely on-device: once a model is present, no internet connection is required and prompts never leave the device.

Three backends, one API:

| Backend | Constant | When it activates | Hardware |
|---|---|---|---|
| LiteRT-LM | Engine.LITERTLM | modelPath points to a .litertlm file | CPU (any Android ≥ 26) |
| ML Kit AICore | Engine.AICORE | modelPath = '' | NPU (Pixel 9+, Android 14+) |
| MediaPipe | Engine.MEDIAPIPE | modelPath points to a .bin file | CPU (any Android ≥ 26) |
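The selection rules in the table can be sketched as a small pure function. This is only an illustration of the documented rules, not the library's own code: `inferEngine` and `EngineName` are hypothetical names, and the behaviour for any other file extension is not documented, so the sketch returns `null` for it.

```typescript
// String values as documented for the Engine constant.
type EngineName = 'aicore' | 'litertlm' | 'mediapipe';

// Hypothetical helper: predicts which backend a modelPath would select,
// using only the rules from the backend table.
function inferEngine(modelPath: string): EngineName | null {
  if (modelPath === '') return 'aicore';                   // empty path selects ML Kit AICore
  if (modelPath.endsWith('.litertlm')) return 'litertlm';  // LiteRT-LM model file
  if (modelPath.endsWith('.bin')) return 'mediapipe';      // MediaPipe model file
  return null; // assumption: anything else is treated as unsupported here
}
```

A check like this can be useful before calling initialize(), to show a clear error for an unexpected file type.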

Built with TurboModules (New Architecture) and a JSI bridge; tokens stream to JavaScript via NativeEventEmitter.


Requirements

| Requirement | Minimum |
|---|---|
| React Native | 0.76+ (New Architecture) |
| Android API | 26 (Android 8) |
| Kotlin | 1.9+ |

Testing requires a physical Android device. The Android emulator does not support the AI Edge SDKs.


Installation

npm install react-native-ai-core
# or
yarn add react-native-ai-core

Add to android/gradle.properties:

minSdkVersion=26

The native module is auto-linked. The library's AndroidManifest.xml is merged automatically and declares:

  • INTERNET — required for downloading models from HuggingFace Hub
  • FOREGROUND_SERVICE / FOREGROUND_SERVICE_DATA_SYNC — keeps the process alive during background generation (Android 14+)
  • READ_EXTERNAL_STORAGE (maxSdkVersion=32) — only required on Android ≤ 12L when using a custom model path outside the app's private directory

Quick Start

import AICore, { Engine, KnownModel } from 'react-native-ai-core';

// 1. Programmatic initialization — download model if needed, then initialize
await AICore.ensureModel(KnownModel.GEMMA4_2B, {
  hfToken: 'hf_…',          // only needed for gated models
  onProgress: (p) => console.log(`${Math.round(p.receivedBytes / p.totalBytes * 100)}%`),
  onStatus: (s) => console.log(s), // 'checking' | 'downloading' | 'initializing' | 'ready'
});

// 2. Generate a response
const answer = await AICore.generateResponse('What is React Native?');
console.log(answer);

// 3. Release resources when done
await AICore.release();
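The inline percentage calculation in onProgress divides by totalBytes, which would produce NaN or Infinity if the total were ever reported as 0. Whether the library can emit such an event is not documented, so the guard below is a precaution. `formatProgress` is a hypothetical helper; the event shape is copied from the DownloadProgressEvent type.

```typescript
// Same fields as the documented DownloadProgressEvent type.
interface DownloadProgressEvent {
  receivedBytes: number;
  totalBytes: number;
  bytesPerSecond: number;
  remainingMs: number;
}

// Hypothetical formatter for a progress label.
function formatProgress(p: DownloadProgressEvent): string {
  // Guard against a zero total so the percentage is never NaN or Infinity.
  const percent = p.totalBytes > 0
    ? Math.min(100, Math.round((p.receivedBytes / p.totalBytes) * 100))
    : 0;
  const mbPerSec = (p.bytesPerSecond / (1024 * 1024)).toFixed(1);
  const secondsLeft = Math.ceil(p.remainingMs / 1000);
  return `${percent}% (${mbPerSec} MB/s, ${secondsLeft}s left)`;
}
```

It can be passed to ensureModel() as `onProgress: (p) => console.log(formatProgress(p))`.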

Manual initialization (path-based)

// ML Kit AICore backend (Gemini Nano NPU, Pixel 9+)
await AICore.initialize('');

// LiteRT-LM backend (any .litertlm file)
await AICore.initialize('/data/user/0/com.myapp/files/ai-core-models/Gemma-4-E2B-it/.../gemma-4-E2B-it.litertlm');

// MediaPipe backend (any .bin file)
await AICore.initialize('/sdcard/Download/model.bin');

Streaming

const unsubscribe = AICore.generateResponseStream('Explain JSI in detail', {
  onToken:    (token, done) => console.log(token),
  onComplete: () => console.log('[done]'),
  onError:    (err) => console.error(err.code, err.message),
});

// Inside a useEffect callback, return the cleanup so listeners are removed on unmount
return () => unsubscribe();
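Most UIs need the accumulated text rather than individual tokens. A minimal sketch of a collector that builds the callbacks object is shown here; `createCollector` is a hypothetical helper, and the callback shapes are copied from the StreamCallbacks and AIError types. Ignoring tokens that arrive after completion is an assumption about desirable behaviour, not something the library documents.

```typescript
interface AIError { code: string; message: string; }
interface StreamCallbacks {
  onToken: (token: string, done: boolean) => void;
  onComplete: () => void;
  onError: (error: AIError) => void;
}

// Hypothetical helper: returns callbacks that accumulate tokens into one string
// and report the running text through onUpdate.
function createCollector(onUpdate: (text: string) => void) {
  let text = '';
  let finished = false;
  const callbacks: StreamCallbacks = {
    onToken: (token) => {
      if (finished) return; // ignore tokens arriving after completion or error
      text += token;
      onUpdate(text);
    },
    onComplete: () => { finished = true; },
    onError: () => { finished = true; },
  };
  return { callbacks, getText: () => text, isFinished: () => finished };
}
```

Usage would look like `AICore.generateResponseStream(prompt, createCollector(setText).callbacks)`, where `setText` is a React state setter.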

Structured output

import { z } from 'zod';
import { generateStructuredResponse } from 'react-native-ai-core';

const TicketSchema = z.object({
  category: z.enum(['bug', 'billing', 'feature']),
  priority: z.enum(['low', 'medium', 'high']),
  summary: z.string(),
  needsHuman: z.boolean(),
});

const ctrl = new AbortController();

const result = await generateStructuredResponse({
  prompt: 'Classify this support request.',
  input: { message: 'The app crashes when I try to export a PDF invoice.' },
  output: TicketSchema,
  signal: ctrl.signal,
});

API Reference

Core methods

| Method | Returns | Description |
|---|---|---|
| initialize(modelPath) | Promise<boolean> | Start the engine. Pass '' for AICore NPU, a .litertlm path for LiteRT-LM, or a .bin path for MediaPipe |
| release() | Promise<void> | Free the model from memory |
| resetConversation() | Promise<void> | Clear chat history without releasing the model |
| checkAvailability() | Promise<AvailabilityStatus> | Check device support for ML Kit AICore (NPU path) |
| generateResponse(prompt) | Promise<string> | Non-streaming inference. Resolves once the complete response is available |
| generateResponseStream(prompt, callbacks) | () => void | Streaming inference. Delivers tokens via NativeEventEmitter and returns a cleanup function |
| cancelGeneration() | Promise<void> | Stop an in-progress generateResponse or generateResponseStream call |
| generateResponseStateless(prompt) | Promise<string> | One-shot inference that does not add to the conversation history |
| generateStructuredResponse(options) | Promise<T> | Generate structured JSON output and validate it against a Zod schema |

Model management

| Method | Returns | Description |
|---|---|---|
| ensureModel(model, opts?) | Promise<void> | Full lifecycle helper: check on-device → download if missing → initialize. Accepts a KnownModelEntry |
| isModelDownloaded(model) | Promise<{ downloaded, path? }> | Check whether a model file is already on the device. Accepts a KnownModelEntry or a raw name string |
| getInitializedModel() | Promise<{ engine, modelPath } \| null> | Query the currently loaded engine and model path. Returns null when idle |
| fetchModelCatalog(version?) | Promise<ModelCatalogEntry[]> | Fetch the Google AI Edge model catalog from GitHub |
| downloadModel(entry, token?, onProgress?) | Promise<string> | Download a catalog entry from HuggingFace Hub. Returns the local file path |
| getDownloadedModels() | Promise<DownloadedModel[]> | List all models stored in the app's private files directory |
| cancelDownload() | void | Cancel an ongoing downloadModel call |
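For apps that want finer control than ensureModel(), the same check, download and initialize steps can be composed by hand. The sketch below takes the API as a parameter so it stays self-contained; `ModelApi` and `loadByName` are hypothetical names, the method signatures follow the table, and matching a downloaded model to a catalog entry by `name` is an assumption.

```typescript
interface ModelCatalogEntry { name: string; modelId: string; modelFile: string; commitHash: string; sizeInBytes: number; }
interface DownloadedModel { name: string; commitHash: string; fileName: string; path: string; sizeInBytes: number; }

// The subset of the documented API this sketch relies on.
interface ModelApi {
  getDownloadedModels(): Promise<DownloadedModel[]>;
  fetchModelCatalog(): Promise<ModelCatalogEntry[]>;
  downloadModel(entry: ModelCatalogEntry, token?: string): Promise<string>;
  initialize(modelPath: string): Promise<boolean>;
}

// Hypothetical manual lifecycle: reuse a local copy if present, otherwise
// look the model up in the catalog and download it, then initialize.
async function loadByName(api: ModelApi, name: string, token?: string): Promise<string> {
  const local = (await api.getDownloadedModels()).find((m) => m.name === name);
  let path = local?.path;
  if (path === undefined) {
    const entry = (await api.fetchModelCatalog()).find((e) => e.name === name);
    if (!entry) throw new Error(`Model not in catalog: ${name}`);
    path = await api.downloadModel(entry, token);
  }
  const ok = await api.initialize(path);
  if (!ok) throw new Error(`initialize() returned false for ${path}`);
  return path;
}
```

In an app, the default export of the library would be passed as `api`, assuming it exposes these methods with the listed signatures.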

System prompt

| Method | Returns | Description |
|---|---|---|
| setSystemPrompt(prompt) | Promise<void> | Inject a persistent system-level instruction that is prepended to every subsequent generation call. Has no visible effect in chat history |
| clearSystemPrompt() | Promise<void> | Remove the active system prompt |

Runtime configuration

| Method | Returns | Description |
|---|---|---|
| configure(options) | Promise<void> | Update inference parameters at runtime without reinitialising the engine. See AICoreConfig |

Token estimation

| Method | Returns | Description |
|---|---|---|
| getTokenCount(text) | Promise<number> | Estimate the token count for a given string. Useful for checking context-window budget before sending large inputs |
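A budget check built on the estimate from getTokenCount() might look like the sketch below. `fitsInContext` is a hypothetical helper; the context window size and the number of tokens reserved for the reply are values the caller has to supply, since no API for reading them is documented here.

```typescript
// Hypothetical budget check: does the prompt leave enough room for the reply?
function fitsInContext(
  promptTokens: number,      // e.g. the result of getTokenCount(text)
  contextWindow: number,     // model context size in tokens, supplied by the caller
  reservedForOutput: number, // tokens to keep free for the generated answer
): boolean {
  if (reservedForOutput >= contextWindow) {
    throw new RangeError('reservedForOutput must be smaller than contextWindow');
  }
  return promptTokens <= contextWindow - reservedForOutput;
}
```

For a 4 K context model, `fitsInContext(tokens, 4096, 596)` corresponds to a limit of 3500 prompt tokens.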


Enums & Constants

Engine

Type-safe backend selector. Used in KnownModelEntry and getInitializedModel().

import { Engine } from 'react-native-ai-core';

Engine.AICORE     // 'aicore'    — ML Kit NPU (Pixel 9+)
Engine.LITERTLM   // 'litertlm' — file-based LiteRT-LM (any Android ≥ 26)
Engine.MEDIAPIPE  // 'mediapipe' — file-based MediaPipe (any Android ≥ 26)

KnownModel

A curated registry of models from the Google AI Edge catalog. Pass any entry directly to ensureModel() or isModelDownloaded().

import { KnownModel } from 'react-native-ai-core';

await AICore.ensureModel(KnownModel.GEMMA4_2B, { hfToken });

| Key | Display name | Engine | Size | Notes |
|---|---|---|---|---|
| GEMINI_NANO | Gemini Nano | AICORE | — | Native NPU only (Pixel 9+). No download needed |
| GEMMA4_2B | Gemma 4 2B | LITERTLM | ~2.4 GB | Recommended. Multimodal (text/image/audio), 32 K context |
| GEMMA4_4B | Gemma 4 4B | LITERTLM | ~3.4 GB | Higher quality, requires ≥ 12 GB RAM |
| GEMMA3N_2B | Gemma 3n 2B | LITERTLM | ~3.4 GB | Multimodal (text/image/audio), 4 K context |
| GEMMA3N_4B | Gemma 3n 4B | LITERTLM | ~4.6 GB | Multimodal, requires ≥ 12 GB RAM |
| GEMMA3_1B | Gemma 3 1B | LITERTLM | ~0.55 GB | Fastest, lowest RAM requirement |
| QWEN25_1B5 | Qwen 2.5 1.5B | LITERTLM | ~1.5 GB | Good multilingual support |
| DEEPSEEK_R1_1B5 | DeepSeek R1 1.5B | LITERTLM | ~1.7 GB | Reasoning-capable distilled model |

All LITERTLM models are sourced from the google-ai-edge / litert-community organisations on HuggingFace. ensureModel() handles the download automatically. See fetchModelCatalog() to browse the full list.
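The size and RAM notes in the table can be used to filter models before downloading. The sketch below hard-codes those figures; `candidateModels` is a hypothetical helper, device RAM and free storage must be supplied by the caller, and treating models without a stated RAM requirement as having none is an assumption.

```typescript
interface ModelInfo { key: string; sizeGb: number; minRamGb?: number; }

// Figures copied from the KnownModel table (LITERTLM entries only).
const LITERTLM_MODELS: ModelInfo[] = [
  { key: 'GEMMA4_2B', sizeGb: 2.4 },
  { key: 'GEMMA4_4B', sizeGb: 3.4, minRamGb: 12 },
  { key: 'GEMMA3N_2B', sizeGb: 3.4 },
  { key: 'GEMMA3N_4B', sizeGb: 4.6, minRamGb: 12 },
  { key: 'GEMMA3_1B', sizeGb: 0.55 },
  { key: 'QWEN25_1B5', sizeGb: 1.5 },
  { key: 'DEEPSEEK_R1_1B5', sizeGb: 1.7 },
];

// Hypothetical filter: keys of models that fit the given RAM and free storage.
function candidateModels(deviceRamGb: number, freeStorageGb: number): string[] {
  return LITERTLM_MODELS
    .filter((m) => (m.minRamGb ?? 0) <= deviceRamGb) // no stated requirement counts as 0
    .filter((m) => m.sizeGb <= freeStorageGb)
    .map((m) => m.key);
}
```

The returned keys index into KnownModel, for example `KnownModel[key]`, assuming KnownModel is a plain object keyed as in the table.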


Types

export type AvailabilityStatus =
  | 'AVAILABLE'
  | 'AVAILABLE_NPU'
  | 'NEED_DOWNLOAD'
  | 'UNSUPPORTED';

export type Engine = 'aicore' | 'litertlm' | 'mediapipe';

export interface KnownModelEntry {
  name: string;          // Human-readable display name
  engine: Engine;
  modelId: string;       // HuggingFace repo ID
  sizeGb: number;        // Approximate download size
  catalogName?: string;  // Exact name in the Google AI Edge catalog JSON (= directory name on disk)
}

export interface EnsureModelOptions {
  hfToken?: string;                          // HuggingFace access token (for gated models)
  onProgress?: DownloadProgressCallback;     // Download progress events
  onStatus?: (status: EnsureModelStatus) => void; // 'checking' | 'downloading' | 'initializing' | 'ready'
  catalogVersion?: string;                   // Override catalog version (default: '1_0_11')
}

export type EnsureModelStatus =
  | 'checking'
  | 'downloading'
  | 'initializing'
  | 'ready';

export interface DownloadProgressEvent {
  receivedBytes: number;
  totalBytes: number;
  bytesPerSecond: number;
  remainingMs: number;
}

export type DownloadProgressCallback = (progress: DownloadProgressEvent) => void;

export interface ModelCatalogEntry {
  name: string;
  modelId: string;
  modelFile: string;
  commitHash: string;
  sizeInBytes: number;
  description?: string;
  minDeviceMemoryInGb?: number;
}

export interface DownloadedModel {
  name: string;
  commitHash: string;
  fileName: string;
  path: string;        // Absolute path — pass directly to initialize()
  sizeInBytes: number;
}

export interface StreamCallbacks {
  onToken:    (token: string, done: boolean) => void;
  onComplete: () => void;
  onError:    (error: AIError) => void;
}

export interface AIError {
  code:    string;
  message: string;
}

export interface AICoreConfig {
  inferenceTimeoutSec?: number; // [30–3600], default 420
  temperature?: number;         // [0.0–2.0], default 0.7
  topK?: number;                // [1–256], default 64
  maxContinuations?: number;    // [0–50], default 12
}

AICoreConfig

Passed to configure(). All fields are optional — omit any field (or pass -1) to keep the current value.

| Field | Type | Default | Range | Notes |
|---|---|---|---|---|
| inferenceTimeoutSec | number | 420 | 30–3600 | Seconds before inference is aborted. Increase for long-form responses (e.g. meal plans, code generation) |
| temperature | number | 0.7 | 0.0–2.0 | Sampling randomness. For LiteRT-LM: takes effect on the next resetConversation() or initialize() |
| topK | number | 64 | 1–256 | Top-K sampling. For LiteRT-LM: takes effect on the next resetConversation() or initialize() |
| maxContinuations | number | 12 | 0–50 | Maximum auto-continuation passes. Only applies to the aicore (Gemini Nano / ML Kit) engine |

// Increase timeout for a very long document summary
await AICore.configure({ inferenceTimeoutSec: 900 });

// More creative, less deterministic
await AICore.configure({ temperature: 1.2, topK: 80 });

// Mix — only the fields you pass are changed
await AICore.configure({ inferenceTimeoutSec: 600, temperature: 0.5 });
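The merge semantics described above (omitted fields and -1 keep the current value) can be modelled as a pure function, which is handy for validating user input before calling configure(). `applyPatch`, `RANGES` and `DEFAULTS` are hypothetical names. How the library itself treats out-of-range values is not documented; this sketch rejects them, which is an assumption.

```typescript
interface AICoreConfig {
  inferenceTimeoutSec?: number;
  temperature?: number;
  topK?: number;
  maxContinuations?: number;
}

// Ranges and defaults copied from the AICoreConfig table.
const RANGES: Record<keyof AICoreConfig, [number, number]> = {
  inferenceTimeoutSec: [30, 3600],
  temperature: [0, 2],
  topK: [1, 256],
  maxContinuations: [0, 50],
};
const DEFAULTS: Required<AICoreConfig> = {
  inferenceTimeoutSec: 420, temperature: 0.7, topK: 64, maxContinuations: 12,
};

// Hypothetical merge: returns a new config with only the supplied fields changed.
function applyPatch(current: Required<AICoreConfig>, patch: AICoreConfig): Required<AICoreConfig> {
  const next = { ...current };
  for (const key of Object.keys(RANGES) as (keyof AICoreConfig)[]) {
    const value = patch[key];
    if (value === undefined || value === -1) continue; // keep the current value
    const [min, max] = RANGES[key];
    if (value < min || value > max) {
      throw new RangeError(`${key} must be in [${min}, ${max}], got ${value}`);
    }
    next[key] = value;
  }
  return next;
}
```

Calling `applyPatch(DEFAULTS, { temperature: 1.2, topK: -1 })` changes only the temperature and leaves topK at 64.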

generateStructuredResponse options

| Option | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | Natural language instruction |
| input | unknown | — | Optional structured input object serialized to JSON before the prompt |
| inputSchema | ZodType | — | Optional Zod schema to validate input before generation |
| output | ZodType | — | Required. Schema to validate the model output |
| strategy | 'single' \| 'chunked' | 'single' | 'single' generates all JSON in one call; 'chunked' walks the schema field-by-field |
| maxRetries | number | 2 | Repair-prompt attempts when output fails schema validation |
| maxContinuations | number | 8 | Max continuation calls when JSON is truncated mid-output |
| timeoutMs | number | 300000 | Per-call timeout in ms |
| onProgress | (field, done) => void | — | Called per field during 'chunked' strategy |
| signal | AbortSignal | — | Cancel mid-generation. Rejects with Error { name: 'AbortError' } |
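To make the maxRetries option concrete, here is a generic validate-and-repair loop. It is a sketch of the general technique only, not the library's implementation: `generateWithRepair` and `Validator` are hypothetical, the wording of the repair prompt is invented for illustration, and the generator is passed in as a function so the example does not depend on the native module or on Zod.

```typescript
type Validator<T> = (raw: string) => { ok: true; value: T } | { ok: false; error: string };

// Hypothetical repair loop: one initial attempt plus up to maxRetries repair attempts.
async function generateWithRepair<T>(
  generate: (prompt: string) => Promise<string>,
  validate: Validator<T>,
  prompt: string,
  maxRetries = 2,
): Promise<T> {
  let currentPrompt = prompt;
  let lastError = '';
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await generate(currentPrompt);
    const result = validate(raw);
    if (result.ok) return result.value;
    lastError = result.error;
    // Feed the failure back to the model; this wording is illustrative only.
    currentPrompt =
      `${prompt}\n\nYour previous output was invalid (${lastError}).\n` +
      `Previous output:\n${raw}\nReturn corrected JSON only.`;
  }
  throw new Error(`Output still invalid after ${maxRetries} retries: ${lastError}`);
}
```

With a Zod schema, `validate` could wrap `schema.safeParse(JSON.parse(raw))` and map its result to the `ok` / `error` shape.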

Cancellation example:

const ctrl = new AbortController();
const promise = generateStructuredResponse({
  /* ...prompt, output and other options... */
  signal: ctrl.signal,
});
ctrl.abort(); // cancel from a button handler; the promise rejects with an AbortError

Programmatic initialization example

For apps that manage the full lifecycle in code (e.g. background services, non-chat apps), ensureModel() provides a one-call solution:

import AICore, { KnownModel } from 'react-native-ai-core';

await AICore.ensureModel(KnownModel.GEMMA3_1B, {
  onStatus: (s) => console.log(s),
  onProgress: ({ receivedBytes, totalBytes }) =>
    console.log(`${((receivedBytes / totalBytes) * 100).toFixed(0)}%`),
});

// Set a persistent persona
await AICore.setSystemPrompt(
  'You are a precise JSON extractor. Respond only with valid JSON, no markdown.'
);

// Estimate context budget
const tokens = await AICore.getTokenCount(myLargeDocument);
if (tokens > 3500) throw new Error('Document too large for context window');

// One-shot inference without polluting chat history
const json = await AICore.generateResponseStateless(
  `Extract invoice data from:\n${myLargeDocument}`
);

// Query what is currently loaded
const model = await AICore.getInitializedModel();
// { engine: 'litertlm', modelPath: '/data/user/0/...' }

await AICore.clearSystemPrompt();
await AICore.release();

useAICore Hook

A ready-to-use React hook in the example app handles:

  • Engine lifecycle (initialize / release)
  • Streaming with incremental message updates
  • Conversation history
  • Error state management
  • stopGeneration() to abort in-progress responses

See example/src/hooks/useAICore.ts.
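The "incremental message updates" part of such a hook usually comes down to an immutable state update per token. The reducer below is a sketch of that idea and is not taken from the example app; `ChatMessage` and `appendToken` are hypothetical names.

```typescript
interface ChatMessage { id: string; role: 'user' | 'assistant'; text: string; }

// Hypothetical reducer: appends a streamed token to the trailing assistant
// message, or starts a new assistant message if there is none yet.
function appendToken(messages: readonly ChatMessage[], token: string): ChatMessage[] {
  const last = messages[messages.length - 1];
  if (!last || last.role !== 'assistant') {
    return [...messages, { id: String(messages.length), role: 'assistant', text: token }];
  }
  // Copy instead of mutating so React detects the state change.
  return [...messages.slice(0, -1), { ...last, text: last.text + token }];
}
```

Inside a component this would typically be used as `onToken: (token) => setMessages((prev) => appendToken(prev, token))`.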


Backends

LiteRT-LM (Engine.LITERTLM)

Used when modelPath ends with .litertlm. Works on any Android ≥ 26. Models are downloaded from HuggingFace Hub (use ensureModel() or fetchModelCatalog() + downloadModel()) and stored in the app's private external directory — no special permissions required.

Dependency: com.google.ai.edge.litertlm:litertlm-android:0.10.0

ML Kit AICore (Engine.AICORE)

Used when modelPath = ''. Requires a Pixel 9 or newer, or another device that ships Android AICore. The model is managed by Google via system updates, so no file download is needed.

Dependency: com.google.mlkit:genai-prompt:1.0.0-beta2

MediaPipe Tasks GenAI (Engine.MEDIAPIPE)

Used when modelPath ends with .bin. Works on any Android ≥ 26. For local testing, a model file can be pushed to the device with adb:

adb push gemini-nano.bin /data/local/tmp/gemini-nano.bin

Dependency: com.google.mediapipe:tasks-genai:0.10.22


Roadmap

  • [x] LiteRT-LM backend — Gemma 4, Gemma 3n, Gemma 3, Qwen, DeepSeek and more
  • [x] ML Kit AICore backend — Gemini Nano NPU (Pixel 9+)
  • [x] MediaPipe Tasks GenAI backend
  • [x] Model catalog + download from HuggingFace Hub
  • [x] HuggingFace token stored securely (Android Keystore via expo-secure-store)
  • [x] Abort/cancel streaming mid-generation
  • [x] Structured JSON output with Zod validation and repair retries
  • [x] ensureModel() — programmatic check → download → initialize lifecycle
  • [x] setSystemPrompt() / clearSystemPrompt() — persistent session instruction
  • [x] getTokenCount() — context-window budget estimation
  • [x] generateResponseStateless() — one-shot inference without history pollution
  • [x] getInitializedModel() — query what is currently loaded
  • [x] KnownModel registry with Engine enum
  • [x] configure() — tune timeout, temperature, topK and continuation passes at runtime
  • [ ] iOS support (Core ML / Apple Neural Engine)
  • [ ] Model quantization options (INT4, INT8)
  • [ ] Web support (WebGPU / WASM)
  • [ ] Warm-up inference after initialize() to amortize JIT/page-cache cost

Contributing

Contributions, issues and feature requests are welcome.


License

MIT © Alberto Fernandez


Built with react-native-builder-bob