# capacitor-gemma-3n
Capacitor plugin for running Google Gemma 3n on-device AI models on Android using LiteRT-LM.
## Features
- On-device inference - No internet required after model download
- Text generation with streaming support
- Audio transcription (batch processing)
- Image understanding (vision) - structure ready, implementation pending
- Multimodal inputs - text + audio + image
## Requirements
- Android 8.0+ (API 26+)
- ~4-5GB RAM available for E4B model
- Gemma 3n model file (`.litertlm` format)
- Capacitor 7.x
## Installation
```bash
npm install capacitor-gemma-3n
npx cap sync android
```

## Model Setup
Download the Gemma 3n model from HuggingFace or copy from Google AI Edge Gallery.
Place the model file at:

```
/sdcard/Android/data/YOUR_APP_ID/files/gemma-3n-E4B-it-int4.litertlm
```

The plugin searches for the model in these locations:

- `<internal_files>/gemma3n_models/gemma-3n-E4B-it-int4.litertlm`
- `<external_files>/gemma-3n-E4B-it-int4.litertlm`
- `/sdcard/Download/gemma-3n-E4B-it-int4.litertlm`
- Google AI Edge Gallery directory
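When debugging a missing model, it can help to log the concrete paths for your app id. Below is a hypothetical helper, not part of the plugin API: the `<internal_files>` prefix (`/data/data/<appId>/files`) is an assumption, `<external_files>` is the path shown above, and the Google AI Edge Gallery directory is device-specific so it is omitted.

```typescript
// Hypothetical helper: expand the search locations above for a concrete app id.
// The /data/data/<appId>/files prefix for <internal_files> is an assumption.
function candidateModelPaths(
  appId: string,
  fileName = 'gemma-3n-E4B-it-int4.litertlm'
): string[] {
  return [
    `/data/data/${appId}/files/gemma3n_models/${fileName}`, // <internal_files>
    `/sdcard/Android/data/${appId}/files/${fileName}`,      // <external_files>
    `/sdcard/Download/${fileName}`,                         // shared Downloads
  ];
}
```

You can then check each path (for example with `@capacitor/filesystem`'s `Filesystem.stat`) to see which one actually holds the model before calling `initialize()`.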
## Usage
### Check Availability
```typescript
import { Gemma3n } from 'capacitor-gemma-3n';

const { available, status, message } = await Gemma3n.isAvailable();
console.log('Gemma 3n available:', available);
```

### Initialize Model
```typescript
const result = await Gemma3n.initialize({
  variant: 'e4b', // 'e4b' (effective 4B params, more accurate) or 'e2b' (effective 2B, smaller)
  maxTokens: 1024
});

if (result.success) {
  console.log('Model loaded!');
}
```

### Generate Text
```typescript
const response = await Gemma3n.generateText({
  prompt: 'What is the capital of France?',
  systemPrompt: 'You are a helpful assistant.',
  maxTokens: 512
});

console.log(response.text);
console.log(`Generated in ${response.timeMs}ms`);
```

### Streaming Generation
```typescript
// Set up listener
const listener = await Gemma3n.addListener('streamResponse', (event) => {
  if (event.error) {
    console.error('Error:', event.error);
    return;
  }
  console.log('Partial:', event.partialText);
  if (event.done) {
    console.log('Complete!');
    listener.remove();
  }
});

// Start streaming
await Gemma3n.generateTextStream({
  prompt: 'Write a short story about a robot.',
  systemPrompt: 'You are a creative writer.'
});
```

### Audio Transcription
```typescript
// From base64 audio (WAV 16kHz mono recommended)
const response = await Gemma3n.generateFromAudio({
  audioBase64: 'BASE64_ENCODED_WAV',
  prompt: 'Transcribe this audio:'
});
console.log('Transcription:', response.text);

// Or with streaming output
const listener = await Gemma3n.addListener('streamResponse', (event) => {
  console.log(event.partialText);
  if (event.done) listener.remove();
});

await Gemma3n.generateFromAudioStream({
  audioBase64: 'BASE64_ENCODED_WAV',
  prompt: 'Transcribe this audio in French:'
});
```

### Get Model Info
```typescript
const info = await Gemma3n.getModelInfo();
console.log('Loaded:', info.loaded);
console.log('Variant:', info.variant);
console.log('Memory usage:', info.memoryUsageMB, 'MB');
console.log('Device info:', info.deviceInfo);
```

### Unload Model
```typescript
await Gemma3n.unloadModel();
```

## API Reference
### Methods
| Method | Description |
|--------|-------------|
| `isAvailable()` | Check if Gemma 3n is available on device |
| `initialize(options?)` | Load the model into memory |
| `generateText(options)` | Generate text response |
| `generateTextStream(options)` | Generate with streaming output |
| `generateFromAudio(options)` | Transcribe/process audio |
| `generateFromAudioStream(options)` | Audio with streaming output |
| `generateFromImage(options)` | Process image (coming soon) |
| `cancelGeneration(options)` | Cancel ongoing generation |
| `getModelInfo()` | Get model and device info |
| `unloadModel()` | Free model from memory |
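`cancelGeneration` has no usage example above. The sketch below tracks the live `sessionId` from `streamResponse` events so a UI "Stop" button can cancel mid-stream; note that the `{ sessionId }` option shape passed to `cancelGeneration` is an assumption, not confirmed by this README.

```typescript
// Shape mirrored from StreamResponseEvent in the Interfaces section below.
interface StreamEvent {
  sessionId: string;
  partialText: string;
  done: boolean;
  error?: string;
}

// Tracks the in-flight session id so a "Stop" button can cancel generation.
class StreamCanceller {
  private activeSessionId: string | null = null;

  onEvent(event: StreamEvent): void {
    // Forget the session once the stream finishes or errors out.
    this.activeSessionId = event.done || event.error ? null : event.sessionId;
  }

  // cancelFn is injected, e.g. (opts) => Gemma3n.cancelGeneration(opts);
  // the { sessionId } argument shape is assumed.
  cancel(cancelFn: (opts: { sessionId: string }) => void): boolean {
    if (!this.activeSessionId) return false;
    cancelFn({ sessionId: this.activeSessionId });
    this.activeSessionId = null;
    return true;
  }
}
```

Wire `onEvent` into the `streamResponse` listener shown under Streaming Generation, and call `cancel(...)` from your stop handler.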
### Events
| Event | Description |
|-------|-------------|
| `streamResponse` | Partial text during streaming |
| `downloadProgress` | Model download progress (future) |
### Interfaces
```typescript
interface InitializeOptions {
  variant?: 'e4b' | 'e2b';
  maxTokens?: number;
  temperature?: number;
  topK?: number;
}

interface GenerateTextOptions {
  prompt: string;
  systemPrompt?: string;
  maxTokens?: number;
}

interface GenerateFromAudioOptions {
  audioPath?: string;
  audioBase64?: string;
  prompt?: string;
  maxTokens?: number;
}

interface GenerateResponse {
  text: string;
  tokenCount: number;
  timeMs: number;
  truncated: boolean;
}

interface StreamResponseEvent {
  sessionId: string;
  partialText: string;
  done: boolean;
  error?: string;
}

interface ModelInfo {
  loaded: boolean;
  variant: 'e4b' | 'e2b' | null;
  memoryUsageMB: number;
  deviceInfo: {
    supportsGemma3n: boolean;
    availableMemoryMB: number;
    cpuCores: number;
    hasNPU: boolean;
  };
}
```

## Performance
Tested on Pixel 9 Pro XL:
- Model load time: ~5-10 seconds
- Generation speed: ~10-20 tokens/second (CPU)
- Memory usage: ~3-4GB
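The throughput figure above can be measured on your own device from any `GenerateResponse` (see Interfaces): tokens per second is `tokenCount / (timeMs / 1000)`. A small helper, assuming `timeMs > 0`:

```typescript
// Derive tokens/second from the tokenCount and timeMs fields of GenerateResponse.
function tokensPerSecond(response: { tokenCount: number; timeMs: number }): number {
  return response.tokenCount / (response.timeMs / 1000);
}

// e.g. after generateText():
// console.log(`${tokensPerSecond(response).toFixed(1)} tok/s`);
```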
## Limitations
- Android only - iOS not supported (no LiteRT-LM)
- Audio: Batch processing only, not real-time streaming STT
- Audio format: WAV 16kHz mono recommended
- Vision: Requires GPU backend (implementation pending)
- Model size: ~4.9GB for E4B variant
## Troubleshooting
### "Model not found"

Ensure the model file is in the correct location with the exact filename `gemma-3n-E4B-it-int4.litertlm`.
### "Out of memory"

Close other apps to free RAM. E4B needs ~4GB available.
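One way to catch this before it crashes is to compare `deviceInfo.availableMemoryMB` from `getModelInfo()` against the variant's footprint. A sketch; the 4096 MB threshold below is an assumption based on the "E4B needs ~4GB" note, not a documented plugin constant:

```typescript
// Pre-flight check using the deviceInfo shape from ModelInfo above.
function canLoadE4b(
  deviceInfo: { availableMemoryMB: number },
  requiredMB = 4096 // assumed threshold, per "E4B needs ~4GB available"
): boolean {
  return deviceInfo.availableMemoryMB >= requiredMB;
}

// Usage (with the plugin imported as in the examples above):
// const { deviceInfo } = await Gemma3n.getModelInfo();
// if (!canLoadE4b(deviceInfo)) {
//   // fall back to the smaller variant or ask the user to free memory
//   await Gemma3n.initialize({ variant: 'e2b' });
// }
```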
### Audio "miniaudio decoder error"

Convert audio to WAV format (16kHz, 16-bit, mono).
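If your recorder yields raw PCM samples rather than a WAV file, a minimal encoder can produce the recommended format directly. A sketch (standard 44-byte WAV header, 16-bit mono PCM; `btoa` is available in modern WebViews and Node 16+) — not part of the plugin API:

```typescript
// Hypothetical helper: encode 16 kHz mono Float32 samples as 16-bit PCM WAV
// and return base64 suitable for generateFromAudio({ audioBase64 }).
function encodeWavBase64(samples: Float32Array, sampleRate = 16000): string {
  const dataSize = samples.length * 2; // 16-bit = 2 bytes per sample
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeString = (offset: number, s: string) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);   // RIFF chunk size
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);             // fmt chunk size
  view.setUint16(20, 1, true);              // PCM format
  view.setUint16(22, 1, true);              // mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate
  view.setUint16(32, 2, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample
  writeString(36, 'data');
  view.setUint32(40, dataSize, true);

  // Clamp floats to [-1, 1] and convert to signed 16-bit little-endian.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }

  // Base64-encode the raw bytes.
  let binary = '';
  const bytes = new Uint8Array(buffer);
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}
```

The result can be passed as `audioBase64` to `generateFromAudio` or `generateFromAudioStream` as shown under Audio Transcription.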
### Kotlin version conflict

The plugin uses Kotlin 2.2.21 and avoids coroutines to prevent version conflicts with other plugins.
## Technical Notes
- Uses `java.util.concurrent.Executors` instead of Kotlin Coroutines (avoids version conflicts)
- Streaming via Capacitor's `notifyListeners`
- Engine configured with `audioBackend = Backend.CPU` for multimodal support
## License
MIT
