
@cantoo/capacitor-llama v0.1.2

Capacitor binding of llama.cpp

capacitor-llama

Capacitor implementation for llama models

Install

npm i @cantoo/capacitor-llama
npx cap sync
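
The snippet below is a minimal quick-start sketch, assuming the plugin is exposed as a named export called CapacitorLlama (the export name is an assumption; check the package's exports) and that a GGUF model already exists at the given path.

import { CapacitorLlama } from '@cantoo/capacitor-llama'; // export name is an assumption

async function runQuickCompletion(): Promise<void> {
  const contextId = 1;

  // Load a model into a new context (absolute file path on Android/iOS/Electron, URL on web).
  const info = await CapacitorLlama.initContext({
    id: contextId,
    model: '/path/to/my-model.gguf', // hypothetical path
    n_ctx: 2048,
  });
  console.log('GPU offload:', info.gpu, info.reasonNoGPU);

  // Run a single completion against that context.
  const result = await CapacitorLlama.completion({
    id: contextId,
    params: {
      messages: [{ role: 'user', content: 'Write a haiku about the sea.' }],
      n_predict: 128,
      emit_partial_completion: false,
    },
  });
  console.log(result.content);

  // Free the native resources when done.
  await CapacitorLlama.releaseContext({ id: contextId });
}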

API

initContext(...)

initContext(options: ContextParams & { id: number; }) => Promise<NativeLlamaContext>

| Param   | Type                            |
| ------- | ------------------------------- |
| options | ContextParams & { id: number; } |

Returns: Promise<NativeLlamaContext>
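
For web builds the model field takes a URL rather than a file path (see NativeContextParams below). A hedged sketch, reusing the hypothetical CapacitorLlama export from the install section; the example URL comes from the NativeContextParams docs:

const webContext = await CapacitorLlama.initContext({
  id: 2,
  // On web, the model is downloaded from a URL.
  model: 'https://huggingface.co/bartowski/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/Qwen2.5-0.5B-Instruct-Q5_K_S.gguf',
  readFromCache: true, // web only: reuse a previously cached download when available
  n_ctx: 1024,
  cache_type_k: 'q8_0', // quantized KV cache (experimental in llama.cpp)
  cache_type_v: 'q8_0',
});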


completion(...)

completion(options: { id: number; params: CompletionParams; }) => Promise<NativeCompletionResult>

| Param   | Type                                      |
| ------- | ----------------------------------------- |
| options | { id: number; params: CompletionParams; } |

Returns: Promise<NativeCompletionResult>


stopCompletion(...)

stopCompletion(options: { id: number; }) => Promise<void>

| Param   | Type            |
| ------- | --------------- |
| options | { id: number; } |
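
A sketch of a cancel handler, assuming a completion is currently running in context 1 and reusing the hypothetical CapacitorLlama export from above:

// Hypothetical "stop generating" button handler; the id must match the running completion's context.
async function onStopPressed(): Promise<void> {
  await CapacitorLlama.stopCompletion({ id: 1 });
}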


releaseContext(...)

releaseContext(options: { id: number; }) => Promise<void>

| Param   | Type            |
| ------- | --------------- |
| options | { id: number; } |


releaseAllContexts()

releaseAllContexts() => Promise<void>

tokenize(...)

tokenize(options: { id: number; text: string; specialTokens?: boolean; }) => Promise<{ tokens: number[]; }>

| Param   | Type                                                   |
| ------- | ------------------------------------------------------ |
| options | { id: number; text: string; specialTokens?: boolean; } |

Returns: Promise<{ tokens: number[]; }>


detokenize(...)

detokenize(options: { id: number; tokens: number[]; }) => Promise<{ text: string; }>

| Param   | Type                              |
| ------- | --------------------------------- |
| options | { id: number; tokens: number[]; } |

Returns: Promise<{ text: string; }>
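
A round-trip sketch combining tokenize and detokenize on an already-initialized context (hypothetical CapacitorLlama export as above); the exact effect of specialTokens is an assumption, presumably controlling whether special tokens are included:

const { tokens } = await CapacitorLlama.tokenize({
  id: 1,
  text: 'Hello, llama!',
  specialTokens: false,
});
const { text } = await CapacitorLlama.detokenize({ id: 1, tokens });
console.log(tokens.length, text); // text should round-trip back to the original input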


getVocab(...)

getVocab(options: { id: number; }) => Promise<{ vocab: string[]; }>

| Param   | Type            |
| ------- | --------------- |
| options | { id: number; } |

Returns: Promise<{ vocab: string[]; }>


addListener('onToken', ...)

addListener(eventName: 'onToken', listenerFunc: TokenCallback) => Promise<PluginListenerHandle>

| Param        | Type          |
| ------------ | ------------- |
| eventName    | 'onToken'     |
| listenerFunc | TokenCallback |

Returns: Promise<PluginListenerHandle>
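
A streaming sketch: register the onToken listener before starting a completion. That emit_partial_completion must be true for onToken events to fire is an assumption based on the field name; the CapacitorLlama export is hypothetical as above.

let output = '';

// Subscribe to per-token events for context 1.
const handle = await CapacitorLlama.addListener('onToken', (event) => {
  if (event.contextId === 1) {
    output += event.tokenResult.token; // append to the UI as tokens arrive
  }
});

await CapacitorLlama.completion({
  id: 1,
  params: {
    messages: [{ role: 'user', content: 'Tell me a short story.' }],
    emit_partial_completion: true, // assumed to enable per-token events
  },
});

await handle.remove(); // detach this listener (removeAllListeners() clears every listener)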


removeAllListeners()

removeAllListeners() => Promise<void>

Removes all listeners


Interfaces

PluginListenerHandle

| Prop   | Type                |
| ------ | ------------------- |
| remove | () => Promise<void> |

Type Aliases

NativeLlamaContext

{
  model: {
    desc: string;
    size: number;
    nEmbd: number;
    nParams: number;
    nVocab: number;
    chatTemplates: {
      llamaChat: boolean; // Chat template in llama-chat.cpp
      minja: { // Chat template supported by minja.hpp
        default: boolean;
        defaultCaps: {
          tools: boolean;
          toolCalls: boolean;
          toolResponses: boolean;
          systemRole: boolean;
          parallelToolCalls: boolean;
          toolCallId: boolean;
        };
        toolUse: boolean;
        toolUseCaps: {
          tools: boolean;
          toolCalls: boolean;
          toolResponses: boolean;
          systemRole: boolean;
          parallelToolCalls: boolean;
          toolCallId: boolean;
        };
      };
    };
    metadata: Record<string, unknown>;
    isChatTemplateSupported: boolean; // Deprecated
  };
  /** Loaded library name for Android */
  androidLib?: string;
  gpu: boolean;
  reasonNoGPU: string;
}
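
A small sketch of reading this structure from initContext(), for example to decide whether tool calling is worth attempting; the CapacitorLlama export and model path are hypothetical, as above.

const info = await CapacitorLlama.initContext({ id: 1, model: '/path/to/my-model.gguf' });

console.log(info.model.desc, info.model.nParams, info.model.nVocab);
if (!info.gpu) {
  console.warn('No GPU offload:', info.reasonNoGPU);
}

// The minja template capabilities describe what the model's default chat template supports.
const caps = info.model.chatTemplates.minja.defaultCaps;
const canUseTools = caps.tools && caps.toolCalls;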

Record

Construct a type with a set of properties K of type T

{ [P in K]: T; }

ContextParams

Omit<NativeContextParams, 'cache_type_k' | 'cache_type_v' | 'pooling_type'> & {
  cache_type_k?: 'f16' | 'f32' | 'q8_0' | 'q4_0' | 'q4_1' | 'iq4_nl' | 'q5_0' | 'q5_1';
  cache_type_v?: 'f16' | 'f32' | 'q8_0' | 'q4_0' | 'q4_1' | 'iq4_nl' | 'q5_0' | 'q5_1';
  pooling_type?: 'none' | 'mean' | 'cls' | 'last' | 'rank';
}

Omit

Construct a type with the properties of T except for those in type K.

Pick<T, Exclude<keyof T, K>>

Pick

From T, pick a set of properties whose keys are in the union K

{ [P in K]: T[P]; }

Exclude

Exclude from T those types that are assignable to U

T extends U ? never : T

NativeContextParams

{
  /**
   * For Android, iOS and Electron the model is an absolute path. Example: /path/to/my-model.gguf
   *
   * For web the model is a URL. Example: https://huggingface.co/bartowski/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/Qwen2.5-0.5B-Instruct-Q5_K_S.gguf
   */
  model: string;
  /**
   * Note: This option is available only on web.
   *
   * Controls whether to load the model from the browser cache.
   *
   * - If true (default), the model manager first tries to load the model from cache.
   *   If it's not cached, it will download the model and cache it for future use.
   *
   * - If false, the model will always be downloaded, ignoring any cached versions.
   */
  readFromCache?: boolean;
  /** Chat template to override the default one from the model. */
  chat_template?: string;
  reasoning_format?: string;
  is_model_asset?: boolean;
  use_progress_callback?: boolean;
  n_ctx?: number;
  n_batch?: number;
  n_ubatch?: number;
  n_threads?: number;
  /** Number of layers to store in VRAM (currently only for iOS) */
  n_gpu_layers?: number;
  /** Skip GPU devices (iOS only) */
  no_gpu_devices?: boolean;
  /** Enable flash attention, only recommended on GPU devices (experimental in llama.cpp) */
  flash_attn?: boolean;
  /** KV cache data type for the K (experimental in llama.cpp) */
  cache_type_k?: string;
  /** KV cache data type for the V (experimental in llama.cpp) */
  cache_type_v?: string;
  use_mlock?: boolean;
  use_mmap?: boolean;
  vocab_only?: boolean;
  /** Single LoRA adapter path */
  lora?: string;
  /** Single LoRA adapter scale */
  lora_scaled?: number;
  /** LoRA adapter list */
  lora_list?: { path: string; scaled?: number }[];
  rope_freq_base?: number;
  rope_freq_scale?: number;
  pooling_type?: number;
  // Embedding params
  embedding?: boolean;
  embd_normalize?: number;
}

NativeCompletionResult

{
  /** Original text (ignores reasoning_content / tool_calls) */
  text: string;
  /** Reasoning content (parsed for reasoning models) */
  reasoning_content: string;
  /** Content text (text filtered by reasoning_content / tool_calls) */
  content: string;
  completion_probabilities?: NativeCompletionTokenProb[];
}

NativeCompletionTokenProb

{ content: string; probs: NativeCompletionTokenProbItem[]; }

NativeCompletionTokenProbItem

{ tok_str: string; prob: number; }

CompletionParams

Omit<NativeCompletionParams, 'prompt'> & CompletionBaseParams

NativeCompletionParams

{
  prompt: string;
  n_threads?: number;
  /**
   * JSON schema to convert to a grammar for structured JSON output.
   * It will be overridden by grammar if both are set.
   */
  json_schema?: string;
  /** Set grammar for grammar-based sampling. Default: no grammar */
  grammar?: string;
  /** Lazy grammar sampling, triggered by grammar_triggers. Default: false */
  grammar_lazy?: boolean;
  /** Lazy grammar triggers. Default: [] */
  grammar_triggers?: { type: number; value: string; token: number; }[];
  preserved_tokens?: string[];
  chat_format?: number;
  /**
   * Specify a JSON array of stopping strings.
   * These words will not be included in the completion, so make sure to add them to the prompt for the next iteration. Default: []
   */
  stop?: string[];
  /**
   * Set the maximum number of tokens to predict when generating text.
   * Note: May exceed the set limit slightly if the last token is a partial multibyte character.
   * When 0, no tokens will be generated but the prompt is evaluated into the cache. Default: -1, where -1 is infinity.
   */
  n_predict?: number;
  /**
   * If greater than 0, the response also contains the probabilities of the top N tokens for each generated token given the sampling settings.
   * Note that for temperature < 0 the tokens are sampled greedily but token probabilities are still being calculated via a simple softmax of the logits without considering any other sampler settings.
   * Default: 0
   */
  n_probs?: number;
  /** Limit the next token selection to the K most probable tokens. Default: 40 */
  top_k?: number;
  /** Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P. Default: 0.95 */
  top_p?: number;
  /** The minimum probability for a token to be considered, relative to the probability of the most likely token. Default: 0.05 */
  min_p?: number;
  /** Set the chance for token removal via the XTC sampler. Default: 0.0, which is disabled. */
  xtc_probability?: number;
  /** Set a minimum probability threshold for tokens to be removed via the XTC sampler. Default: 0.1 (> 0.5 disables XTC) */
  xtc_threshold?: number;
  /** Enable locally typical sampling with parameter p. Default: 1.0, which is disabled. */
  typical_p?: number;
  /** Adjust the randomness of the generated text. Default: 0.8 */
  temperature?: number;
  /** Last n tokens to consider for penalizing repetition. Default: 64, where 0 is disabled and -1 is ctx-size. */
  penalty_last_n?: number;
  /** Control the repetition of token sequences in the generated text. Default: 1.0 */
  penalty_repeat?: number;
  /** Repeat alpha frequency penalty. Default: 0.0, which is disabled. */
  penalty_freq?: number;
  /** Repeat alpha presence penalty. Default: 0.0, which is disabled. */
  penalty_present?: number;
  /** Enable Mirostat sampling, controlling perplexity during text generation. Default: 0, where 0 is disabled, 1 is Mirostat, and 2 is Mirostat 2.0. */
  mirostat?: number;
  /** Set the Mirostat target entropy, parameter tau. Default: 5.0 */
  mirostat_tau?: number;
  /** Set the Mirostat learning rate, parameter eta. Default: 0.1 */
  mirostat_eta?: number;
  /** Set the DRY (Don't Repeat Yourself) repetition penalty multiplier. Default: 0.0, which is disabled. */
  dry_multiplier?: number;
  /** Set the DRY repetition penalty base value. Default: 1.75 */
  dry_base?: number;
  /**
   * Tokens that extend repetition beyond this receive an exponentially increasing penalty:
   * multiplier * base ^ (length of repeating sequence before token - allowed length). Default: 2
   */
  dry_allowed_length?: number;
  /** How many tokens to scan for repetitions. Default: -1, where 0 is disabled and -1 is context size. */
  dry_penalty_last_n?: number;
  /** Specify an array of sequence breakers for DRY sampling. Only a JSON array of strings is accepted. Default: ['\n', ':', '"', '*'] */
  dry_sequence_breakers?: string[];
  /** Top n sigma sampling as described in the paper "Top-nσ: Not All Logits Are You Need" https://arxiv.org/pdf/2411.07641. Default: -1.0 (disabled) */
  top_n_sigma?: number;
  /** Ignore end of stream token and continue generating. Default: false */
  ignore_eos?: boolean;
  /**
   * Modify the likelihood of a token appearing in the generated text completion.
   * For example, use "logit_bias": [[15043,1.0]] to increase the likelihood of the token 'Hello', or "logit_bias": [[15043,-1.0]] to decrease its likelihood.
   * Setting the value to false, "logit_bias": [[15043,false]], ensures that the token Hello is never produced. The tokens can also be represented as strings,
   * e.g. [["Hello, World!",-0.5]] will reduce the likelihood of all the individual tokens that represent the string Hello, World!, just like the presence_penalty does.
   * Default: []
   */
  logit_bias?: [number, number | false][];
  /** Set the random number generator (RNG) seed. Default: -1, which is a random seed. */
  seed?: number;
  emit_partial_completion: boolean;
}
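
A sampling sketch built only from the fields documented above (hypothetical CapacitorLlama export as before); the values are illustrative, not recommendations:

const params = {
  messages: [{ role: 'user', content: 'List three facts about llamas.' }],
  n_predict: 256,          // cap on generated tokens (-1 means no limit)
  temperature: 0.7,        // default is 0.8
  top_k: 40,
  top_p: 0.95,
  min_p: 0.05,
  penalty_repeat: 1.1,     // values above 1.0 discourage repeated sequences
  stop: ['</s>'],          // stopping strings are not included in the output
  emit_partial_completion: false,
};
const result = await CapacitorLlama.completion({ id: 1, params });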

CompletionBaseParams

{
  prompt?: string;
  messages?: LlamaOAICompatibleMessage[];
  chat_template?: string;
  jinja?: boolean;
  tools?: Record<string, unknown>;
  parallel_tool_calls?: Record<string, unknown>;
  tool_choice?: string;
  response_format?: CompletionResponseFormat;
}

LlamaOAICompatibleMessage

{
  // TODO: which values are valid?
  role: string; // 'user' | 'prompt' | 'model';
  content?: string;
}

CompletionResponseFormat

{
  type: 'text' | 'json_object' | 'json_schema';
  json_schema?: {
    strict?: boolean;
    schema: object;
  };
  schema?: object; // for json_object type
}
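
A structured-output sketch using response_format with a JSON schema (hypothetical CapacitorLlama export as before); whether the loaded model honors the schema depends on its template and the grammar conversion, so treat this as an assumption:

const structured = await CapacitorLlama.completion({
  id: 1,
  params: {
    messages: [{ role: 'user', content: 'Give me one llama fact as JSON.' }],
    response_format: {
      type: 'json_schema',
      json_schema: {
        strict: true,
        schema: {
          type: 'object',
          properties: { fact: { type: 'string' } },
          required: ['fact'],
        },
      },
    },
    emit_partial_completion: false,
  },
});
console.log(JSON.parse(structured.content));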

TokenCallback

(event: TokenEvent): void

TokenEvent

{ contextId: number; tokenResult: TokenData; }

TokenData

{ token: string; completion_probabilities?: NativeCompletionTokenProb[]; }