@mceachen/sqlite-vec
v0.3.2
A vector search SQLite extension that runs anywhere
sqlite-vec
[!NOTE] Community Fork Notice: This is a temporary fork of asg017/sqlite-vec.
- Created to merge pending upstream PRs and provide community support while the original author is unavailable.
- Once development resumes on the original repository, users are encouraged to switch back.
- All credit for the original implementation goes to Alex Garcia.
- Credit for creating the fork and merging PRs goes to Vlad Lasky.
This branch includes:
- Hardening (replaced atoi() with strtol(), vendor.sh SHA validation, pinned GHA versions, OIDC releases)
- Update to the latest SQLite release (should be no-op, and backward compatible)
- Node.js release added Alpine/MUSL x64/arm64 and Windows arm64 prebuilds
- Node.js package now includes all prebuilds (so no post-install scripts!)
Feel free to cherry-pick any changes back upstream -- commits are kept small and isolated for this purpose.
An extremely small, "fast enough" vector search SQLite extension that runs
anywhere! A successor to sqlite-vss
[!IMPORTANT]
sqlite-vec is pre-v1, so expect breaking changes!
- Store and query float, int8, and binary vectors in vec0 virtual tables
- Written in pure C, no dependencies, runs anywhere SQLite runs (Linux/MacOS/Windows, in the browser with WASM, Raspberry Pis, etc.)
- Store non-vector data in metadata, auxiliary, or partition key columns
Installing
From Original Package Registries
The original packages on PyPI, npm, RubyGems, and crates.io are maintained by the original author. For the latest features from this fork, see "Installing from This Fork" below.
| Language | Install | More Info |
|----------|---------|-----------|
| Python | pip install sqlite-vec | sqlite-vec with Python |
| Node.js | npm install sqlite-vec | sqlite-vec with Node.js |
| Ruby | gem install sqlite-vec | sqlite-vec with Ruby |
| Rust | cargo add sqlite-vec | sqlite-vec with Rust |
| Datasette | datasette install datasette-sqlite-vec | sqlite-vec with Datasette |
| rqlite | rqlited -extensions-path=sqlite-vec.tar.gz | sqlite-vec with rqlite |
| sqlite-utils | sqlite-utils install sqlite-utils-sqlite-vec | sqlite-vec with sqlite-utils |
Installing from This Fork
Install directly from GitHub to get the latest features from this community fork.
Available Languages
| Language | Install Latest (main branch) | Install Specific Version |
|----------|------------------------------|--------------------------|
| Go | go get github.com/vlasky/sqlite-vec/bindings/go/cgo@main | go get github.com/vlasky/sqlite-vec/bindings/go/[email protected] |
| Lua | luarocks install lsqlite3 then copy sqlite_vec.lua to your project. See Lua example | Download sqlite_vec.lua at v0.2.4-alpha |
| Node.js | npm install @mceachen/sqlite-vec | npm install @mceachen/[email protected] |
| Python | pip install git+https://github.com/vlasky/sqlite-vec.git | pip install git+https://github.com/vlasky/[email protected] |
| Ruby | gem 'sqlite-vec', git: 'https://github.com/vlasky/sqlite-vec' | gem 'sqlite-vec', git: 'https://github.com/vlasky/sqlite-vec', tag: 'v0.2.4-alpha' |
| Rust | cargo add sqlite-vec --git https://github.com/vlasky/sqlite-vec | cargo add sqlite-vec --git https://github.com/vlasky/sqlite-vec --tag v0.2.4-alpha |
Python Note: Requires Python built with loadable extension support (--enable-loadable-sqlite-extensions). If you encounter an error about extension support not being available:
- Use uv to create virtual environments (automatically uses system Python which typically has extension support)
- Or use system Python instead of pyenv/custom builds
- Or rebuild your Python with ./configure --enable-loadable-sqlite-extensions
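Before installing, it can help to check whether the Python you are running has extension support at all. The sketch below relies on the fact that CPython only exposes `Connection.enable_load_extension` when it was built with `--enable-loadable-sqlite-extensions`; the helper name `supports_loadable_extensions` is made up for this example.

```python
import sqlite3


def supports_loadable_extensions() -> bool:
    """Return True if this Python's sqlite3 module can load extensions.

    CPython omits enable_load_extension entirely when it was built
    without --enable-loadable-sqlite-extensions.
    """
    return hasattr(sqlite3.Connection, "enable_load_extension")


if supports_loadable_extensions():
    print("sqlite3 can load extensions")
else:
    print("sqlite3 was built without loadable extension support")
```

If the second message is printed, one of the three options above applies.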
Available version tags: See Releases
Build from Source
For direct C usage or other languages:
git clone https://github.com/vlasky/sqlite-vec.git
cd sqlite-vec
./scripts/vendor.sh # Download vendored dependencies
make loadable # Builds dist/vec0.so (or .dylib/.dll)
Not Yet Available
- Pre-built binaries via GitHub Releases
- Package registry publications (PyPI, npm, RubyGems, crates.io)
- Datasette/sqlite-utils plugins
For these, use the original packages until this fork's CI/CD is configured.
See the original documentation for detailed usage information.
What's New
See CHANGELOG.md for a complete list of improvements, bug fixes, and merged upstream PRs.
Basic Usage
Vector types: sqlite-vec supports three vector types with different trade-offs:
-- Float vectors (32-bit floating point, most common)
CREATE VIRTUAL TABLE vec_floats USING vec0(embedding float[384]);
-- Int8 vectors (8-bit integers, smaller memory footprint)
CREATE VIRTUAL TABLE vec_int8 USING vec0(embedding int8[384]);
-- Binary vectors (1 bit per dimension, maximum compression)
CREATE VIRTUAL TABLE vec_binary USING vec0(embedding bit[384]);
Usage example:
.load ./vec0
create virtual table vec_examples using vec0(
sample_embedding float[8]
);
-- vectors can be provided as JSON or in a compact binary format
insert into vec_examples(rowid, sample_embedding)
values
(1, '[0.279, -0.95, -0.45, -0.554, 0.473, 0.353, 0.784, -0.826]'),
(2, '[-0.156, -0.94, -0.563, 0.011, -0.947, -0.602, 0.3, 0.09]'),
(3, '[-0.559, 0.179, 0.619, -0.987, 0.612, 0.396, -0.319, -0.689]'),
(4, '[0.914, -0.327, -0.815, -0.807, 0.695, 0.207, 0.614, 0.459]'),
(5, '[0.072, 0.946, -0.243, 0.104, 0.659, 0.237, 0.723, 0.155]'),
(6, '[0.409, -0.908, -0.544, -0.421, -0.84, -0.534, -0.798, -0.444]'),
(7, '[0.271, -0.27, -0.26, -0.581, -0.466, 0.873, 0.296, 0.218]'),
(8, '[-0.658, 0.458, -0.673, -0.241, 0.979, 0.28, 0.114, 0.369]'),
(9, '[0.686, 0.552, -0.542, -0.936, -0.369, -0.465, -0.578, 0.886]'),
(10, '[0.753, -0.371, 0.311, -0.209, 0.829, -0.082, -0.47, -0.507]'),
(11, '[0.123, -0.475, 0.169, 0.796, -0.201, -0.561, 0.995, 0.019]'),
(12, '[-0.818, -0.906, -0.781, 0.255, 0.584, -0.156, -0.873, -0.237]'),
(13, '[0.992, 0.058, 0.942, 0.722, -0.977, 0.441, 0.363, 0.074]'),
(14, '[-0.466, 0.282, -0.777, -0.13, -0.093, 0.908, 0.752, -0.473]'),
(15, '[0.001, -0.643, 0.825, 0.741, -0.403, 0.278, 0.218, -0.694]'),
(16, '[0.525, 0.079, 0.557, 0.061, -0.999, -0.352, -0.961, 0.858]'),
(17, '[0.757, 0.663, -0.385, -0.884, 0.756, 0.894, -0.829, -0.028]'),
(18, '[-0.862, 0.521, 0.532, -0.743, -0.049, 0.1, -0.47, 0.745]'),
(19, '[-0.154, -0.576, 0.079, 0.46, -0.598, -0.377, 0.99, 0.3]'),
(20, '[-0.124, 0.035, -0.758, -0.551, -0.324, 0.177, -0.54, -0.56]');
-- Find 3 nearest neighbors using LIMIT
select
rowid,
distance
from vec_examples
where sample_embedding match '[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]'
order by distance
limit 3;
/*
┌───────┬──────────────────┐
│ rowid │ distance │
├───────┼──────────────────┤
│ 5 │ 1.16368770599365 │
│ 13 │ 1.75137972831726 │
│ 11 │ 1.83941268920898 │
└───────┴──────────────────┘
*/
How vector search works: The MATCH operator finds vectors similar to your query vector. In the example above, sample_embedding MATCH '[0.5, ...]' searches for vectors closest to [0.5, ...] and returns them ordered by distance (smallest = most similar).
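To make the distance column concrete, here is a minimal brute-force sketch in plain Python. It does not use sqlite-vec and is not how the extension is implemented. It assumes the distances in the example output are Euclidean (L2), and only copies four rows from the insert statement above. `VECTORS`, `l2_distance`, and `knn` are names invented for this illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

# A few rows copied from the insert statement above (rowid -> vector).
VECTORS: Dict[int, List[float]] = {
    2: [-0.156, -0.94, -0.563, 0.011, -0.947, -0.602, 0.3, 0.09],
    5: [0.072, 0.946, -0.243, 0.104, 0.659, 0.237, 0.723, 0.155],
    11: [0.123, -0.475, 0.169, 0.796, -0.201, -0.561, 0.995, 0.019],
    13: [0.992, 0.058, 0.942, 0.722, -0.977, 0.441, 0.363, 0.074],
}


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(query: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and keep the k closest."""
    scored = [(rowid, l2_distance(vec, query)) for rowid, vec in VECTORS.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


QUERY = [0.5] * 8
nearest = knn(QUERY, k=3)
for rowid, distance in nearest:
    print(rowid, round(distance, 5))
```

For these rows the sketch ranks rowids 5, 13, and 11 first, and the distances agree with the table above to about five decimal places. Small differences beyond that are expected because Python computes in 64-bit floats.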
Note: All vector similarity queries require LIMIT or k = ? (where k is the number of nearest neighbors to return). This prevents accidentally returning too many results on large datasets, since finding all vectors within a distance threshold requires calculating distance to every vector in the table.
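The usage example mentions that vectors can be provided as JSON or in a compact binary format. The sketch below assumes that the compact format for float vectors is simply the 32-bit floats packed back to back in native byte order. The per-vector sizes follow from the element widths listed under "Vector types". The helper names are illustrative, not part of this package's API.

```python
import struct
from typing import List


def serialize_float32(vector: List[float]) -> bytes:
    """Pack a list of floats as consecutive 32-bit floats (assumed blob layout)."""
    return struct.pack(f"{len(vector)}f", *vector)


def deserialize_float32(blob: bytes) -> List[float]:
    """Inverse of serialize_float32."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def bytes_per_vector(element_type: str, dimensions: int) -> int:
    """Raw storage per vector: 32, 8, or 1 bit(s) per dimension."""
    bits = {"float": 32, "int8": 8, "bit": 1}[element_type]
    return dimensions * bits // 8


blob = serialize_float32([0.5] * 8)
print(len(blob))  # 8 dimensions * 4 bytes

for kind in ("float", "int8", "bit"):
    print(kind, bytes_per_vector(kind, 384))
```

Under that assumption, a blob like this would be bound as a query parameter wherever the examples use a JSON string.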
Advanced Usage
This fork adds several powerful features for production use:
Distance Constraints for KNN Queries
Filter results by distance thresholds using >, >=, <, <= operators on the distance column:
-- KNN query with distance constraint
-- Requests k=10 neighbors, but only returns those with distance < 1.5
select rowid, distance
from vec_examples
where sample_embedding match '[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]'
and k = 10
and distance < 1.5
order by distance;
/*
┌───────┬──────────────────┐
│ rowid │ distance │
├───────┼──────────────────┤
│ 5 │ 1.16368770599365 │
└───────┴──────────────────┘
*/
-- KNN query with range constraint: find vectors in a specific distance range
select rowid, distance
from vec_examples
where sample_embedding match '[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]'
and k = 20
and distance between 1.5 and 2.0
order by distance;
/*
┌───────┬──────────────────┐
│ rowid │ distance │
├───────┼──────────────────┤
│ 13 │ 1.75137972831726 │
│ 11 │ 1.83941268920898 │
│ 7 │ 1.89339029788971 │
│ 8 │ 1.92658650875092 │
│ 10 │ 1.93983662128448 │
└───────┴──────────────────┘
*/
Cursor-based Pagination
Instead of using OFFSET (which is slow for large datasets), you can use the last result's distance value as a 'cursor' to fetch the next page. This is more efficient because you're filtering directly rather than skipping rows.
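The paging loop itself can be sketched in a few lines of Python. To stay self-contained, the sketch swaps the real KNN query for a stand-in, `fetch_page`, that filters a hard-coded list of (rowid, distance) pairs taken from the example outputs in this README. In real code that function would run the `k = ?` query with `distance > ?`. All names here are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

Row = Tuple[int, float]  # (rowid, distance)

# Distances copied from the example outputs in this README.
EXAMPLE_ROWS: List[Row] = [
    (5, 1.16368770599365),
    (13, 1.75137972831726),
    (11, 1.83941268920898),
    (7, 1.89339029788971),
    (8, 1.92658650875092),
    (10, 1.93983662128448),
]


def fetch_page(after: Optional[float], k: int) -> List[Row]:
    """Stand-in for the KNN query: k nearest rows with distance > after."""
    rows = sorted(EXAMPLE_ROWS, key=lambda row: row[1])
    if after is not None:
        rows = [row for row in rows if row[1] > after]
    return rows[:k]


def iter_pages(k: int) -> Iterator[List[Row]]:
    """Yield pages until the query comes back empty."""
    cursor: Optional[float] = None
    while True:
        page = fetch_page(cursor, k)
        if not page:
            return
        yield page
        cursor = page[-1][1]  # last distance becomes the next cursor


pages = list(iter_pages(k=3))
for page in pages:
    print([rowid for rowid, _ in page])
```

One caveat of this design: with a strict `>` comparison, rows whose distance is exactly equal to the cursor value are skipped, so exact ties at a page boundary may need extra handling.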
-- First page: get initial results
select rowid, distance
from vec_examples
where sample_embedding match '[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]'
and k = 3
order by distance;
/*
┌───────┬──────────────────┐
│ rowid │ distance │
├───────┼──────────────────┤
│ 5 │ 1.16368770599365 │
│ 13 │ 1.75137972831726 │
│ 11 │ 1.83941268920898 │
└───────┴──────────────────┘
*/
-- Next page: use last distance as cursor (distance > 1.83941268920898)
select rowid, distance
from vec_examples
where sample_embedding match '[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]'
and k = 3
and distance > 1.83941268920898
order by distance;
/*
┌───────┬──────────────────┐
│ rowid │ distance │
├───────┼──────────────────┤
│ 7 │ 1.89339029788971 │
│ 8 │ 1.92658650875092 │
│ 10 │ 1.93983662128448 │
└───────┴──────────────────┘
*/
Space Reclamation with Optimize
optimize compacts vec shadow tables. To shrink the database file:
-- Before creating vec tables: enable autovacuum and apply it (recommended)
PRAGMA auto_vacuum = FULL; -- or INCREMENTAL
VACUUM; -- activates the setting
-- Use WAL for better concurrency
PRAGMA journal_mode = WAL;
After deletes, reclaim space:
-- Compact shadow tables
INSERT INTO vec_examples(vec_examples) VALUES('optimize');
-- Flush WAL
PRAGMA wal_checkpoint(TRUNCATE);
-- Reclaim freed pages (if using auto_vacuum=INCREMENTAL)
PRAGMA incremental_vacuum;
-- If you did NOT enable autovacuum, run VACUUM (after checkpoint) to shrink the file.
-- With autovacuum on, VACUUM is optional.
VACUUM;
VACUUM should not corrupt vec tables; a checkpoint first is recommended when
using WAL so the rewrite starts from a clean state.
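The effect of auto_vacuum=INCREMENTAL can be observed with Python's built-in sqlite3 module. This is only a sketch of the SQLite side of the process: an ordinary table stands in for the vec shadow tables, the optimize step is left out because it needs the extension loaded, and WAL is not enabled. Table and file names are made up for the demo.

```python
import os
import sqlite3
import tempfile


def page_count(conn: sqlite3.Connection) -> int:
    """Number of pages currently in the database file."""
    return conn.execute("PRAGMA page_count").fetchone()[0]


path = os.path.join(tempfile.mkdtemp(), "reclaim_demo.db")
conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode

# auto_vacuum has to be chosen before the first table exists
# (otherwise a VACUUM is needed to apply it).
conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
conn.execute("CREATE TABLE payloads(id INTEGER PRIMARY KEY, data BLOB)")

# Fill the file with some bulky rows in a single transaction.
conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO payloads(data) VALUES (?)",
    [(b"x" * 4096,) for _ in range(200)],
)
conn.execute("COMMIT")
pages_full = page_count(conn)

# Deleting rows moves their pages to the freelist; the file does not shrink yet.
conn.execute("DELETE FROM payloads")
free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]

# fetchall() drives the pragma to completion; pages are released as it is stepped.
conn.execute("PRAGMA incremental_vacuum").fetchall()
pages_after_vacuum = page_count(conn)
conn.close()

print(pages_full, free_pages, pages_after_vacuum)
```

The page count only drops once incremental_vacuum has run, which is why the SQL steps above call it explicitly after the deletes.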
Sponsors
[!NOTE] The sponsors listed below support the original asg017/sqlite-vec project by Alex Garcia, not this community fork.
Development of the original sqlite-vec is supported by multiple generous sponsors! Mozilla
is the main sponsor through the new Builders project.
sqlite-vec is also sponsored by the following companies:
As well as multiple individual supporters on GitHub Sponsors!
If your company is interested in sponsoring sqlite-vec development, send me an
email to get more info: https://alexgarcia.xyz
See Also
- sqlite-ecosystem, Maybe more 3rd party SQLite extensions I've developed
- sqlite-rembed, Generate text embeddings from remote APIs like OpenAI/Nomic/Ollama, meant for testing and SQL scripts
- sqlite-lembed, Generate text embeddings locally from embedding models in the .gguf format
