@cle-does-things/rag-rs
v0.2.1
A Rust-native implementation of the RAG stack
rag-rs
[!NOTE]
This software is released in alpha, so you can and should expect bugs. Please take a look at the Roadmap to see the plan to reach the first stable version.
A Rust-native implementation of the RAG stack, using:
- PdfExtract for PDF extraction
- memchunk for chunking
- BM25 for embedding
- Qdrant for storing
- async-openai for LLM generation
Moreover, it can be served as an API server, using:
- Axum as a web server framework
- tower-governor for rate limiting
- tower-http for CORS
- tracing and tracing_subscriber for logging
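BM25, listed above for embedding, weights each term of a document by how often it occurs, how rare it is across the corpus, and how long the document is. The block below is a minimal sketch of that weighting, not the code rag-rs runs: the function name `bm25_weight` is hypothetical, and the parameters `K1 = 1.2` and `B = 0.75` are common defaults assumed here for illustration.

```rust
/// BM25 weight of one term in one document (IDF variant that is never negative).
/// `tf`: occurrences of the term in the document.
/// `doc_len` / `avg_doc_len`: document length and average document length, in tokens.
/// `n_docs`: number of documents; `doc_freq`: documents containing the term.
fn bm25_weight(tf: f64, doc_len: f64, avg_doc_len: f64, n_docs: f64, doc_freq: f64) -> f64 {
    // Assumed defaults; rag-rs may use different values.
    const K1: f64 = 1.2;
    const B: f64 = 0.75;
    // Rare terms get a larger inverse document frequency.
    let idf = ((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0).ln();
    // Term frequency saturates, and long documents are penalised.
    let norm = tf * (K1 + 1.0) / (tf + K1 * (1.0 - B + B * doc_len / avg_doc_len));
    idf * norm
}

fn main() {
    let rare = bm25_weight(2.0, 100.0, 100.0, 1000.0, 10.0);
    let common = bm25_weight(2.0, 100.0, 100.0, 1000.0, 900.0);
    println!("rare term: {rare:.3}, common term: {common:.3}");
}
```

In a sparse embedding, each distinct term of a document would map to one vector index whose value is a weight of this kind.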
Installation and Usage
Install with cargo:
cargo install rag-rs
Install with npm:
npm install @cle-does-things/rag-rs@latest
load command
Parse, chunk and embed the documents in a given directory, and upload them to a vector store.
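The chunking step can be pictured as cutting the extracted text into pieces no larger than the configured chunk size. The sketch below makes two assumptions that the README does not confirm: that the size is counted in bytes, and that cuts fall on UTF-8 character boundaries. `chunk_text` is a hypothetical helper, and the strategy and API of memchunk may well differ.

```rust
/// Split `text` into chunks of at most `max_bytes` bytes each,
/// never cutting in the middle of a UTF-8 character.
fn chunk_text(text: &str, max_bytes: usize) -> Vec<&str> {
    // A UTF-8 character takes at most 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "max_bytes must fit any UTF-8 character");
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        let mut end = (start + max_bytes).min(text.len());
        // Step back until `end` falls on a character boundary.
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        chunks.push(&text[start..end]);
        start = end;
    }
    chunks
}

fn main() {
    for chunk in chunk_text("Retrieval-augmented generation", 8) {
        println!("{chunk:?}");
    }
}
```

Joining the chunks in order gives back the original text, so nothing is lost at the boundaries.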
Usage
rag-rs load [OPTIONS] --directory <DIRECTORY> --qdrant-url <QDRANT_URL> --collection-name <COLLECTION_NAME>
Options
-d, --directory <DIRECTORY>
The path to the directory containing the files for the RAG pipeline. (required)
--qdrant-url <QDRANT_URL>
URL for a Qdrant vector store instance. If your Qdrant instance needs an API key, make sure that it is available as QDRANT_API_KEY in your environment. (required)
--collection-name <COLLECTION_NAME>
Name of the collection for the Qdrant vector store. (required)
--chunk-size <CHUNK_SIZE>
Chunking size. Default: 1024
--cache-dir <CACHE_DIR>
Directory in which to cache the parsed files. Default: .rag-rs-cache/
--cache-chunk-size <CACHE_CHUNK_SIZE>
Chunk size for cached writes. Default: 1024 bytes
--no-cache
Deactivate reading from and writing to the cache. Caching is active by default.
-h, --help
Print help information.
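The options above expect the Qdrant API key in the QDRANT_API_KEY environment variable rather than on the command line. A minimal sketch of how such a key could be resolved follows; `resolve_api_key` is a hypothetical helper, not a function from rag-rs, and treating blank values as missing is an assumption. The optional first argument covers the case where a key may also be passed explicitly, as with OPENAI_API_KEY in the serve command.

```rust
use std::env;

/// Prefer an explicitly passed key, then fall back to the environment variable.
/// Blank values are treated as missing (an assumption of this sketch).
fn resolve_api_key(cli_value: Option<String>, env_var: &str) -> Option<String> {
    cli_value
        .or_else(|| env::var(env_var).ok())
        .filter(|key| !key.trim().is_empty())
}

fn main() {
    match resolve_api_key(None, "QDRANT_API_KEY") {
        Some(_) => println!("QDRANT_API_KEY found, connecting with authentication"),
        None => println!("QDRANT_API_KEY not set, connecting without authentication"),
    }
}
```

Keeping the key in the environment keeps it out of shell history and process listings, which is why the README advises against passing it as an option.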
Example
rag-rs load --directory data/ \
--chunk-size 2048 \
--qdrant-url http://localhost:6334 \
--collection-name test-data \
--cache-dir cache/ \
--cache-chunk-size 1048576
serve command
Serve the RAG application as an API server.
Usage
rag-rs serve [OPTIONS] --qdrant-url <QDRANT_URL> --collection-name <COLLECTION_NAME>
Options
--qdrant-url <QDRANT_URL>
URL of your Qdrant instance. If your Qdrant instance needs an API key, make sure that it is available as QDRANT_API_KEY in your environment. (required)
--collection-name <COLLECTION_NAME>
Name of the collection for the Qdrant vector store. (required)
--openai-api-key <OPENAI_API_KEY>
OpenAI API key. It is not advised to pass the key as an option to the CLI command: you should set it as the OPENAI_API_KEY environment variable.
-p, --port <PORT>
Port for the server to run on. Default: 8000
--host <HOST>
Host for the server to run on. Default: 0.0.0.0
--rate-limit-per-minute <RATE_LIMIT_PER_MINUTE>
Request rate limit per minute. Default: 100
--cors <CORS>
Allowed CORS origin (e.g. https://mydomain.com). Default: * (all origins allowed). While this argument has no effect for local development, it is advisable to set it for production deployments.
--log-level <LOG_LEVEL>
Logging level. Default: info
Available values: info, debug, error, warning, trace
--log-json
Whether or not to activate JSON logging. Default: false (uses compact logging by default)
-h, --help
Print help information.
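The rate limit option caps how many requests are served per minute. The sketch below illustrates the idea with a simple fixed one-minute window. It is a deliberately simplified stand-in and not a description of how tower-governor works internally; the `RateLimiter` type and its methods are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Fixed-window limiter: at most `limit` requests per `window`.
struct RateLimiter {
    limit: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl RateLimiter {
    fn per_minute(limit: u32, now: Instant) -> Self {
        RateLimiter {
            limit,
            window: Duration::from_secs(60),
            window_start: now,
            count: 0,
        }
    }

    /// Returns true if a request arriving at `now` is allowed.
    fn allow(&mut self, now: Instant) -> bool {
        // Start a fresh window once the current one has elapsed.
        if now.duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.count = 0;
        }
        if self.count < self.limit {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut limiter = RateLimiter::per_minute(30, start);
    let allowed = (0..40).filter(|_| limiter.allow(start)).count();
    println!("{allowed} of 40 requests allowed in the first minute");
}
```

Passing the current time in as an argument keeps the limiter easy to exercise without waiting for real time to pass.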
Example
rag-rs serve --qdrant-url http://localhost:6334 \
--collection-name test-data \
--host 127.0.0.1 \
--port 3000 \
--rate-limit-per-minute 30 \
--cors "http://mydomain.com" \
--log-level info \
--log-json
Limitations
- Currently supports only .pdf, .txt and .md files
- Does not go through the data directory recursively
- PDF extraction accounts only for text
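The first two limitations amount to a flat directory scan filtered by file extension. A minimal sketch of such a scan, using only the standard library, could look like this. `list_supported_files` is a hypothetical helper, and matching extensions case-insensitively is an assumption; rag-rs may behave differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Extensions listed under Limitations.
const SUPPORTED: [&str; 3] = ["pdf", "txt", "md"];

/// List supported files directly inside `dir`, without descending into subdirectories.
fn list_supported_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if !path.is_file() {
            continue; // subdirectories are skipped, not traversed
        }
        // Assumption: extensions are compared case-insensitively.
        let supported = path
            .extension()
            .and_then(|ext| ext.to_str())
            .map(|ext| SUPPORTED.contains(&ext.to_ascii_lowercase().as_str()))
            .unwrap_or(false);
        if supported {
            files.push(path);
        }
    }
    files.sort();
    Ok(files)
}

fn main() -> io::Result<()> {
    for file in list_supported_files(Path::new("."))? {
        println!("{}", file.display());
    }
    Ok(())
}
```

In practice this means documents kept in nested folders need to be moved to the top level of the data directory, or loaded with one run per folder.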
Roadmap
To reach the first stable version, this software will first:
- [X] Add a caching layer (v0.2.0-alpha)
- [X] Introduce thorough testing (v0.2.1)
- [X] Add an NPM-installable version (v0.2.1)
Moreover, for future releases, there will be:
- [ ] A programmatic API along with the CLI app, possibly both in Rust and Python
- [ ] Support for more text-based file formats, and possibly for more unstructured file formats
