Hikma Engine
A TypeScript-based code knowledge graph indexer that transforms Git repositories into searchable knowledge stores for AI agents. Creates interconnected representations of codebases through AST parsing and vector embeddings.
Features
- AST-based code structure extraction: Deep understanding of code relationships
- Vector embeddings: Semantic similarity search with multiple providers
- Configurable LLM providers: Support for local Python models, OpenAI API, and local services (LM Studio, Ollama)
- Python ML integration: Advanced embedding models via Python bridge
- Intelligent fallback system: Automatic fallback between providers for reliability
- Comprehensive monitoring: Request tracking, performance metrics, and error analysis
- Unified CLI: Single hikma-engine command for all operations (embed, search, rag)
- SQLite storage: Unified storage with the sqlite-vec extension
Installation
Prerequisites
- Node.js >= 20.0.0
- Git repository for indexing
- Python 3.10+ (for the python provider)
Clone and run the project
# Clone repository
git clone https://github.com/foyzulkarim/hikma-engine
cd hikma-engine
# Install dependencies
npm install
For the python provider, set up Python dependencies:
# After installing hikma-engine
npm run setup-python
CLI Usage
Hikma Engine provides three main commands (embed, search, and rag) with an explicit CLI approach that requires no configuration files.
Key Features
- No .env dependencies: All configuration is explicit via CLI flags
- Required provider: --provider is mandatory for all commands
- NPX-friendly: Works perfectly with npx without local installation
- Self-documenting: All options are visible and explicit
- Scriptable: Perfect for CI/CD and automation
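As a sketch of the CI/CD use case, a shell step could index the checked-out repository and then run a smoke-test search (assuming an Ollama instance is already reachable at the URL shown; the query and models are illustrative):
# Hypothetical CI step: index the repo, then verify search works
npx hikma-engine embed --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --dir .
npx hikma-engine search "entry point" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --dir . --limit 3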
Quick Examples
Using Python Provider (Local Models)
# Embed with Python provider
npm run embed -- --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
# Search with Python provider
npm run search -- "database configuration" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
# RAG with Python provider
npm run rag -- "How does authentication work?" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --llm-model "Qwen/Qwen2.5-Coder-1.5B-Instruct"Using Server Provider (Ollama/LM Studio)
# Embed with Ollama
npm run embed -- --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
# Search with Ollama
npm run search -- "database configuration" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
# RAG with Ollama
npm run rag -- "How does authentication work?" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --llm-model qwen2.5-coder:7b --max-tokens 3000Using NPX (No Local Installation)
# Works anywhere without installing hikma-engine locally
npx hikma-engine embed --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --dir /path/to/project
npx hikma-engine search "authentication" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --dir /path/to/project
npx hikma-engine rag "How does this work?" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --llm-model qwen2.5-coder:7b --dir /path/to/project
Required and Optional Flags
Required for All Commands
--provider <python|server|local|transformers>: REQUIRED - Specifies the AI provider to use
Required for Server Provider
--server-url <url>: REQUIRED when using --provider server - Base URL for an OpenAI-compatible server
Common Optional Flags
- --dir <path>: Project directory (defaults to current directory)
- --embedding-model <model>: Override the default embedding model
- --llm-model <model>: Override the default LLM model (for the rag command)
- --install-python-deps: Auto-install Python dependencies when using the Python provider
Command-Specific Flags
- embed: --force-full, --skip-embeddings
- search: --limit <n>, --min-similarity <0..1>
- rag: --top-k <n>, --max-tokens <n>
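For example, the command-specific flags combine with the common flags above (a sketch; queries and values are illustrative, and model flags are omitted so the provider defaults described below apply):
# Tighter search: at most 5 results with similarity >= 0.4
npm run search -- "token validation" --provider python --limit 5 --min-similarity 0.4
# RAG with more retrieved context and a longer answer
npm run rag -- "How is caching implemented?" --provider python --top-k 10 --max-tokens 2000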
Intelligent Defaults
When you specify a provider, Hikma Engine automatically selects appropriate default models:
- Python provider: mixedbread-ai/mxbai-embed-large-v1 (embedding), Qwen/Qwen2.5-Coder-1.5B-Instruct (LLM)
- Server provider: text-embedding-ada-002 (embedding), gpt-3.5-turbo (LLM)
- Local provider: Xenova/all-MiniLM-L6-v2 (embedding), Xenova/gpt2 (LLM)
- Transformers provider: Xenova/all-MiniLM-L6-v2 (embedding)
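This means the model flags can be omitted entirely; for instance, the following should be equivalent to spelling out the Python defaults above (a sketch relying on the documented defaults):
# Uses mixedbread-ai/mxbai-embed-large-v1 and Qwen/Qwen2.5-Coder-1.5B-Instruct implicitly
npm run embed -- --provider python
npm run rag -- "How does authentication work?" --provider python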
Directory Handling
Each project gets its own SQLite database stored in the project directory. You can work with multiple projects simultaneously:
# Index project A
npm run embed -- --provider python --dir /path/to/project-a
# Index project B
npm run embed -- --provider python --dir /path/to/project-b
# Search in specific project
npm run search -- "authentication" --provider python --dir /path/to/project-aConfiguration
Explicit CLI Approach (Recommended)
Hikma Engine now uses an explicit CLI approach that requires no configuration files. All settings are specified directly via command-line flags:
# Everything is explicit - no hidden configuration
npm run embed -- --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
npm run search -- "query" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
npm run rag -- "question" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --llm-model qwen2.5-coder:7bBenefits:
- ✅ No .env files needed - Everything is explicit
- ✅ NPX-friendly - Works without local installation
- ✅ Self-documenting - All options are visible
- ✅ Scriptable - Perfect for CI/CD pipelines
- ✅ No hidden state - What you see is what you get
Legacy Environment Variables (Optional)
For backward compatibility, you can still use environment variables by copying .env.example to .env. However, CLI flags take precedence over environment variables.
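For example, given a .env that sets an embedding model, a flag passed on the command line wins for that run (a sketch; the .env value here is hypothetical):
# .env (hypothetical): HIKMA_EMBEDDING_MODEL=some-old-model
# The CLI flag below overrides it:
npm run embed -- --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
To create the .env from the shipped template: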
cp .env.example .env
Main Configuration
- HIKMA_LOG_LEVEL: Logging level (debug, info, warn, error). Default: info
- HIKMA_SQLITE_PATH: SQLite database path. Default: ./data/metadata.db
- HIKMA_SQLITE_VEC_EXTENSION: sqlite-vec extension path. Default: ./extensions/vec0.dylib
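Put together, a minimal .env using these settings might look like this (a sketch that simply restates the documented defaults):
HIKMA_LOG_LEVEL=info
HIKMA_SQLITE_PATH=./data/metadata.db
HIKMA_SQLITE_VEC_EXTENSION=./extensions/vec0.dylib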
Legacy AI Configuration
These are only used when CLI flags are not provided:
Embedding Configuration:
- HIKMA_EMBEDDING_PROVIDER: Provider (python, openai). Default: python
- HIKMA_EMBEDDING_MODEL: Model for the Python provider
- HIKMA_EMBEDDING_OPENAI_API_URL: Server URL for OpenAI-compatible APIs
- HIKMA_EMBEDDING_OPENAI_API_KEY: API key (optional for local services)
- HIKMA_EMBEDDING_OPENAI_MODEL: Model name for the server provider
LLM Configuration:
- HIKMA_ENGINE_LLM_PROVIDER: Provider (python, openai). Default: python
- HIKMA_ENGINE_LLM_PYTHON_MODEL: Python model name
- HIKMA_ENGINE_LLM_OPENAI_API_URL: Server URL
- HIKMA_ENGINE_LLM_OPENAI_API_KEY: API key
- HIKMA_ENGINE_LLM_OPENAI_MODEL: Model name
- HIKMA_ENGINE_LLM_OPENAI_MAX_TOKENS: Max response tokens. Default: 400
- HIKMA_ENGINE_LLM_OPENAI_TEMPERATURE: Sampling temperature. Default: 0.6
Example for Ollama:
HIKMA_EMBEDDING_PROVIDER=openai
HIKMA_EMBEDDING_OPENAI_API_URL=http://localhost:11434
HIKMA_EMBEDDING_OPENAI_MODEL=mxbai-embed-large:latest
Example for LM Studio embeddings:
HIKMA_EMBEDDING_PROVIDER=openai
HIKMA_EMBEDDING_OPENAI_API_URL=http://localhost:1234
HIKMA_EMBEDDING_OPENAI_MODEL=text-embedding-mxbai-embed-large-v1
RAG Configuration
HIKMA_RAG_MODEL: The RAG model for code explanation. Default: Qwen/Qwen2.5-Coder-1.5B-Instruct.
LLM Provider Configuration
- HIKMA_ENGINE_LLM_PROVIDER: The LLM provider for code explanations. Options: python, openai. Default: python.
- HIKMA_ENGINE_LLM_TIMEOUT: Request timeout in milliseconds. Default: 300000.
- HIKMA_ENGINE_LLM_RETRY_ATTEMPTS: Number of retry attempts. Default: 3.
- HIKMA_ENGINE_LLM_RETRY_DELAY: Delay between retries in milliseconds. Default: 1000.
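For instance, to be more patient with a slow local model, a .env could relax the timeout and retry settings (a sketch; the values are illustrative, not recommendations):
HIKMA_ENGINE_LLM_PROVIDER=python
# 10 minutes instead of the 5-minute default
HIKMA_ENGINE_LLM_TIMEOUT=600000
HIKMA_ENGINE_LLM_RETRY_ATTEMPTS=5
HIKMA_ENGINE_LLM_RETRY_DELAY=2000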
Python Provider
When HIKMA_ENGINE_LLM_PROVIDER=python:
- HIKMA_ENGINE_LLM_PYTHON_MODEL: The model to use. Default: Qwen/Qwen2.5-Coder-1.5B-Instruct.
- HIKMA_ENGINE_LLM_PYTHON_MAX_RESULTS: Max results for the model. Default: 8.
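A matching .env fragment, parallel to the OpenAI examples below, might be (a sketch using the documented variables and defaults):
HIKMA_ENGINE_LLM_PROVIDER=python
HIKMA_ENGINE_LLM_PYTHON_MODEL=Qwen/Qwen2.5-Coder-1.5B-Instruct
HIKMA_ENGINE_LLM_PYTHON_MAX_RESULTS=8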
OpenAI Provider
When HIKMA_ENGINE_LLM_PROVIDER=openai (for the OpenAI API or compatible services like LM Studio/Ollama; this corresponds to --provider server in the CLI):
- HIKMA_ENGINE_LLM_OPENAI_API_URL: The API endpoint.
- HIKMA_ENGINE_LLM_OPENAI_API_KEY: Your API key.
- HIKMA_ENGINE_LLM_OPENAI_MODEL: The model name.
- HIKMA_ENGINE_LLM_OPENAI_MAX_TOKENS: (Optional) Max tokens for the response. Default: 400.
- HIKMA_ENGINE_LLM_OPENAI_TEMPERATURE: (Optional) Sampling temperature. Default: 0.6.
Example for OpenAI API:
HIKMA_ENGINE_LLM_PROVIDER=openai
HIKMA_ENGINE_LLM_OPENAI_API_URL=https://api.openai.com/v1/chat/completions
HIKMA_ENGINE_LLM_OPENAI_API_KEY=sk-your-openai-api-key-here
HIKMA_ENGINE_LLM_OPENAI_MODEL=gpt-4
Example for local services (LM Studio, Ollama):
HIKMA_ENGINE_LLM_PROVIDER=openai
HIKMA_ENGINE_LLM_OPENAI_API_URL=http://localhost:1234 # For LM Studio (base URL; endpoint inferred)
# HIKMA_ENGINE_LLM_OPENAI_API_URL=http://localhost:11434 # For Ollama (base URL; endpoint inferred)
HIKMA_ENGINE_LLM_OPENAI_API_KEY=not-needed-for-local
HIKMA_ENGINE_LLM_OPENAI_MODEL=your-local-model
Embedding Providers
Hikma Engine supports multiple embedding providers. The default is python, but server-based (OpenAI-compatible) is fully supported and recommended for npx/global usage.
| Provider | Description | Examples | Setup Required | Status |
|----------|-------------|----------|----------------|--------|
| openai (server) | OpenAI-compatible HTTP API for embeddings | Ollama (http://localhost:11434), LM Studio (http://localhost:1234) | Run server; optional API key | Supported |
| python | Python-based embeddings using local models | Hugging Face transformers via Python | Python 3.10+ and pip deps | Supported (default) |
| transformers | In-process JS embeddings via @xenova/transformers | Browser/Node, no server | None | Supported |
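The transformers provider is handy when neither Python nor a local server is available, since embeddings run in-process; a sketch using the documented default model and flags:
# No Python, no server: embeddings run inside Node via @xenova/transformers
npx hikma-engine embed --provider transformers --dir /path/to/project
npx hikma-engine search "error handling" --provider transformers --dir /path/to/project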
LLM Providers
Hikma Engine supports multiple LLM providers for generating code explanations:
| Provider | Description | Use Case | Setup Required | Status |
|----------|-------------|----------|----------------|--------|
| python | Local Python-based LLM using transformers | Privacy, offline usage, no API costs | Python + pip dependencies | Supported (default) |
| openai | OpenAI API or compatible services | High-quality responses, cloud-based | API key required | Supported |
Local Services Integration
You can use local AI services for both embeddings and LLM. Here are tested working configurations:
Explicit CLI Approach (Recommended)
Using Ollama:
# Start Ollama and pull models
ollama serve
ollama pull mxbai-embed-large:latest
ollama pull qwen2.5-coder:7b
# Use with explicit CLI flags
npm run embed -- --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
npm run search -- "query" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
npm run rag -- "question" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --llm-model qwen2.5-coder:7bUsing LM Studio:
# Start LM Studio on http://localhost:1234 and load models
# Use with explicit CLI flags
npm run embed -- --provider server --server-url http://localhost:1234 --embedding-model text-embedding-mxbai-embed-large-v1
npm run search -- "query" --provider server --server-url http://localhost:1234 --embedding-model text-embedding-mxbai-embed-large-v1
npm run rag -- "question" --provider server --server-url http://localhost:1234 --embedding-model text-embedding-mxbai-embed-large-v1 --llm-model openai/gpt-oss-20bLegacy Environment Variables (Optional)
Using LM Studio + Ollama:
# .env configuration
HIKMA_EMBEDDING_PROVIDER=openai
HIKMA_EMBEDDING_OPENAI_API_URL=http://localhost:11434
HIKMA_EMBEDDING_OPENAI_MODEL=mxbai-embed-large:latest
HIKMA_ENGINE_LLM_PROVIDER=openai
HIKMA_ENGINE_LLM_OPENAI_API_URL=http://localhost:1234/v1/chat/completions
HIKMA_ENGINE_LLM_OPENAI_API_KEY=not-needed-for-local
HIKMA_ENGINE_LLM_OPENAI_MODEL=openai/gpt-oss-20b
Using Only Ollama:
# .env configuration
HIKMA_EMBEDDING_PROVIDER=openai
HIKMA_EMBEDDING_OPENAI_API_URL=http://localhost:11434
HIKMA_EMBEDDING_OPENAI_MODEL=mxbai-embed-large:latest
HIKMA_ENGINE_LLM_PROVIDER=openai
HIKMA_ENGINE_LLM_OPENAI_API_URL=http://localhost:11434/v1/chat/completions
HIKMA_ENGINE_LLM_OPENAI_API_KEY=not-needed-for-local
HIKMA_ENGINE_LLM_OPENAI_MODEL=gpt-oss:20b
Model Requirements
For Ollama:
- Embedding models: mxbai-embed-large:latest
- LLM models: gpt-oss:20b, qwen2.5-coder:7b, or similar
- Install models: ollama pull mxbai-embed-large:latest && ollama pull gpt-oss:20b
For LM Studio:
- Embedding models: text-embedding-mxbai-embed-large-v1, text-embedding-nomic-embed-text-v1.5
- LLM models: openai/gpt-oss-20b, qwen/qwen3-coder-30b, or similar
- Load models through the LM Studio interface
Quick Start
Option 1: NPX (No Installation Required)
# Index your codebase with Python provider
npx hikma-engine embed --provider python --dir /path/to/your/project
# Search for code
npx hikma-engine search "authentication logic" --provider python --dir /path/to/your/project
# Get AI explanations
npx hikma-engine rag "how does authentication work?" --provider python --dir /path/to/your/projectOption 2: Local Installation
Install and setup:
npm install
npm run build        # Build the TypeScript code
npm rebuild          # Rebuild native dependencies if needed
npm run setup-python # For Python-based features (optional)
Index your codebase (explicit CLI - no .env needed):
# Using the Python provider (local models)
npm run embed -- --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
# OR using the server provider (Ollama/LM Studio)
npm run embed -- --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
Search and get explanations:
# Search with explicit provider
npm run search -- "authentication logic" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
# RAG with explicit provider
npm run rag -- "how does user authentication work?" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --llm-model "Qwen/Qwen2.5-Coder-1.5B-Instruct"
Important Notes
- Build Required: You must run npm run build after installation to compile the TypeScript code
- Native Dependencies: If you encounter SQLite errors, run npm rebuild to recompile native modules
- Provider Fallback: The system automatically falls back between providers if one fails
- Database Location: The SQLite database is created in the data/ directory (configurable with --db-path; see the sketch below)
- Explicit CLI: The --provider flag is now required for all commands - no more hidden .env dependencies
- Server Provider: When using --provider server, you must also specify --server-url
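As a sketch of overriding the database location (assuming --db-path accepts a file path; the flag is mentioned above but not otherwise documented here, and the path is hypothetical):
# Keep the index outside the working tree
npm run embed -- --provider python --db-path /tmp/hikma/my-project.db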
Testing Your Setup
To verify everything is working correctly with the explicit CLI approach:
# 1. Test embedding (indexing) with Python provider
npm run embed -- --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
# Should show: "✅ Embedding completed successfully!"
# 2. Test search functionality
npm run search -- "CLI commands" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --limit 5
# Should return relevant code snippets with similarity scores
# 3. Test RAG (AI explanation)
npm run rag -- "How do the CLI commands work?" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --llm-model "Qwen/Qwen2.5-Coder-1.5B-Instruct"
# Should provide an AI-generated explanation based on your code
# 4. Test with npx (global usage)
npx hikma-engine search "database" --provider python --dir . --limit 3
# Should work without local installation
# 5. Test server provider (if you have Ollama running)
npm run embed -- --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
npm run search -- "authentication" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latestExpected Results:
- Embedding: Creates a database file in the data/ directory with indexed code
- Search: Returns a table with Node ID, Type, File Path, Similarity %, and Source Text Preview
- RAG: Provides a detailed AI explanation with code context
- All commands should complete without errors and show cleanup logs
Troubleshooting
Common Issues
SQLite Module Version Error:
# Error: The module was compiled against a different Node.js version
npm rebuild
OpenAI API Key Error:
# Error: Incorrect API key provided
# Solution 1: Use explicit CLI flags (recommended)
npm run rag -- "question" --provider openai --openai-api-key sk-your-actual-api-key --embedding-model text-embedding-3-small --llm-model gpt-4o-mini
# Solution 2: Update your .env file (legacy approach)
HIKMA_EMBEDDING_OPENAI_API_KEY=sk-your-actual-api-key
HIKMA_ENGINE_LLM_OPENAI_API_KEY=sk-your-actual-api-key
Local Service Connection Error:
# Check if Ollama is running and accessible
curl -s http://localhost:11434/api/tags
ollama list # List available models
# If Ollama is not running:
ollama serve
# Test with explicit CLI flags:
npm run search -- "test" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest
# Check if LM Studio is running and accessible
curl -s http://localhost:1234/v1/models
# Test with explicit CLI flags:
npm run search -- "test" --provider server --server-url http://localhost:1234 --embedding-model text-embedding-mxbai-embed-large-v1
# Ensure LM Studio is running on port 1234 with a model loaded
"No healthy providers available" Error:
# This usually means the LLM service is not accessible or the model is not available
# Solution 1: Use explicit CLI flags to test (recommended)
# Test different providers:
npm run rag -- "test" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --llm-model "Qwen/Qwen2.5-Coder-1.5B-Instruct"
npm run rag -- "test" --provider server --server-url http://localhost:11434 --embedding-model mxbai-embed-large:latest --llm-model qwen2.5-coder:7b
# Solution 2: Check your .env configuration (legacy approach):
# 1. Verify the API URL is correct
# 2. Ensure the model name matches exactly what's available
# 3. Test the service manually:
curl -s http://localhost:1234/v1/models # For LM Studio
ollama list # For Ollama
# 4. Try switching between services if one fails
Model Runner Stopped Error (Ollama):
# If you get "model runner has unexpectedly stopped"
# This usually indicates resource limitations or model issues
# Solution 1: Try explicit CLI with smaller model or different provider (recommended)
npm run rag -- "question" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1" --llm-model "Qwen/Qwen2.5-Coder-1.5B-Instruct"
# Or switch to LM Studio:
npm run rag -- "question" --provider server --server-url http://localhost:1234 --embedding-model text-embedding-mxbai-embed-large-v1 --llm-model openai/gpt-oss-20b
# Solution 2: Update .env file (legacy approach)
HIKMA_ENGINE_LLM_OPENAI_API_URL=http://localhost:1234/v1/chat/completions
HIKMA_ENGINE_LLM_OPENAI_MODEL=openai/gpt-oss-20b
Python Dependencies Missing:
# Install Python dependencies for local LLM/embedding
npm run setup-python
CLI Command Not Found:
# Build the project first
npm run build
# Then use with explicit provider flags
npm run embed -- --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
# Or use npx for global access (no installation required)
npx hikma-engine embed --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"
npx hikma-engine search "query" --provider python --embedding-model "mixedbread-ai/mxbai-embed-large-v1"License
MIT License - see LICENSE file for details.
