datomsdb
v0.3.0
A distributed encrypted database built on datoms
DatomsDB
DatomsDB is a lightweight, verifiable, distributed, and encrypted database system built upon the datoms append-only log structure. It offers a native JavaScript API, making it ideal for applications demanding high data integrity, auditability, and optional end-to-end encryption.
DatomsDB provides a robust foundation for modern data challenges, including:
- AI + Data: Ensuring the provenance and integrity of data used for training and operating AI models.
- Agent Systems: Providing verifiable data trails for autonomous agent actions and interactions.
- Auditable Systems: Creating immutable logs for compliance, security, and debugging.
- Decentralized Applications: Offering a flexible, locally-managed database with inherent verifiability.
✨ Features
- 🔐 Verifiable & Immutable: Leverages the datoms append-only log for inherent data integrity and history tracking.
- 🚀 Native JavaScript API: Simple and direct interaction using familiar JavaScript patterns (async/await).
- 🧱 Schema Enforcement: Define table structures and enforce data types.
- 🛡️ Optional End-to-End Encryption: Built-in support for RSA/AES hybrid encryption to secure data at rest.
- 🔄 Transaction Support: Basic atomic operations for data consistency.
- ⚡ Indexing: Create and manage indexes for faster query performance.
- ⚙️ Metadata Management: Automatic handling of database and table metadata.
- 🔧 Tooling: Includes utilities for data mocking and browsing.
📚 Table of Contents
- Installation
- 🚀 Quick Start
- 💡 Core Concepts
- 📖 Usage Examples
- 🔐 Encryption
- 🛠️ Tools
- 💾 Data Storage Structure
- 🔌 API Reference (Summary)
- 🤝 Contributing
- 📄 License
Installation
npm install datomsdb
🚀 Quick Start
Get up and running with DatomsDB in a few simple steps:
const DatomsDB = require('datomsdb');
async function run() {
// 1. Create or open a database instance
const db = new DatomsDB({
storagePath: './my-data', // Path to store database files
debug: false // Enable verbose logging if needed
});
// 2. Initialize the database
await db.init();
console.log('Database initialized.');
try {
// 3. Create a table (if it doesn't exist)
await db.client.createDatoms('users', {
schema: {
id: 'number',
username: 'string',
email: 'string',
isActive: 'boolean'
}
});
console.log('Table "users" created or already exists.');
// 4. Append some data
const userId = await db.client.append('users', {
id: 1,
username: 'alice',
email: '[email protected]',
isActive: true
});
console.log(`Appended user with internal ID: ${userId}`);
// 5. Query the data
const activeUsers = await db.client.queryDatoms('users', { isActive: true });
console.log('Active users:');
console.table(activeUsers);
} catch (error) {
console.error('An error occurred:', error);
} finally {
// 6. Close the database connection
await db.close();
console.log('Database closed.');
}
}
run();
💡 Core Concepts
What are Datoms?
DatomsDB is built on datoms, which are immutable, timestamped facts representing additions or changes to your data. Think of your database not as mutable tables, but as an append-only log of facts like [entity_id, attribute, value, transaction_id, added?]. This structure provides a full history of changes automatically.
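To make the "log of facts" idea concrete, here is a minimal in-memory sketch. It is not DatomsDB's implementation: the helper names (`assertFact`, `retractFact`, `entityAsOf`) and the replay rules are assumptions chosen for illustration. It only shows how an entity's state, at present or at an earlier transaction, can be derived by replaying tuples of the form [entity_id, attribute, value, transaction_id, added?].

```javascript
// Each datom is [entityId, attribute, value, transactionId, added].
// The log is only ever appended to; existing entries are never changed.
const log = [];

// Hypothetical helper: record that a fact became true.
function assertFact(entityId, attribute, value, txId) {
  log.push(Object.freeze([entityId, attribute, value, txId, true]));
}

// Hypothetical helper: record that a fact stopped being true.
function retractFact(entityId, attribute, value, txId) {
  log.push(Object.freeze([entityId, attribute, value, txId, false]));
}

// Replay the log up to (and including) maxTxId to rebuild one entity.
function entityAsOf(entityId, maxTxId = Infinity) {
  const state = {};
  for (const [e, a, v, tx, added] of log) {
    if (e !== entityId || tx > maxTxId) continue;
    if (added) {
      state[a] = v;
    } else if (state[a] === v) {
      delete state[a];
    }
  }
  return state;
}

assertFact(1, 'username', 'alice', 100);
assertFact(1, 'isActive', true, 100);
retractFact(1, 'isActive', true, 101);
assertFact(1, 'isActive', false, 101);

const current = entityAsOf(1);       // state after all transactions
const original = entityAsOf(1, 100); // state as of transaction 100
console.log(current, original);
```

Because nothing is overwritten, the earlier state stays recoverable from the same log that produces the current one.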
Verifiability
Because the underlying storage is an append-only log (like a Hypercore feed), each piece of data is cryptographically linked to the previous one. This creates a tamper-evident structure. You can verify the integrity of the entire dataset or specific parts of it, ensuring data hasn't been maliciously altered.
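The linking described above can be illustrated with a simplified hash chain built on Node's `crypto` module. This is a sketch under assumptions, not how DatomsDB or Hypercore stores data: the storage layout later in this README lists a Merkle tree and signatures, whereas the hypothetical `appendEntry` and `verifyChain` helpers below use a plain SHA-256 chain to show why editing an earlier entry becomes detectable.

```javascript
const crypto = require('crypto');

// Placeholder "previous hash" for the first entry in the chain.
const GENESIS = '0'.repeat(64);

// Hash an entry together with the hash of the entry before it.
function hashEntry(prevHash, payload) {
  return crypto
    .createHash('sha256')
    .update(prevHash)
    .update(JSON.stringify(payload))
    .digest('hex');
}

// Append a payload, linking it to the current tail of the chain.
function appendEntry(chain, payload) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  chain.push({ payload, prevHash, hash: hashEntry(prevHash, payload) });
}

// Recompute every link; any edited payload or broken link fails the check.
function verifyChain(chain) {
  let prevHash = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== hashEntry(prevHash, entry.payload)) return false;
    prevHash = entry.hash;
  }
  return true;
}

const chain = [];
appendEntry(chain, { id: 1, price: 29.99 });
appendEntry(chain, { id: 1, price: 25.99 });

const intactBefore = verifyChain(chain);
chain[0].payload.price = 0.01; // simulate tampering with history
const intactAfter = verifyChain(chain);
console.log(intactBefore, intactAfter);
```

A bare hash chain only detects edits if the latest hash is itself trusted, which is why real systems also sign it.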
📖 Usage Examples
Initializing the Database
const DatomsDB = require('datomsdb');
const path = require('path');
const db = new DatomsDB({
storagePath: path.join(__dirname, 'data')
});
await db.init(); // Call from inside an async function, as in the Quick Start
Creating a Table
Define the schema for your data.
await db.client.createDatoms('products', {
schema: {
productId: 'string', // Changed to string for potential UUIDs
name: 'string',
price: 'number',
category: 'string',
inStock: 'boolean'
}
});
Appending Data (Create)
Add new records to a table. The append method returns the internal _id of the new record.
const productRecordId = await db.client.append('products', {
productId: 'uuid-12345',
name: 'Super Widget',
price: 29.99,
category: 'Widgets',
inStock: true
});
console.log('Product appended with record ID:', productRecordId);
Querying Data (Read)
Retrieve records based on filter criteria. An empty filter {} retrieves all records.
// Find all products in the 'Widgets' category
const widgets = await db.client.queryDatoms('products', { category: 'Widgets' });
console.table(widgets);
// Find a specific product by its ID
const specificProduct = await db.client.queryDatoms('products', { productId: 'uuid-12345' });
console.table(specificProduct);
Updating Data
DatomsDB uses an append-only model. Updates create a new version of the record, preserving the old one. Use queryDatoms to find the record's internal _id first.
const recordsToUpdate = await db.client.queryDatoms('products', { productId: 'uuid-12345' });
if (recordsToUpdate.length > 0) {
const currentRecord = recordsToUpdate[0]; // Assuming productId is unique
const updatedRecordId = await db.client.appendNewVersion('products', currentRecord._id, {
...currentRecord, // Spread existing data
price: 25.99 // Apply the change
});
console.log('Product updated. New version record ID:', updatedRecordId);
}
Deleting Data (Mark as Deleted)
Deletions are also append-only operations. Records are marked as deleted but remain in the history.
const recordsToDelete = await db.client.queryDatoms('products', { productId: 'uuid-12345' });
if (recordsToDelete.length > 0) {
const deletedRecordId = await db.client.markAsDeleted('products', recordsToDelete[0]._id);
console.log('Product marked as deleted. Deletion record ID:', deletedRecordId);
}
Note: Standard queryDatoms calls typically exclude records marked as deleted.
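The following sketch shows one way an append-only log can present a "current" view while keeping every old version and deletion marker. It is an in-memory model for illustration only: the entry shape (`previousId`, `deleted`) and the `currentView` helper are assumptions, and the functions merely borrow the names of the client methods above without reproducing their behavior.

```javascript
let nextId = 1;
const log = []; // entries are appended and never modified

// Stand-in for append: add a brand-new record.
function append(data) {
  const entry = { _id: nextId++, previousId: null, deleted: false, data };
  log.push(entry);
  return entry._id;
}

// Stand-in for appendNewVersion: add a record that supersedes an older one.
function appendNewVersion(oldId, data) {
  const entry = { _id: nextId++, previousId: oldId, deleted: false, data };
  log.push(entry);
  return entry._id;
}

// Stand-in for markAsDeleted: add a deletion marker for a record.
function markAsDeleted(id) {
  const entry = { _id: nextId++, previousId: id, deleted: true, data: null };
  log.push(entry);
  return entry._id;
}

// Hypothetical helper: keep entries that nothing supersedes and that are
// not deletion markers.
function currentView() {
  const superseded = new Set(
    log.map((entry) => entry.previousId).filter((id) => id !== null)
  );
  return log
    .filter((entry) => !superseded.has(entry._id) && !entry.deleted)
    .map((entry) => entry.data);
}

const widgetV1 = append({ name: 'Super Widget', price: 29.99 });
const widgetV2 = appendNewVersion(widgetV1, { name: 'Super Widget', price: 25.99 });
append({ name: 'Gadget', price: 5 });

const beforeDelete = currentView();
markAsDeleted(widgetV2);
const afterDelete = currentView();
console.log(beforeDelete, afterDelete, log.length);
```

The log keeps all four entries, so the price history and the deletion itself remain auditable even though the current view no longer shows the widget.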
Transactions
Perform multiple operations atomically.
const txId = await db.client.transactionManager.beginTransaction();
try {
await db.client.append('users', { id: 2, username: 'bob', email: '[email protected]', isActive: true });
// ... other operations ...
await db.client.transactionManager.commitTransaction(txId);
console.log('Transaction committed');
} catch (err) {
await db.client.transactionManager.rollbackTransaction(txId);
console.error('Transaction rolled back:', err);
}
Indexing
Create indexes to speed up queries on specific fields.
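Conceptually, an index on a field is a lookup table from field values to the records that hold them, so a query can jump to matches instead of scanning every record. The sketch below illustrates that idea with a plain `Map`. The `buildIndex` helper and the sample records are hypothetical and say nothing about how DatomsDB's index manager is implemented.

```javascript
// Build a value -> records lookup for one field.
function buildIndex(records, field) {
  const index = new Map();
  for (const record of records) {
    const key = record[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(record);
  }
  return index;
}

// Sample data for illustration only.
const users = [
  { id: 1, username: 'alice', email: 'alice@example.com' },
  { id: 2, username: 'bob', email: 'bob@example.com' },
  { id: 3, username: 'carol', email: 'carol@example.com' },
];

const emailIndex = buildIndex(users, 'email');

// One Map lookup instead of a scan over all users.
const hits = emailIndex.get('bob@example.com') || [];
const misses = emailIndex.get('nobody@example.com') || [];
console.log(hits, misses);
```

The trade-off is the usual one: lookups get faster, while every append has to update the index as well.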
// Create an index on the 'email' field of the 'users' table
await db.client.indexManager.createIndex('users', 'email', 'idx_users_email');
console.log('Index created on users.email');
// Query using the index (example, specific method might vary)
// Note: Querying via index often uses specific IndexManager methods
const usersByEmail = await db.client.indexManager.queryByIndex(
'users',
'idx_users_email', // Index name
'[email protected]' // Value to search for
);
console.log('Users found by email index:', usersByEmail);
🔐 Encryption
DatomsDB includes DatomsDBEncryption for robust, end-to-end encryption of your data using a hybrid RSA + AES approach.
Overview
- Hybrid Encryption: Data is encrypted with a unique AES key per record (or batch). This AES key is then encrypted using an RSA public key specific to the datom collection.
- Key Management: Each encrypted datom collection gets its own RSA key pair (key for public, secret_key for private).
- Signatures: Data is automatically signed using the private key upon insertion, and signatures can be verified using the public key to ensure integrity.
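The scheme outlined above can be sketched with Node's built-in `crypto` module. The specifics here are assumptions, since this README does not state them: AES-256-GCM as the cipher mode, RSA-OAEP with SHA-256 for wrapping the AES key, SHA-256 signatures over the ciphertext, and the `encryptRecord` / `decryptRecord` helper names. Only the overall shape (a fresh AES key per record, wrapped with the RSA public key, signed with the private key) comes from the description.

```javascript
const crypto = require('crypto');

// One RSA key pair per collection (2048 bits, matching the documented default).
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});

const oaep = { padding: crypto.constants.RSA_PKCS1_OAEP_PADDING, oaepHash: 'sha256' };

function encryptRecord(record) {
  const aesKey = crypto.randomBytes(32); // fresh 256-bit key for this record
  const iv = crypto.randomBytes(12);     // 96-bit nonce for GCM
  const cipher = crypto.createCipheriv('aes-256-gcm', aesKey, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(record), 'utf8'),
    cipher.final(),
  ]);
  const authTag = cipher.getAuthTag();
  // Wrap the AES key with the RSA public key.
  const wrappedKey = crypto.publicEncrypt({ key: publicKey, ...oaep }, aesKey);
  // Sign the ciphertext with the RSA private key.
  const signature = crypto.sign('sha256', ciphertext, privateKey);
  return { iv, ciphertext, authTag, wrappedKey, signature };
}

function decryptRecord(envelope) {
  // Check the signature before touching the private key for decryption.
  if (!crypto.verify('sha256', envelope.ciphertext, publicKey, envelope.signature)) {
    throw new Error('Signature check failed');
  }
  const aesKey = crypto.privateDecrypt({ key: privateKey, ...oaep }, envelope.wrappedKey);
  const decipher = crypto.createDecipheriv('aes-256-gcm', aesKey, envelope.iv);
  decipher.setAuthTag(envelope.authTag);
  const plaintext = Buffer.concat([
    decipher.update(envelope.ciphertext),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString('utf8'));
}

const envelope = encryptRecord({ secretId: 's1', value: 'This is highly sensitive' });
const roundTrip = decryptRecord(envelope);
console.log(roundTrip.value);
```

Encrypting the bulk data with AES and only the small key with RSA keeps per-record cost low, which is the usual reason for a hybrid design.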
Initializing an Encrypted Database
const DatomsDBEncryption = require('datomsdb/DatomsDBEncryption'); // Adjust path if needed
const encryptedDB = new DatomsDBEncryption({
storagePath: './encrypted-data', // Separate directory recommended
encryption: {
enabled: true, // Default is true
// keySize: 2048, // Default RSA key size
// aesKeySize: 256 // Default AES key size
}
});
await encryptedDB.init();
console.log('Encrypted database initialized.');
// Key pairs are generated automatically if they don't exist
Working with Encrypted Data
The API mirrors the standard client but uses encryption/decryption automatically.
// 1. Create an encrypted table
await encryptedDB.createEncryptedDatoms('secrets', {
schema: { secretId: 'string', value: 'string' }
});
// 2. Append encrypted data
await encryptedDB.appendEncrypted('secrets', {
secretId: 's1',
value: 'This is highly sensitive'
});
// 3. Query and decrypt data
const secrets = await encryptedDB.queryEncrypted('secrets', { secretId: 's1' });
console.log('Decrypted secret:', secrets[0].value); // Output: This is highly sensitive
// 4. Verify data integrity (optional)
const isValid = await encryptedDB.verifySignature('secrets');
console.log('Data signature valid:', isValid);
await encryptedDB.close();
Key Management
Manage the RSA keys associated with your encrypted datoms.
// Get keys (use with caution - exposes private key)
// const keys = await encryptedDB.getKeys('secrets');
// Export keys for backup
// const exported = await encryptedDB.exportKeys('secrets');
// fs.writeFileSync('secrets_keys.json', JSON.stringify(exported));
// Import keys (e.g., for recovery)
// const imported = JSON.parse(fs.readFileSync('secrets_keys.json'));
// await encryptedDB.importKeys('secrets', imported);
// Rotate keys (generates new keys, re-encrypts data - can be slow)
// await encryptedDB.rotateKeys('secrets');
// console.log('Keys rotated for secrets datom.');
Example: Storing Encrypted Images
DatomsDBEncryption can handle binary data (like images) by encoding it (e.g., Base64) before encryption.
const fs = require('fs');
const path = require('path');
const crypto = require('crypto');
// (Inside an async function initialized with encryptedDB)
const imagePath = './path/to/your/image.jpg';
const imageBuffer = fs.readFileSync(imagePath);
const imageBase64 = imageBuffer.toString('base64');
const imageId = crypto.randomUUID();
await encryptedDB.createEncryptedDatoms('secure_images', {
schema: { id: 'string', name: 'string', type: 'string', data: 'string' } // data stores base64
});
await encryptedDB.appendEncrypted('secure_images', {
id: imageId,
name: path.basename(imagePath),
type: 'image/jpeg', // Adjust MIME type
data: imageBase64
});
console.log(`Image ${imageId} stored securely.`);
// To retrieve: Query, get the base64 data, decode it
const results = await encryptedDB.queryEncrypted('secure_images', { id: imageId });
if (results.length > 0) {
const recoveredBuffer = Buffer.from(results[0].data, 'base64');
fs.writeFileSync('./recovered_image.jpg', recoveredBuffer);
console.log('Image recovered successfully.');
}
(For the full, detailed example script demonstrating verification and key rotation, see examples/encrypted-image-datoms.js in the repository.)
Security Considerations
- Private Key Security: Protect the secret_key file diligently. Anyone with access can decrypt data. Consider file system permissions or external key management systems (HSMs) for production.
- Key Rotation: Regularly rotate keys (rotateKeys) to mitigate the risk of compromised keys. Be aware this is computationally intensive as it re-encrypts all data.
- Backups: Ensure your backup strategy includes both the encrypted data directories and the corresponding exported keys. Test your recovery process.
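As one illustration of the file-permission suggestion, the sketch below restricts a private key file to its owner on POSIX systems. It writes a placeholder file to a temporary directory rather than touching a real DatomsDB storage path, so the directory and file contents are assumptions. Whether the library sets permissions itself is not stated in this README.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for a datom directory; a real secret_key lives under your storagePath.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'datoms-keys-'));
const secretKeyPath = path.join(dir, 'secret_key');
fs.writeFileSync(secretKeyPath, 'PLACEHOLDER PEM CONTENT');

// Owner may read and write; group and others get no access.
fs.chmodSync(secretKeyPath, 0o600);

// Read back the permission bits to confirm the change.
const mode = fs.statSync(secretKeyPath).mode & 0o777;
console.log(mode.toString(8));
```

On Windows, `chmod` only toggles the read-only flag, so access control lists or an external key store would be needed there.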
🛠️ Tools
DatomsDB comes with helpful command-line or scriptable tools.
Data Mocking Utility
Generate sample data for testing and development. Configure templates defining schemas, counts, and generator functions.
(See examples/mock-datomsDB-with-package.js for the full script and usage.)
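As a rough picture of what a template with a schema, a count, and a generator function might look like, here is a self-contained sketch. The template shape and the `generateMockRecords` helper are hypothetical and may differ from the bundled example script. The type check mirrors the 'string' / 'number' / 'boolean' schema values used elsewhere in this README.

```javascript
// Hypothetical template: table name, record count, schema, generator.
const userTemplate = {
  name: 'users',
  count: 3,
  schema: { id: 'number', username: 'string', isActive: 'boolean' },
  generate: (i) => ({
    id: i + 1,
    username: `user${i + 1}`,
    isActive: i % 2 === 0,
  }),
};

// Produce `count` records and check each field against the schema.
function generateMockRecords(template) {
  const records = [];
  for (let i = 0; i < template.count; i++) {
    const record = template.generate(i);
    for (const [field, type] of Object.entries(template.schema)) {
      if (typeof record[field] !== type) {
        throw new TypeError(
          `${template.name}.${field} should be ${type}, got ${typeof record[field]}`
        );
      }
    }
    records.push(record);
  }
  return records;
}

const mockUsers = generateMockRecords(userTemplate);
console.log(mockUsers);
```

Records produced this way could then be passed one by one to `db.client.append` for the matching table.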
Data Browser Utility
Inspect the tables (datoms) and view sample data within your DatomsDB instance.
(See examples/browse-datoms.js for the full script and usage.)
💾 Data Storage Structure
Standard Structure
DatomsDB organizes data on the file system:
your-storage-path/
├── datoms/ # Contains individual datom feeds (tables)
│ ├── users_timestamp/ # Data feed for the 'users' table
│ │ ├── data # Main data blocks
│ │ ├── signatures # Signatures for blocks
│ │ ├── tree # Merkle tree structure
│ │ └── ... # Other Hypercore/datoms files
│ └── products_timestamp/ # Data feed for 'products'
│ └── ...
├── datoms-registry.json # Index of all datoms, schemas, paths
├── hashKey/ # Symlinks mapping content hashes to datoms paths (optional)
├── consensus/ # Data related to consensus (if used)
└── transactions/ # Transaction logs
Encrypted Structure
Encrypted datoms have additional files for keys and store encrypted data:
your-encrypted-storage-path/
├── datoms/
│ ├── secrets_timestamp/
│ │ ├── key # RSA Public Key (PEM format)
│ │ ├── secret_key # RSA Private Key (PEM format) - PROTECT THIS!
│ │ ├── data # Encrypted data blocks (AES encrypted)
│ │ ├── signatures # Signatures generated with the private key
│ │ └── ... # Other Hypercore/datoms files (bitfield, tree, etc.)
│ └── ...
├── datoms-registry.json # Registry marks datoms as encrypted
└── ... # Other directories (consensus, transactions)
🔌 API Reference (Summary)
(This is a high-level overview. Refer to the source code or generated documentation for full details.)
DatomsDB Class
The main entry point for interacting with a database instance.
const DatomsDB = require('datomsdb');
const db = new DatomsDB(options);
// Key Methods:
await db.init(); // Initialize connection and load datoms
await db.close(); // Close connection
db.client; // Access the DatomsDBClient instance
DatomsDBClient
Handles operations on datoms (tables). Accessed via db.client.
// Key Methods:
await db.client.createDatoms(name, options); // Create a table with schema
await db.client.append(name, data); // Add a new record
await db.client.appendNewVersion(name, oldId, data); // Create an updated version
await db.client.markAsDeleted(name, id); // Mark a record as deleted
const results = await db.client.queryDatoms(name, filter); // Query records
db.client.transactionManager; // Access transaction methods
db.client.indexManager; // Access index methods
db.client.datomsEngine; // Lower-level access (use with caution)
DatomsDBEncryption Class
Handles encrypted database instances and operations.
const DatomsDBEncryption = require('datomsdb/DatomsDBEncryption');
const encryptedDB = new DatomsDBEncryption(options);
// Key Methods (Encryption specific):
await encryptedDB.init();
await encryptedDB.close();
await encryptedDB.createEncryptedDatoms(name, options); // Create encrypted table
await encryptedDB.appendEncrypted(name, data); // Add encrypted record
const results = await encryptedDB.queryEncrypted(name, options); // Query and decrypt
await encryptedDB.rotateKeys(name); // Rotate encryption keys
const isValid = await encryptedDB.verifySignature(name); // Verify data integrity
// ... other methods for key import/export, direct encrypt/decrypt/sign/verify
🤝 Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature/your-feature-name).
- Make your changes.
- Add tests for your changes.
- Ensure all tests pass (npm test).
- Commit your changes (git commit -am 'Add some feature').
- Push to the branch (git push origin feature/your-feature-name).
- Open a Pull Request.
Please report bugs or suggest features using the GitHub Issues tab.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
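Troubleshooting
Common Issues
sodium-native Build Errors
If you run into build or load errors related to sodium-native:
# 1. Clean and reinstall dependencies
rm -rf node_modules package-lock.json
npm install
# 2. If the problem persists, try rebuilding the native module
npm rebuild sodium-native
# 3. Make sure the required build tools are installed
# Ubuntu/Debian:
sudo apt-get install build-essential python3
# macOS:
xcode-select --install
# Windows:
npm install --global windows-build-tools
Permission Errors
Make sure the data directory is writable:
chmod 755 ./data
"Cannot find addon" Error
This is usually caused by a sodium-native version conflict. The project resolves this through the overrides field, but if you still hit the problem:
# Inspect the dependency tree
npm ls sodium-native
# All versions should resolve to 5.0.x and be marked "deduped" or "overridden"
Performance Optimization Tips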
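Handling Large Datasets
- Use streaming APIs to process large files
- Implement paginated queries to avoid running out of memory
- Consider creating indexes on frequently queried fields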
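One way to approach pagination is an async generator that requests one page at a time, so only a single page is held in memory. The sketch below is generic: `paginate` and `fetchPage(offset, limit)` are hypothetical names, and the in-memory array stands in for whatever page-wise data source is available. This README does not document limit or offset options for `queryDatoms`, so none are assumed here.

```javascript
// Yield pages of up to `pageSize` records until the source runs dry.
async function* paginate(fetchPage, pageSize) {
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, pageSize);
    if (page.length === 0) return;
    yield page;
    if (page.length < pageSize) return; // a short page means we reached the end
    offset += pageSize;
  }
}

// Stand-in data source: seven records served in slices.
const allRecords = Array.from({ length: 7 }, (_, i) => ({ id: i + 1 }));
const fetchPage = async (offset, limit) => allRecords.slice(offset, offset + limit);

// Process page by page; only the current page is referenced at any time.
async function collectPageSizes() {
  const sizes = [];
  for await (const page of paginate(fetchPage, 3)) {
    sizes.push(page.length);
  }
  return sizes;
}

collectPageSizes().then((sizes) => console.log(sizes));
```

The short-page check saves one extra round trip when the final page is only partly full.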
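Memory Usage
- Regularly close database connections that are no longer in use
- Monitor memory usage, especially when processing large amounts of data
// Example: closing the database properly
async function cleanup() {
await db.close()
console.log('Database closed properly')
}
process.on('SIGINT', cleanup)
process.on('SIGTERM', cleanup)
System Requirements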
Build Environment
- Node.js: >= 14.0.0
- Python: 3.x (used to compile native modules)
- C++ compiler: GCC/Clang (Linux/macOS) or MSVC (Windows)
Platform Support
- ✅ Linux (x64, ARM64)
- ✅ macOS (x64, ARM64)
- ✅ Windows (x64)
Memory
- Minimum: 512MB RAM
- Recommended: 2GB+ RAM (for large dataset processing)
