rdkit-agent
v0.1.1
Agent-first cheminformatics CLI using Node.js + RDKit WASM
Agent-first cheminformatics CLI powered by RDKit WASM. Validates, converts, and analyzes chemical notation (SMILES, SMIRKS, InChI) with structured JSON output. Works as a CLI, Node.js library, and MCP server.
Installation
npm install -g rdkit-agent
Requires Node.js ≥ 16. No native build steps: RDKit runs as WebAssembly.
Quick Start
# Validate a SMILES string
rdkit-agent check --smiles "c1ccccc1"
# Compute molecular descriptors
rdkit-agent descriptors --smiles "CCO"
# Convert SMILES to InChI
rdkit-agent convert --from smiles --to inchi --input "CCO"
# Find similar molecules
rdkit-agent similarity --query "c1ccccc1" --targets "Cc1ccccc1,CCO,c1ccc2ccccc2c1" --threshold 0.5
Output is JSON when stdout is not a terminal (piped or redirected). Pass --output json to force it.
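The output rule above can be approximated in a few lines of Node.js. This is a minimal sketch, not the package's actual implementation: `chooseOutputFormat` and the `'text'` label for the human-readable mode are hypothetical names. `process.stdout.isTTY` is the standard Node.js property that is set only when stdout is a terminal.

```javascript
// Sketch only: chooseOutputFormat and the 'text' label are hypothetical,
// not part of rdkit-agent's documented API.
function chooseOutputFormat(outputFlag, isTTY) {
  // An explicit --output json always wins.
  if (outputFlag === 'json') return 'json';
  // Piped or redirected stdout (not a terminal) defaults to JSON.
  return isTTY ? 'text' : 'json';
}

// process.stdout.isTTY is undefined when stdout is piped or redirected,
// so coerce it to a boolean before passing it in.
const format = chooseOutputFormat(undefined, Boolean(process.stdout.isTTY));
console.log(format);
```

Passing `isTTY` as a parameter rather than reading it inside the function keeps the rule easy to test without a real terminal.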
Commands
| Command | Description |
|---------|-------------|
| check | Pre-flight validation for SMILES, SMIRKS, and reaction balance |
| repair-smiles | Deterministically repair/reconstruct malformed SMILES into valid canonical guesses |
| convert | Convert between SMILES, InChI, InChIKey, MOL, SDF |
| descriptors | Compute MW, logP, TPSA, HBD, HBA, rotatable bonds, rings |
| balance | Check atom balance for reactions |
| fg | Detect functional groups (tiered consuming SMARTS catalog) |
| subsearch | SMARTS substructure search |
| fingerprint | Generate Morgan or topological fingerprints |
| similarity | Tanimoto similarity search |
| scaffold | Extract Murcko scaffold |
| filter | Filter molecules by descriptor ranges (Lipinski Ro5, etc.) |
| draw | Render molecule to SVG/PNG with optional atom/bond highlighting |
| stats | Dataset statistics across descriptors |
| edit | Molecular transformations (neutralize, sanitize, add-h, etc.) |
| rings | Ring analysis (count, aromaticity, spiro atoms) |
| react | Apply a reaction SMIRKS to reactant SMILES → product SMILES |
| stereo | Stereocentre analysis (tetrahedral + E/Z, CIP codes, specified vs unspecified) |
| tautomers | Enumerate tautomers (see WASM Limitations) |
| atom-map | Atom mapping: add / remove / check / list sub-commands |
| schema | Inspect JSON schemas for any command |
| mcp | Start MCP stdio server |
| version | Show version info |
Common flags
--json '{"smiles":"CCO"}' # Pass arguments as JSON object
--json - # Read JSON from stdin
--fields "MW,logP" # Limit output to specific fields
--output json # Force JSON output
check
rdkit-agent check --smiles "CCO"
rdkit-agent check --smiles "H2O" # → corrects alias to "O"
rdkit-agent check --smirks "[C:1][OH]>>[C:1]=O"
rdkit-agent check --reactants "CC,OO" --products "CCO,O"
Output keys: overall_pass, summary, checks, failed_checks, fix_suggestions, corrected_values
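A caller that consumes this output typically wants to stop early when validation fails. The sketch below shows one way to do that, assuming only that `overall_pass` is a boolean; `assertCheckPassed` is a hypothetical helper, the sample objects are hand-written rather than real CLI output, and `fix_suggestions` is passed through without assuming its shape.

```javascript
// Sketch only: assertCheckPassed is a hypothetical helper. It relies on the
// documented overall_pass key and passes fix_suggestions through untouched.
function assertCheckPassed(result) {
  if (result.overall_pass === true) return result;
  const err = new Error('rdkit-agent check failed');
  err.fixSuggestions = result.fix_suggestions; // shape not assumed here
  throw err;
}

// Hand-written sample objects for illustration, not real CLI output.
const passing = { overall_pass: true };
const failing = { overall_pass: false, fix_suggestions: ['example suggestion'] };

assertCheckPassed(passing); // returns the result unchanged
try {
  assertCheckPassed(failing);
} catch (err) {
  // Surface the suggestions to the caller (or the agent) before retrying.
  console.error(err.message, err.fixSuggestions);
}
```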
repair-smiles
rdkit-agent repair-smiles --input "C1CC" # ring-closure repair
rdkit-agent repair-smiles --input "H2O" # alias/formula repair
rdkit-agent repair-smiles --json '{"molecules":["C1CC","Na+"]}'
Output keys: success, canonical_smiles, strategy, confidence, intent, attempts
descriptors
rdkit-agent descriptors --smiles "CCO"
rdkit-agent descriptors --json '{"molecules":["CCO","c1ccccc1"]}'
rdkit-agent descriptors --smiles "CCO" --fields "MW,logP,TPSA"
convert
rdkit-agent convert --from smiles --to inchi --input "CCO"
rdkit-agent convert --from smiles --to inchikey --input "c1ccccc1"
rdkit-agent convert --from smiles --to mol --input "CCO"
similarity
rdkit-agent similarity --query "c1ccccc1" --targets "Cc1ccccc1,CCO" --threshold 0.5 --top 5
filter
rdkit-agent filter --smiles "CCO,CC(=O)Oc1ccccc1C(O)=O" --mw-max 100 --logp-max 3
rdkit-agent filter --smiles "CCO,CC(=O)Oc1ccccc1C(O)=O" --lipinski
draw
rdkit-agent draw --smiles "c1ccccc1" --output benzene.svg --format svg
rdkit-agent draw --smiles "c1ccccc1" --width 400 --height 400 --output large.svg
# Highlight atoms 0 and 1 in red, atom 3 in blue
rdkit-agent draw --smiles "c1ccccc1" \
  --highlight-atoms '{"0":"#ff0000","1":"#ff0000","3":"#0000ff"}' \
  --highlight-radius 0.4
# Highlight bond 1 in green
rdkit-agent draw --smiles "c1ccccc1" \
  --highlight-bonds '{"1":"#00ff00"}'
--highlight-atoms and --highlight-bonds accept JSON objects mapping index (string) → CSS hex colour. --highlight-radius sets the highlight circle size (default 0.3).
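When the indices come from another tool (for example a substructure match), the JSON argument can be built programmatically instead of by hand. The following is a minimal sketch; `buildHighlightArg` is a hypothetical helper, not part of rdkit-agent, and the 6-digit hex check simply mirrors the colours used in the examples above.

```javascript
// Sketch only: buildHighlightArg is a hypothetical helper, not part of rdkit-agent.
// It maps each index to a colour, using string keys as the flags expect.
function buildHighlightArg(groups) {
  const mapping = {};
  for (const { indices, colour } of groups) {
    // Restrict to 6-digit hex colours, matching the README examples.
    if (!/^#[0-9a-fA-F]{6}$/.test(colour)) {
      throw new Error(`Not a 6-digit hex colour: ${colour}`);
    }
    for (const index of indices) {
      mapping[String(index)] = colour;
    }
  }
  return JSON.stringify(mapping);
}

const atomsArg = buildHighlightArg([
  { indices: [0, 1], colour: '#ff0000' },
  { indices: [3], colour: '#0000ff' },
]);
console.log(atomsArg);
// prints {"0":"#ff0000","1":"#ff0000","3":"#0000ff"}
```

The resulting string can be passed as the value of --highlight-atoms or --highlight-bonds; remember to quote it for the shell.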
edit
rdkit-agent edit --smiles "[NH4+].[OH-]" --operation neutralize
rdkit-agent edit --smiles "CCO" --operation add-h
rdkit-agent edit --smiles "[H]OCC" --operation remove-h
rdkit-agent edit --smiles "[CH3:1][OH:2]" --operation strip-maps
react
Apply a reaction SMIRKS to one or more reactant SMILES and receive the product SMILES.
rdkit-agent react --smirks "[C:1][OH]>>[C:1]Br" --reactants "CCO,CCCO"
# → { "reaction": "...", "reactant_count": 2, "products": [["CCBr"], ["CCCBr"]] }
Reactants can be comma-separated or space-separated (positional args after the flags).
WASM note: requires get_rxn/run_reactants in the WASM build. If those are absent, a NOT_SUPPORTED_IN_WASM error is thrown (see WASM Limitations).
Programmatic:
const { reactionApply } = require('rdkit-agent');
const result = await reactionApply({ smirks: '[C:1][OH]>>[C:1]Br', reactants: ['CCO', 'CCCO'] });
stereo
Analyse stereocentres in a molecule. Reports tetrahedral and E/Z stereocentres with specified/unspecified status and CIP codes when available.
rdkit-agent stereo --smiles "CC(O)C(N)C"
# → { stereo_centers: [...], stereo_center_count: 2, specified_count: 0, has_unspecified_stereo: true }
rdkit-agent stereo --smiles "OC1=CC=CC=C1,CC(F)Cl" # comma-separated batch
The --enumerate flag attempts to list all stereoisomers. This requires enumerate_stereocenters in the WASM build (see WASM Limitations).
Programmatic:
const { analyzeStereo } = require('rdkit-agent');
const result = await analyzeStereo('CC(O)C(N)C');
tautomers
Enumerate tautomers of a molecule.
rdkit-agent tautomers --smiles "OC1=CC=CC=C1" --limit 10
# → { input_smiles: "...", canonical_tautomer: "Oc1ccccc1", tautomers: [...], count: 3 }
WASM note: TautomerEnumerator is not available in the standard RDKit WASM build. A NOT_SUPPORTED_IN_WASM error will be thrown (see WASM Limitations).
Programmatic:
const { enumerateTautomers } = require('rdkit-agent');
const result = await enumerateTautomers({ smiles: 'OC1=CC=CC=C1', limit: 10 });
atom-map
Manage atom mapping numbers in SMILES and SMIRKS.
# List atom_index → map_number
rdkit-agent atom-map list --smiles "[CH3:1][CH2:2][OH:3]"
# → { atom_maps: { "0": 1, "1": 2, "2": 3 }, mapped_atom_count: 3 }
# Add sequential map numbers to all heavy atoms
rdkit-agent atom-map add --smiles "CCO"
# → { mapped_smiles: "[CH3:1][CH2:2][OH:3]" }
# Strip all map numbers
rdkit-agent atom-map remove --smiles "[CH3:1][CH2:2][OH:3]"
# → { canonical_smiles: "CCO" }
# Validate SMIRKS mapping balance
rdkit-agent atom-map check --smirks "[C:1][OH:2]>>[C:1]Br"
# → { valid: true, mapped_atoms: 1, unmapped_atoms: 1, balanced: false, ... }
Programmatic:
const { atomMapList, atomMapAdd, atomMapRemove, atomMapCheck } = require('rdkit-agent');
WASM Limitations
Some RDKit features are not available in the WebAssembly build (@rdkit/rdkit). When these are called, a structured error with code: "NOT_SUPPORTED_IN_WASM" is thrown instead of silently failing.
| Feature | Status | Python alternative |
|---------|--------|-------------------|
| Reaction application (react) | Available in @rdkit/rdkit ≥ 2022.03 via get_rxn | AllChem.RunReactants |
| Stereo enumeration (stereo --enumerate) | Not available in standard builds | EnumerateStereoisomers.EnumerateStereoisomers |
| Tautomer enumeration (tautomers) | Not available in standard builds | rdMolStandardize.TautomerEnumerator |
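From Node.js, a caller can branch on that error code and degrade gracefully instead of crashing. The sketch below assumes the thrown error exposes the code as an `err.code` property, which the description above suggests but does not spell out; `withWasmFallback` and `fakeTautomers` are hypothetical names used only for illustration.

```javascript
// Sketch only: withWasmFallback and fakeTautomers are hypothetical.
// Assumes the thrown error exposes the code as an `err.code` property.
async function withWasmFallback(operation, fallback) {
  try {
    return await operation();
  } catch (err) {
    if (err && err.code === 'NOT_SUPPORTED_IN_WASM') {
      return fallback(err);
    }
    throw err; // unrelated errors propagate unchanged
  }
}

// Stand-in for a call such as enumerateTautomers() on an unsupported build.
async function fakeTautomers() {
  const err = new Error('TautomerEnumerator is not available');
  err.code = 'NOT_SUPPORTED_IN_WASM';
  throw err;
}

withWasmFallback(fakeTautomers, () => ({ supported: false }))
  .then((result) => console.log(result)); // prints { supported: false }
```

In a real pipeline the fallback might hand the molecule to a Python RDKit process rather than return a placeholder.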
To use these features in Python:
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.MolStandardize import rdMolStandardize
from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers
# Reactions
rxn = AllChem.ReactionFromSmarts('[C:1][OH]>>[C:1]Br')
products = rxn.RunReactants((Chem.MolFromSmiles('CCO'),))
# Tautomers
te = rdMolStandardize.TautomerEnumerator()
tautomers = te.Enumerate(Chem.MolFromSmiles('OC1=CC=CC=C1'))
# Stereo enumeration
isomers = list(EnumerateStereoisomers(Chem.MolFromSmiles('CC(O)C(N)C')))
Data Files
- data/aliases.json: Alias/formula normalization map used by hardening and validation (H2O -> O, AcOH -> CC(O)=O).
- data/fg_patterns.json: Curated tiered+consuming SMARTS set used by fg for stable, low-overlap functional-group assignment.
- data/checkmol_smarts_part1.csv: Broader checkmol-derived SMARTS catalog used by repair-smiles intent scoring (ring/chain/FG hint ranking), not by the main fg command.
MCP Server (Claude Desktop)
Start the MCP stdio server to expose all commands as tools:
rdkit-agent mcp
Add to your Claude Desktop claude_desktop_config.json:
{
  "mcpServers": {
    "rdkit-agent": {
      "command": "rdkit-agent",
      "args": ["mcp"]
    }
  }
}
Node.js API
const { check, descriptors, convert, similarity, RDKIT_TOOLS, handleToolCall } = require('rdkit-agent');
// Always validate before using chemistry strings
const result = await check({ smiles: 'CCO' });
if (!result.overall_pass) {
  console.error(result.fix_suggestions);
}
// Compute descriptors
const desc = await descriptors({ smiles: 'CCO' });
console.log(desc.MW, desc.logP);
// Convert format
const inchi = await convert({ input: 'CCO', from: 'smiles', to: 'inchi' });
// Similarity search
const hits = await similarity({
  query: 'c1ccccc1',
  targets: ['Cc1ccccc1', 'CCO', 'c1ccc2ccccc2c1'],
  threshold: 0.5
});
OpenAI Tool Integration
const { RDKIT_TOOLS, handleToolCall } = require('rdkit-agent');
const response = await openai.chat.completions.create({
model: 'gpt-4o',
tools: RDKIT_TOOLS,
messages: [{ role: 'user', content: 'Is CCO a valid SMILES?' }]
});
for (const toolCall of response.choices[0].message.tool_calls ?? []) {
  const result = await handleToolCall(
    toolCall.function.name,
    JSON.parse(toolCall.function.arguments)
  );
  // result is JSON-serializable
}
Exit Codes
| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Validation failure (overall_pass = false) |
| 2 | Usage error (bad arguments, missing input) |
| 3 | RDKit error (WASM not loaded, molecule parse failure) |
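A wrapper script can use these codes to decide whether to retry, fix its arguments, or give up. The following is a minimal sketch: `describeExit` and `runCheck` are hypothetical helpers, and `runCheck` assumes the `rdkit-agent` binary is on PATH. `spawnSync` is the standard Node.js `child_process` call.

```javascript
const { spawnSync } = require('child_process');

// Meanings copied from the exit-code table above.
const EXIT_MEANINGS = {
  0: 'success',
  1: 'validation failure',
  2: 'usage error',
  3: 'RDKit error',
};

// Sketch only: describeExit and runCheck are hypothetical helpers.
function describeExit(code) {
  return EXIT_MEANINGS[code] ?? `unknown exit code ${code}`;
}

// Runs `rdkit-agent check` (must be on PATH) and classifies the result.
// proc.status is null if the process could not be started.
function runCheck(smiles) {
  const proc = spawnSync(
    'rdkit-agent',
    ['check', '--smiles', smiles, '--output', 'json'],
    { encoding: 'utf8' }
  );
  return {
    status: proc.status,
    meaning: describeExit(proc.status),
    stdout: proc.stdout,
  };
}
```

Calling `runCheck('CCO')` returns the raw status alongside its meaning, so the caller can log one and branch on the other.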
Agent Use (SKILL.md)
For use with AI agents (Claude, GPT, etc.), see SKILL.md, which ships with the package. It documents critical invariants, error patterns, and all command schemas in an agent-optimized format.
License
MIT
