@subql/sq-graphql-agent
v1.0.4
Published
GraphQL agent for SubQuery Network
Readme
SubQL GraphQL Agent
A TypeScript/Node.js GraphQL agent for SubQuery Network that uses LangChain and OpenAI to query GraphQL endpoints intelligently with natural language.
Overview
This toolkit provides LLM agents with the ability to interact with any GraphQL API built with SubQuery SDK through natural language, automatically understanding schemas, validating queries, and executing complex GraphQL operations.
Key Features
- Natural Language Interface: Ask questions about blockchain data in plain English
- Automatic Schema Understanding: Agents learn PostGraphile v4 patterns and SubQuery entity schemas
- Query Generation & Validation: Converts natural language to valid GraphQL queries with built-in validation
- SubQuery SDK Optimized: Works with any project built using SubQuery SDK (Ethereum, Polkadot, Cosmos, etc.)
Design Philosophy
Solving the GraphQL Schema Size Problem
Traditional GraphQL agents face a fundamental challenge: schema size exceeds LLM context limits. Most GraphQL APIs have introspection schemas that are tens of thousands of tokens, making them:
- Too large for most commercial LLMs (exceeding context windows)
- Too expensive for cost-effective query generation
- Too noisy for reliable query construction (low signal-to-noise ratio)
Our Innovative Approach: Entity Schema + Rules
Instead of using raw GraphQL introspection schemas, we developed a compressed, high-density schema representation:
Entity Schema as Compressed Knowledge
- Compact Format: 100x smaller than full introspection schemas
- Domain-Specific: Contains project-specific entities and relationships
- High Information Density: Only essential types, relationships, and patterns
- Rule-Based: Combined with PostGraphile v4 patterns for query construction
Size Comparison
Traditional Approach:
├── Full GraphQL Introspection: ~50,000+ tokens
├── Context Window Usage: 80-95%
└── Result: Often fails or generates invalid queries
Our Approach:
├── Entity Schema: ~500-1,000 tokens
├── PostGraphile Rules: ~200-300 tokens
├── Context Window Usage: 5-10%
└── Result: Reliable, cost-effective query generationBenefits
- Cost Effective: 10-20x lower token usage than traditional approaches
- Higher Accuracy: Domain-specific knowledge reduces errors
- Faster Responses: Smaller context means faster processing
- Scalable: Works consistently across different LLM models
Architecture
Core Components
- GraphQLSource - Connection wrapper for GraphQL endpoints with entity schema support
- GraphQLToolkit - LangChain-compatible toolkit providing all GraphQL tools
- GraphQL Agent Tools - Individual tools for specific GraphQL operations
Available Tools
graphql_schema_info- Get raw entity schema with PostGraphile v4 rulesgraphql_query_validator_execute- Combined validation and execution tool (validates queries, then executes them if valid)
Installation
# Install dependencies
pnpm installConfiguration
Copy the environment example file:
cp .env.example .envEdit .env and add your OpenAI API key:
OPENAI_API_KEY=your_openai_api_key_hereEnvironment Variables
# Required
OPENAI_API_KEY=your-openai-api-key
# Optional
LLM_MODEL=gpt-4o # Default model
PORT=8000 # Server port (if running API server)Usage
Basic Example
import { runLangChainGraphQLAgent } from './src/langchain-agent.js';
import type { GraphQLProjectConfig } from './src/types.js';
// Load entity schema (learn more: https://subquery.network/doc/indexer/build/graphql.html)
// This example uses SubQuery Network's schema - replace with your own project's schema
const entitySchema = `
type Indexer implements Entity {
id: ID!
ownerId: String!
active: Boolean!
rewards: [Reward!]!
projects: [Project!]!
}
type Project implements Entity {
id: ID!
owner: String!
metadata: String!
}
`;
const config: GraphQLProjectConfig = {
cid: "subquery-network",
schemaContent: entitySchema,
nodeType: GraphqlProvider.SUBQL,
updatedAt: new Date().toISOString(),
domainName: "SubQuery Network",
domainCapabilities: [
"Indexer information and performance metrics",
"Project registration and metadata",
"Staking rewards and delegation data",
"Network statistics and era information"
],
declineMessage: "I'm specialized in SubQuery Network data queries. I can help you with indexers, projects, staking rewards, and network statistics, but I cannot assist with cooking. Please ask me about SubQuery Network data instead."
};
// Note: This example uses SubQuery Network's API - replace with your own project's endpoint
const endpoint = "https://index-api.onfinality.io/sq/subquery/subquery-mainnet";
// Query with natural language
const answer = await runLangChainGraphQLAgent(
config,
"Show me the top 3 indexers with their project information",
endpoint
);
console.log(answer);Example Natural Language Queries
Note: These examples are for the SubQuery Network demo. For your own project, the queries would be specific to your indexed blockchain data.
Basic Data Retrieval
- "Show me the first 5 indexers and their IDs"
- "What projects are available? Show me their owners"
- "List all indexers with their project information"
Staking & Rewards
- "What are my staking rewards for wallet 0x123...?"
- "Show me rewards for the last era"
- "Find delegations for a specific indexer"
Performance & Analytics
- "Which indexers have the highest rewards?"
- "Show me project performance metrics"
- "List top performing indexers by era"
PostGraphile v4 Query Patterns
The agent understands PostGraphile v4 patterns automatically:
Entity Queries
- Single:
entityName(id: ID!)-> Full entity object - Collection:
entityNames(first: Int, filter: EntityFilter)-> Connection with pagination
Filtering
filter: {
fieldName: { equalTo: "value" }
amount: { greaterThan: 100 }
status: { in: ["active", "pending"] }
}Ordering
orderBy: [FIELD_NAME_ASC, CREATED_AT_DESC]Pagination
{
entities(first: 10, after: "cursor") {
nodes { id, field }
pageInfo { hasNextPage, endCursor }
}
}Agent Workflow
The agent follows this intelligent workflow:
- Relevance Check: Determines if the question relates to the project data
- Schema Analysis: Loads entity schema and PostGraphile rules (once per session)
- Query Construction: Builds GraphQL queries using PostGraphile patterns
- Validation: Validates queries against the live GraphQL schema
- Execution: Executes validated queries to get real data
- Summarization: Provides user-friendly responses based on actual results
Development
# Development mode with auto-reload
pnpm dev
# Build the project
pnpm build
# Run tests
pnpm test
# Type checking
pnpm typecheckScripts
dev- Run in development mode with file watchingbuild- Compile TypeScript to JavaScriptbuild:watch- Compile with file watchingstart- Run the compiled applicationtest- Run teststest:watch- Run tests in watch modetypecheck- Run TypeScript type checkinglint- Run ESLint (if configured)clean- Clean build output
Project Structure
sq-graphql-agent/
├── src/ # Source code
│ ├── langchain-agent.ts # Main agent implementation
│ ├── tools.ts # GraphQL tools
│ ├── types.ts # TypeScript type definitions
│ └── utils.ts # Utility functions
├── examples/ # Usage examples
├── tests/ # Test files
├── package.json # Dependencies and scripts
└── tsconfig.json # TypeScript configurationDependencies
Runtime Dependencies
@langchain/community- LangChain community tools@langchain/core- LangChain core utilities@langchain/langgraph- LangGraph for agent workflows@langchain/openai- OpenAI integrationgraphql- GraphQL query libraryopenai- OpenAI API clientpino- Fast JSON loggerzod- Schema validation
Development Dependencies
typescript- TypeScript compilertsx- TypeScript execution tooljest- Testing frameworkts-jest- Jest TypeScript preset@types/node- Node.js type definitions
Error Handling
The toolkit includes comprehensive error handling:
Network Issues
- GraphQL endpoint connectivity problems
- Timeout handling for long-running queries
- Automatic retry for transient failures
Schema Introspection Issues
- Authorization error detection (e.g., missing headers)
- Invalid endpoint error handling
- Prevents caching of failed introspection results
Query Issues
- Invalid GraphQL syntax detection
- Schema validation with detailed error messages
- Field existence verification
Performance Considerations
Query Optimization
- Always use pagination (
first: N) for collection queries - Limit nested relationship depth to avoid expensive queries
- Use specific field selection rather than querying all fields
Caching Strategy
- GraphQL schema introspection results are cached (configurable TTL)
- Entity schema is loaded once per toolkit instance
- No query result caching (always fresh data)
Resource Management
- Connection pooling for HTTP requests
- Automatic cleanup of resources
- Memory-efficient schema processing
Model Performance Comparison
Based on comprehensive testing, here's how different LLM models perform with this GraphQL agent:
| Model | Performance | Query Accuracy | Complex Reasoning | Cost Efficiency | Recommendation | |--------------------|-------------|----------------|-------------------|-----------------|----------------| | Gemini-3-flash(openrouter) | Excellent | Excellent | Excellent | Good | Recommended | | GLM-4.6 | Very Good | Very Good | Excellent | Excellent | Cost-Effective | | Kimi-k2 | Good | Fair | Fair | Very Good | Cost-Effective |
Recommendation Guidelines
For Production Use:
export LLM_MODEL="gpt-4o" # Best reliability and accuracyFor Cost-Conscious Production:
export LLM_MODEL="deepseek-v3" # Excellent value propositionFor Development/Testing:
export LLM_MODEL="gpt-4.1-mini" # Good balance for non-productionComparison with Alternatives
| Feature | SubQL GraphQL Agent | LangChain GraphQL | LangChain SQL Agents | |---------|---------------------|-------------------|----------------------------| | Schema Size Handling | Entity compression (500 tokens) | Full schema per query (50k+ tokens) | Table schemas (compact) | | Domain Flexibility | SubQuery SDK only | Any GraphQL API | Any SQL database | | Schema Learning | Learns once, reasons multiple queries | Requires schema in every query | Learns schema structure | | Natural Language | Full support | Limited by context size | Full support | | Query Construction | PostGraphile rules | Pattern matching only | Mature SQL generation | | Cost Efficiency | Low token usage | Very high token usage | Very efficient | | Security & Access | API-only, no DB access | API-only, no DB access | Requires DB credentials | | Setup Complexity | Simple for SubQuery | Schema management overhead | DB access + permissions |
License
Apache-2.0
