houtini-lite

v2.1.1

Streamlined MCP server for LM Studio with dynamic token allocation

Houtini-Lite 🎩

A streamlined MCP (Model Context Protocol) server for LM Studio with intelligent dynamic token allocation. Execute custom prompts on your local LLMs with automatic token optimisation for maximum output.

Features

  • 🚀 Dynamic Token Allocation: Automatically maximises output tokens based on your model's context window
  • 💡 Smart Context Management: Uses 80% of available context with safety margins to prevent overflow
  • 🎯 Simple & Focused: Streamlined toolset for prompt execution without complexity
  • 📊 Transparent Diagnostics: See exactly how tokens are allocated in every response
  • 🔧 Flexible Override: Manual control when you need specific token limits

Why Houtini-Lite?

Unlike MCP servers that use fixed token limits, Houtini-Lite allocates output tokens based on your prompt size and the loaded model's context window. With a 128K-context model, a short prompt can be allocated 100,000+ output tokens; a large context automatically gets a smaller output budget so the request still fits.

Installation

Prerequisites

  1. LM Studio (v0.3.0 or later)

    • Download from: https://lmstudio.ai/
    • Enable the local server (port 1234)
    • Load a model (e.g., Qwen3 30B, LLaMA, DeepSeek)
  2. Node.js (v18 or later)

    • Download from: https://nodejs.org/
  3. Claude Desktop

    • Download from: https://claude.ai/download

Quick Install (via npm)

  1. Install globally from npm

    npm install -g houtini-lite
  2. Configure Claude Desktop

    Add to your claude_desktop_config.json:

    {
      "mcpServers": {
        "houtini-lite": {
          "command": "npx",
          "args": ["houtini-lite"],
          "env": {
            "LM_STUDIO_URL": "ws://localhost:1234"
          }
        }
      }
    }

    Windows config location: %APPDATA%\Claude\claude_desktop_config.json
    Mac config location: ~/Library/Application Support/Claude/claude_desktop_config.json

  3. Restart Claude Desktop

Install from Source

  1. Clone the repository

    git clone https://github.com/houtini-ai/houtini-lite.git
    cd houtini-lite
  2. Install dependencies

    npm install
  3. Build the project

    npm run build
  4. Configure Claude Desktop

    Add to your claude_desktop_config.json:

    {
      "mcpServers": {
        "houtini-lite": {
          "command": "node",
          "args": ["C:\\path\\to\\houtini-lite\\dist\\index.js"],
          "env": {
            "LM_STUDIO_URL": "ws://localhost:1234"
          }
        }
      }
    }
  5. Restart Claude Desktop

Usage

Basic Commands

Health Check

Verify connection and see model capabilities:

Use houtini-lite:health_check

Simple Prompt

Let dynamic allocation maximise your output:

Use houtini-lite:custom_prompt with prompt: "Explain quantum computing"

With Context

Provide additional context for better responses:

Use houtini-lite:custom_prompt with:
- prompt: "Analyse this code for security issues"
- context: "[paste your code here]"

Manual Token Control

Override automatic allocation when needed:

Use houtini-lite:custom_prompt with:
- prompt: "Give a brief summary"
- maxTokens: 200

Batch Processing

Execute multiple prompts efficiently:

Use houtini-lite:batch_prompts with:
- prompts: [
    {"prompt": "First question"},
    {"prompt": "Second question", "maxTokens": 500}
  ]
- combineResults: true

Advanced Features

Temperature Control

Adjust creativity vs consistency:

Use houtini-lite:custom_prompt with:
- prompt: "Write a creative story"
- temperature: 0.9  (0.0 = deterministic, 1.0 = creative)

File-Based Prompts

Load prompts from files with variable substitution:

Use houtini-lite:execute_file_prompt with:
- filePath: "C:\\prompts\\analysis.txt"
- variables: {"project": "MyApp", "language": "Python"}
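
The README does not document the placeholder syntax used inside prompt files, so the following is only a sketch of how variable substitution could work. It assumes `{{name}}` placeholders; `substituteVariables` is a hypothetical helper, not a function exported by houtini-lite.

```typescript
// Assumption: placeholders look like {{name}}. The real syntax used by
// execute_file_prompt is not documented in this README.
function substituteVariables(
  template: string,
  variables: Record<string, string>,
): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, name: string) => {
    // Only replace names that were explicitly supplied; leave the rest visible
    // so a missing variable is easy to spot in the final prompt.
    const known = Object.prototype.hasOwnProperty.call(variables, name);
    return known ? String(variables[name]) : match;
  });
}
```

Under that assumption, a file containing `Review {{project}} written in {{language}}` with the variables above would become `Review MyApp written in Python`. Unknown placeholders are left untouched rather than silently removed.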

Dynamic Token Allocation

How It Works

  1. Context Detection: Identifies your model's context window (e.g., 128K for Qwen3)
  2. Safety Margin: Uses 80% of total context to prevent overflow
  3. Input Estimation: Calculates tokens needed for your prompt (~3 chars per token)
  4. Output Maximisation: Allocates all remaining space for output
  5. Smart Scaling: Automatically reduces output tokens for large inputs
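
The steps above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the code in src/index.ts: the function name `allocateTokens`, the `AllocationSettings` shape, and the exact rounding are assumptions, while the numeric defaults come from the Default Settings section of this README.

```typescript
// Hypothetical settings object; the values mirror the defaults listed in this README.
interface AllocationSettings {
  contextUsageRatio: number;  // fraction of the context window treated as usable
  minOutputTokens: number;    // floor reserved for output
  tokenEstimateRatio: number; // characters per token (rough estimate)
}

const defaults: AllocationSettings = {
  contextUsageRatio: 0.8,
  minOutputTokens: 1000,
  tokenEstimateRatio: 3,
};

interface Allocation {
  usableContext: number;
  inputEstimate: number;
  outputTokens: number;
}

function allocateTokens(
  contextWindow: number,
  inputText: string,
  maxTokens?: number,
  settings: AllocationSettings = defaults,
): Allocation {
  // Step 2: keep a safety margin by using only part of the context window.
  const usableContext = Math.floor(contextWindow * settings.contextUsageRatio);
  // Step 3: estimate input tokens from character count.
  const inputEstimate = Math.ceil(inputText.length / settings.tokenEstimateRatio);
  // Steps 4 and 5: everything left goes to output, so large inputs shrink it.
  const remaining = usableContext - inputEstimate;
  // Assumption: a manual maxTokens wins; otherwise never drop below the floor.
  const outputTokens = maxTokens ?? Math.max(remaining, settings.minOutputTokens);
  return { usableContext, inputEstimate, outputTokens };
}
```

With a 128,000-token context and a 150-character prompt, this sketch estimates 50 input tokens and allocates 102,350 output tokens out of 102,400 usable, which matches the diagnostics example in this README.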

Example Allocations

| Scenario | Model Context | Input Size | Output Allocated |
|----------|--------------|------------|------------------|
| Simple prompt | 128K | 50 tokens | ~102,000 tokens |
| Medium context | 128K | 10K tokens | ~92,000 tokens |
| Large context | 128K | 50K tokens | ~52,000 tokens |
| Manual override | 128K | Any | Your specified limit |

Token Info in Responses

Every response includes diagnostic information:

[Your LLM's response here...]

[Token Allocation Info]
Model: qwen.qwen3-coder-30b-a3b-instruct
Context Window: 128,000 tokens
Usable Context: 102,400 tokens
Allocated Output Tokens: 102,350
Input Estimate: 50 tokens
Execution Time: 3500ms
Temperature: 0.7
Needs Chunking: No
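
If you want to log or compare these diagnostics programmatically, the footer can be split into key/value pairs. The sketch below assumes the exact layout shown above (a `[Token Allocation Info]` marker followed by `Key: value` lines); `parseTokenInfo` and `toNumber` are hypothetical helpers and not part of houtini-lite.

```typescript
// Extracts the "Key: value" lines that follow the diagnostics marker.
function parseTokenInfo(response: string): Record<string, string> {
  const marker = "[Token Allocation Info]";
  const start = response.indexOf(marker);
  const info: Record<string, string> = {};
  if (start === -1) return info; // no diagnostics footer present

  const lines = response.slice(start + marker.length).split("\n");
  for (const line of lines) {
    const sep = line.indexOf(":");
    if (sep === -1) continue; // skip blank or free-form lines
    const key = line.slice(0, sep).trim();
    const value = line.slice(sep + 1).trim();
    if (key) info[key] = value;
  }
  return info;
}

// Converts values such as "102,350", "128,000 tokens" or "3500ms" to a number.
function toNumber(value: string): number {
  return Number.parseInt(value.replace(/,/g, ""), 10);
}
```

The response text before the marker is ignored, so the same function works regardless of how long the model's answer is.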

Supported Models

Houtini-Lite automatically detects context windows for:

  • Qwen3 Series: 128K context
  • LLaMA Models: 32K context
  • CodeLlama: 16K context
  • DeepSeek: 32K context
  • Meta-LLaMA: 8K context
  • Others: Defaults to safe limits
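
One way such detection could work is a lookup on fragments of the model identifier. The sketch below is an assumption about the approach, not the project's actual matching logic: the fragment keys, the `detectContextWindow` name, and the 4,096-token fallback are all hypothetical (the README only says unknown models get "safe limits"). Sizes follow the list above, reading "K" as 1,000 tokens as the diagnostics example does for 128K.

```typescript
// Hypothetical lookup table of lowercase name fragments.
// Order matters: more specific fragments come first so that "meta-llama"
// and "codellama" are not swallowed by the generic "llama" entry.
const contextSizes: Array<[string, number]> = [
  ["qwen3", 128000],
  ["codellama", 16000],
  ["meta-llama", 8000],
  ["deepseek", 32000],
  ["llama", 32000],
];

// Assumed conservative fallback for models that match nothing above.
const fallbackContext = 4096;

function detectContextWindow(modelId: string): number {
  const id = modelId.toLowerCase();
  for (const [fragment, size] of contextSizes) {
    if (id.includes(fragment)) return size;
  }
  return fallbackContext;
}
```

Because several of the listed families share the substring "llama", any substring-based scheme has to check the narrower names first; otherwise a CodeLlama model would be given the 32K LLaMA limit.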

Troubleshooting

"No models loaded in LM Studio"

  • Open LM Studio and load a model
  • Ensure the local server is running (bottom bar should show "Server Running")

"LM Studio connection failed"

  • Check LM Studio is running on port 1234
  • Try restarting LM Studio's server
  • Verify firewall isn't blocking local connections

"Tool not found" in Claude

  • Restart Claude Desktop completely
  • Check your claude_desktop_config.json syntax
  • Ensure the path to index.js is absolute and correct
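
A quick way to rule out a syntax problem is to parse the config text and check that the server entry has the expected shape. The following is a minimal sketch; `checkConfig` is a hypothetical helper, and it only checks the fields shown in the configuration examples in this README (`mcpServers`, `command`, `args`).

```typescript
interface McpServerEntry {
  command?: unknown;
  args?: unknown;
}

// Returns a list of problems; an empty list means the entry looks plausible.
function checkConfig(jsonText: string, serverName = "houtini-lite"): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    // Trailing commas and unescaped backslashes in Windows paths end up here.
    return [`Invalid JSON: ${(err as Error).message}`];
  }

  const servers = (parsed as { mcpServers?: Record<string, McpServerEntry> } | null)?.mcpServers;
  const entry = servers?.[serverName];
  if (!entry) return [`No "${serverName}" entry under "mcpServers"`];

  const problems: string[] = [];
  if (typeof entry.command !== "string") problems.push('"command" must be a string');
  if (!Array.isArray(entry.args)) problems.push('"args" must be an array');
  return problems;
}
```

To use it, read the contents of claude_desktop_config.json into a string and pass that string to `checkConfig`. It does not verify that the path in `args` exists or is absolute, so that still needs a manual check.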

Token allocation seems wrong

  • Different models have different context windows
  • Check the health_check output to verify detected context size
  • Some models may report incorrect context sizes

Configuration

Environment Variables

  • LM_STUDIO_URL: WebSocket URL for LM Studio (default: ws://localhost:1234)
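
As an illustration of the fallback behaviour, a server could resolve the URL roughly as follows. This is a sketch under assumptions: `resolveLmStudioUrl` is a hypothetical name, and treating an empty or whitespace-only value as "unset" is a design choice made here, not documented behaviour.

```typescript
const DEFAULT_LM_STUDIO_URL = "ws://localhost:1234";

// Accepts any env-like object so it can be exercised without touching process.env.
function resolveLmStudioUrl(env: Record<string, string | undefined>): string {
  const raw = env["LM_STUDIO_URL"]?.trim();
  // Assumption: blank values fall back to the documented default.
  return raw ? raw : DEFAULT_LM_STUDIO_URL;
}
```

In a Node.js process this would typically be called as `resolveLmStudioUrl(process.env)`.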

Default Settings

Edit these in the source code if needed:

  • contextUsageRatio: 0.8 (use 80% of context)
  • minOutputTokens: 1000 (minimum reserved for output)
  • tokenEstimateRatio: 3 (characters per token estimate)
  • defaultTemperature: 0.7
  • timeout: 120000ms (2 minutes)

Development

Project Structure

houtini-lite/
├── src/
│   └── index.ts        # Main server implementation
├── dist/               # Compiled JavaScript (git ignored)
├── package.json        # Dependencies and scripts
├── tsconfig.json       # TypeScript configuration
└── README.md           # This file

Building from Source

# Install dependencies
npm install

# Build once
npm run build

# Watch mode for development
npm run watch

Adding New Models

To add context window detection for new models, edit the knownContextSizes object in src/index.ts:

const knownContextSizes: Record<string, number> = {
  'your-model': 32000,  // Add your model here
  // ... existing models
};

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

Areas for Contribution

  • Additional model context detection
  • Token estimation improvements
  • New prompt management features
  • Performance optimisations
  • Documentation improvements

License

MIT License - see LICENSE file for details

Version History

v2.1.0 (Current)

  • Dynamic token allocation system
  • Automatic context window detection
  • Improved error handling
  • Token diagnostics in responses

v2.0.0

  • Initial standalone release
  • Core prompt execution features
  • Basic MCP integration

Note: This is a community project and is not officially affiliated with Anthropic, LM Studio, or the original Houtini project.