npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

copilot-custom-endpoint

v1.3.14

Published

Local proxies for VS Code Copilot custom endpoints — Kimi K2 & Qwen 3.x

Downloads

3,304

Readme

GitHub Copilot Custom Endpoints

TL;DR — GitHub Copilot switched to usage-based billing on June 1, 2026. Every chat and agent session now burns AI credits — fast. This repo shows you how to plug cheaper non-GitHub models (DeepSeek, Kimi, Qwen, MiMo, MiniMax, GLM) into VS Code's Copilot chat — often 5–55× cheaper than the built-ins — while keeping agent mode, tools, streaming, and vision.

What is this?

VS Code lets you add your own language-model endpoint via a small JSON config file. Many providers advertise "OpenAI-compatible" APIs but reject the exact request shapes VS Code sends. This repo collects real, tested setups — one per provider — plus a tiny local proxy that smooths over the rough edges when needed.

If OpenRouter is blocked by your network or too generic for your model's quirks, this is the workaround.

How it works (4 steps)

  1. Pick a model from the table below.
  2. Add it to your VS Code config — copy the snippet from the model's doc.
  3. Set the API key through VS Code's UI (it goes to your OS keychain, not the file).
  4. Open chat and pick the model from the model picker.

That's it. No code, no servers to manage (unless the model specifically needs the local proxy — the table tells you).

Pick a model

| Model | Provider | Needs proxy? | Vision | Setup guide | | --------------------------- | --------- | ---------------------- | ------------ | -------------------------------------------------------------------------------------------------- | | MiMo V2 Flash | Xiaomi | No | ❌ | Setup | | MiMo V2.5 | Xiaomi | No | ✅ | Setup | | MiMo V2.5 Pro | Xiaomi | No | ❌ | Setup | | Kimi K2.6 | Moonshot | Yes | ✅ | Setup | | Qwen 3.7 Plus | DashScope | Optional | ✅ | Setup | | Qwen 3.7 Max | DashScope | Optional | ❌ | Setup | | MiniMax M3 | MiniMax | No | ✅ | Setup | | GLM 5.1 | Z.ai | No | ❌ | Setup | | GLM 5V Turbo | Z.ai | No | ✅ | Setup | | DeepSeek V4 Pro / Flash | DeepSeek | No (uses an extension) | ✅ via proxy | Marketplace |

Setup

1. Find (or create) your config file

| OS | Path | | ------- | ----------------------------------------------------------------- | | Windows | %APPDATA%\Code\User\chatLanguageModels.json | | macOS | ~/Library/Application Support/Code/User/chatLanguageModels.json | | Linux | ~/.config/Code/User/chatLanguageModels.json |

If the file doesn't exist yet, create it with [] inside.

2. Add a model entry

Open the setup guide for the model you picked (links in the table above) and copy its JSON snippet into the file. Each snippet is a single provider object inside the array.

⚠️ Leave apiKey as "" — never paste the key into the JSON file.

3. Set the API key

  1. Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P).
  2. Run Chat: Manage Language Models.
  3. Find your provider in the list, right-click the group name → Update API Key.
  4. Paste your key. It's stored in your OS keychain.

4. Chat

  • Open Copilot chat (Ctrl+Alt+I / Cmd+Ctrl+I).
  • Click the model picker (top-right).
  • Pick your model and ask something.

If a model needs a proxy, the setup guide will tell you to run a command first. Keep that terminal open while you chat.

Common commands

Run from the repo root:

npm run proxy        # Start both proxies (Kimi + Qwen)
npm run proxy:kimi   # Start only the Kimi proxy
npm run proxy:qwen   # Start only the Qwen proxy
npm run clean:logs   # Remove debug_log/
npm test             # Run the test suite

Or globally via npx (no clone needed):

npx copilot-custom-endpoint          # Start both proxies
npx copilot-custom-endpoint kimi     # Kimi only
npx copilot-custom-endpoint qwen     # Qwen only
npx copilot-custom-endpoint clean    # Remove debug_log/

Pricing snapshot

All prices are USD per 1M tokens (cache miss). 1 AI credit = $0.01. MiniMax M3 figures reflect a permanent 50% off list price — see the model doc for the full rate card.

| Model | Input | Output | Context | | ---------------------------- | ----- | ------ | ------- | | MiMo V2 Flash 🏆 | $0.10 | $0.30 | 256K | | DeepSeek V4 Flash 🏆 | $0.14 | $0.28 | 1M | | Kimi K2.6 (non-thinking) | $0.16 | $0.95 | 256K | | MiniMax M3 | $0.30 | $1.20 | 1M | | MiMo V2.5 | $0.40 | $2.00 | 1M | | Qwen 3.7 Plus | $0.40 | $1.60 | 1M | | MiMo V2.5 Pro | $1.00 | $3.00 | 1M | | GLM 5V Turbo | $1.20 | $4.00 | 200K | | GLM 5.1 | $1.40 | $4.40 | 200K | | Qwen 3.7 Max | $2.50 | $7.50 | 1M |

For the full pricing comparison (cached rates, full Copilot roster, footnotes, sources) see docs/pricing.md. For a copy-paste config containing all providers at once, see docs/example-config.md.

Companion tools

ℹ️ These are third-party tools — not built into the custom endpoints or proxies in this repo. Each one must be installed, configured, and (where applicable) billed for separately, directly with its own provider. Nothing here is bundled, proxied, or auto-configured by copilot-custom-endpoint or the per-model setups above. The entries below are just pointers to tools the author has found useful alongside the model configs.

These work alongside the providers above and fill gaps that VS Code's built-in tool surface doesn't cover natively.

🎬 Video Context MCPvideo understanding for AI coding assistants

VS Code's built-in view_image tool only accepts static images (PNG, JPG, GIF, WebP). That's a hard wall if you want to ask an AI assistant about a screen recording, a screencast, a product demo, or any other video. Several vision-capable models in this repo actually accept video natively — but VS Code's tool pipeline never gets the chance to forward it.

Video Context MCP is a small MCP server that bridges that gap. It works with GitHub Copilot, Cursor, and Claude Code out of the box, and:

  • Extracts frames from local files or remote URLs (no ffmpeg gymnastics required).
  • Routes them through a multi-provider fallback chainGemini → GLM 4.6V Flash → Qwen3.7-plus → Kimi K2.6 → MiMo-V2.5.
  • Answers natural-language questions about the video grounded in actual frames: "what does the speaker click in the last 30 seconds?", "summarize the demo", "find the frame where the error appears".
  • Extras: timestamp search, audio transcription with speaker diarization, and video metadata (resolution, duration, codec).

🪣 Bitbucket MCPsecure, read-only Bitbucket access for VS Code Copilot, Cursor, and Claude Code

GitHub ships a first-party MCP server (and it's even bundled into Copilot), so asking "what's open in my org's repos / show me PR #123" works seamlessly on github.com. Bitbucket has no equivalent — Atlassian hasn't shipped one — which leaves Bitbucket Cloud users copy-pasting PR URLs, diffs, and file contents into chat by hand.

Bitbucket MCP is a small MCP server that closes that gap. It works with VS Code GitHub Copilot, Cursor, and Claude Code out of the box, and:

  • 38 tools across 8 categories — repositories, pull requests, branches & commits, diffs & comparisons, CI/CD pipelines, issues, code search, and users.
  • Read-only by designmakeRequest() blocks all non-GET requests at runtime, so no write, delete, or modify operation is possible.
  • Token-friendly output — every tool supports text, json, and toon (Token-Oriented Object Notation) formats; toon cuts LLM token consumption by 30–60% on large PR/commit lists.
  • JMESPath filtering on all 38 tools, so you can trim responses (e.g. only open PRs, or just title + author) before they hit the model.
  • One-call PR contextbb_get_context bundles PR metadata, diffstat, CI statuses, and comments in a single request.
  • Drop-in installnpm install -g @tugudush/bitbucket-mcp plus a short .vscode/mcp.json entry, authenticated with a Bitbucket API token + your Atlassian email.

Need help?

  • Per-model issues: check the troubleshooting section at the bottom of each model's doc.
  • Repo questions / bugs: open an issue on GitHub.

Repo layout

.
├── docs/models/<provider>-<model>.md   # Per-model setup guides (the real docs)
├── proxy/                              # Local compatibility shims
├── tests/                              # Test assets
└── debug_log/                          # Runtime logs (git-ignored)

Want to add a new model?

  1. Create docs/models/<provider>-<model>.md with a clear walkthrough.
  2. If the provider needs request rewriting, add a proxy under proxy/.
  3. Submit a PR.

Limitations

  • Chat only. Inline completions, semantic search, and next-edit suggestions still need a GitHub-hosted model.
  • Each proxy is tuned for a specific provider family. Don't point the Kimi proxy at an arbitrary OpenAI-compatible endpoint.

Support

If this helped, consider sponsoring or donating:

GitHub Sponsors

Solana (SOL)

CWZccD3Ny3XotFZtnkcyzP3hapmu3ExknN1PF4rEvP3u