duck_talk

v0.1.4

Voice interface for Claude Code

Duck Talk

Talk to Claude Code. Hear it talk back. Approve, interrupt, or redirect — all by voice, from anywhere.

The core tech: a generic voice layer that can wrap any black-box agent using Live Speech models (e.g. Gemini Live, OpenAI Realtime) for low-latency conversations. No modifications to the agent.

             Duck Talk            Claude Code
              ┌──────┐          ╔══════════════╗
You ─speech─▶ │ STT  │ ─inst─▶  ║              ║
    ◀─audio── │ TTS  │ ◀─txt──  ║  (any agent) ║
              └──────┘          ╚══════════════╝

inst = instruction, e.g. "What is the latest PR?"
txt = raw stream of tokens 
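
One way to picture the black-box contract in the diagram above: the voice layer only needs something that accepts an instruction string and streams text back. The sketch below is illustrative only. `Agent`, `relay`, and `echoAgent` are hypothetical names chosen for this example, not part of Duck Talk's actual API.

```typescript
// Hypothetical contract: an agent takes one instruction and streams back text.
type Agent = (instruction: string) => AsyncIterable<string>;

// Forward an instruction to the agent and hand each chunk to a speech sink.
// In Duck Talk the sink role belongs to the TTS side; here it is a callback.
async function relay(
  agent: Agent,
  instruction: string,
  speak: (text: string) => void,
): Promise<void> {
  for await (const chunk of agent(instruction)) {
    speak(chunk);
  }
}

// Stand-in agent for local experiments; a real one would call Claude Code.
const echoAgent: Agent = async function* (instruction) {
  yield "You asked: ";
  yield instruction;
};
```

Because `relay` depends only on the `Agent` type, swapping in a different agent would not require touching the voice side, which is the property the README describes.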

Quick start

You will need Node.js (for npx and npm), an Anthropic API key, and a Gemini API key.

Option 1 — npx (fastest)

ANTHROPIC_API_KEY=sk-ant-... GEMINI_API_KEY=AIza... npx duck_talk
# Opens http://localhost:8000

Or set them in a .env file in the current directory:

ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...

Option 2 — from source

git clone https://github.com/dhuynh95/duck_talk.git && cd duck_talk
npm install
cp .env.example .env   # then edit with your API keys
npm run dev

Why

I wanted a coding assistant I could talk to on a walk — check on a long-running task, brainstorm architecture, review a plan. Hands-free, conversational, no laptop required.

STT tools like SuperWhisper and Wispr Flow get you halfway — you can dictate, but the agent never talks back. You can bolt TTS onto Claude Code via MCP, but you're waiting for the full response before hearing anything.

Voice-native agents like ChatGPT and Gemini Live have the conversation part down, but they're not connected to your codebase. They can't run commands, edit files, or see your project. And if your accent trips up the STT — "Cloud Code" instead of "Claude Code" — there's no way to catch it before it's sent.

Nothing combines all of this:

| | Multi turn voice | Audio output | Low latency | No context bloat | Setup |
|---|---|---|---|---|---|
| STT dictation | ❌ Push-to-talk | ❌ | ❌ No response | ✅ | ✅ |
| MCP voice tool | ❌ Keyboard | ✅ | ❌ After completion | ❌ Extra MCP | ❌ Custom MCP |
| Duck Talk | ✅ | ✅ | ✅ | ✅ | ✅ |

Key features

  • Real-time voice — talk to Claude Code hands-free. Say "stop" to interrupt mid-response.
  • Streaming TTS — responses spoken sentence-by-sentence as they stream. ~1.5s to first audio, not after completion.
  • Review mode — hear your instruction read back before it's sent. Accept, edit, or reject by voice or buttons. No more "Cloud Code" when you said "Claude Code."
  • Correction learning — edit a misheard instruction, the diff is saved. Future transcriptions auto-correct.
  • Session management — browse, resume, and rewind conversations. Built on Claude Code's native JSONL format.
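
To make the correction-learning idea concrete, here is a minimal sketch that assumes saved corrections are stored as simple "heard" and "meant" pairs and applied by text replacement. That storage shape and the names `Correction` and `applyCorrections` are assumptions for illustration; the README does not say how Duck Talk stores or applies its diffs.

```typescript
// Hypothetical shape for one saved correction.
interface Correction {
  heard: string; // what the transcription produced
  meant: string; // what the user changed it to
}

// Escape regex metacharacters so phrases are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every saved misheard phrase, ignoring case, on word boundaries.
function applyCorrections(transcript: string, saved: Correction[]): string {
  return saved.reduce(
    (text, { heard, meant }) =>
      // A replacer function keeps "$" in `meant` from being read as a pattern.
      text.replace(new RegExp(`\\b${escapeRegExp(heard)}\\b`, "gi"), () => meant),
    transcript,
  );
}
```

With the pair `{ heard: "Cloud Code", meant: "Claude Code" }` saved, a later transcript containing "cloud code" would be rewritten before it reaches review.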

Architecture

Two Gemini Live sessions — one listens, one speaks. Claude Code is the black box in between.

graph LR
    You((You))
    STT["Gemini Live #1<br/>STT · VAD · Tools"]
    API["Express Server<br/>+ Agent SDK"]
    TTS["Gemini Live #2<br/>Streaming TTS"]
    CC[["Claude Code<br/>(any agent)"]]

    You -->|speech| STT
    STT -->|instruction| API
    API <-->|text stream| CC
    API -->|sentences| TTS
    TTS -->|audio| You
    API -.->|context inject| STT

Flow of a single instruction:

sequenceDiagram
    actor You
    participant STT as Gemini Live<br/>(STT · VAD)
    participant API as Express Server
    participant CC as Claude Code
    participant TTS as TTS Session

    You->>STT: 🎤 speech
    Note over STT: VAD detects end of speech
    STT->>API: converse(instruction)
    Note over STT: ⏸ frozen (BLOCKING tool)
    STT-->>STT: tool response → unfreeze

    API->>CC: query(instruction)

    loop text streaming
        CC-->>API: text chunk (SSE)
        API-->>TTS: sentence buffer flush
        TTS-->>You: 🔊 audio
        API-->>STT: context inject
    end

    Note over TTS: audio drains
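
The "sentence buffer flush" step in the loop above can be sketched as a small accumulator that releases text only once a sentence is complete. This is a simplified illustration under stated assumptions, not Duck Talk's implementation: `SentenceBuffer` is a hypothetical name, and the boundary rule (a `.`, `!`, or `?` followed by whitespace) is a deliberately naive choice that would also split after abbreviations such as "e.g.".

```typescript
// Accumulates streamed text and emits complete sentences.
class SentenceBuffer {
  private pending = "";

  // Add a chunk; return any sentences that are now complete.
  push(chunk: string): string[] {
    this.pending += chunk;
    const sentences: string[] = [];
    // Assumed rule: a sentence ends at ., !, or ? followed by whitespace.
    const boundary = /[.!?]\s+/;
    let match = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + 1; // keep the punctuation mark
      sentences.push(this.pending.slice(0, end).trim());
      this.pending = this.pending.slice(match.index + match[0].length);
      match = boundary.exec(this.pending);
    }
    return sentences;
  }

  // Call once the stream ends to release whatever is left.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest ? [rest] : [];
  }
}
```

Each array returned by `push` would be handed to the TTS session right away, which is what lets audio start before the agent has finished responding. Text that ends without trailing whitespace stays buffered until the next chunk arrives or `flush` is called.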

License

MIT