llm-quote-extractor

v0.1.0

Parse any AI response (ChatGPT, Claude, Gemini, Perplexity, AIO) for brand mentions, cited URLs, ranked recommendations, and platform fingerprint. Pure CPU regex, no LLM call.

Downloads

177


Parse any AI response (ChatGPT, Claude, Gemini, Perplexity, Google AI Overview) for brand mentions, cited URLs, ranked recommendations, and platform fingerprint. Pure CPU regex. No LLM call. Zero network.

A focused TypeScript library that takes an LLM-generated text response and returns a structured extraction — what brands the model named, what URLs it cited, what it recommended in what order, and which platform the response likely came from. Useful for AI search visibility tracking, brand monitoring research, citation analysis, and answer-engine optimization (AEO) workflows.

The same parsing pipeline runs the free Citare LLM Quote Extractor web tool. This package is the open-source extraction core.

Install

npm install llm-quote-extractor
# or
pnpm add llm-quote-extractor
# or
yarn add llm-quote-extractor

Quick start

import { extractFromLlmText } from "llm-quote-extractor";

const pasted = `
I'd be happy to help! For project management with AI workflows, here are my top picks:

1. **Linear** — fast, keyboard-driven, beloved by engineers (https://linear.app)
2. **Notion** — flexible knowledge base + project tracker (https://notion.so)
3. **Asana** — broad PM tool with strong integrations (https://asana.com)

Each has its strengths depending on team size and workflow style.
`;

const result = extractFromLlmText(pasted);

console.log(result.detectedPlatform);    // → "claude" (matches the "I'd be happy to help!" signature)
console.log(result.brandsNamed[0]);      // → { brand: "Linear", count: 1, ... }
console.log(result.citedUrls.length);    // → 3
console.log(result.rankedRecommendations);
// → [{ position: 1, text: "Linear — fast, keyboard-driven..." }, ...]

What it returns

type LlmQuoteExtractionResult = {
  // Platform fingerprint based on signature phrases
  detectedPlatform: "chatgpt" | "claude" | "gemini" | "perplexity" | "aio" | "unknown";

  // Brand candidates — Title-Case tokens, ranked by mention count
  brandsNamed: Array<{
    brand: string;
    count: number;
    firstSeenIndex: number;
    context: string;       // ~140-char snippet around first mention
  }>;
  topBrand: string | null; // shorthand for brandsNamed[0]?.brand

  // URLs the model cited
  citedUrls: Array<{
    url: string;
    domain: string;
    contextSnippet: string;
  }>;

  // Numbered list items if the model produced a ranking
  rankedRecommendations: Array<{
    position: number;       // the number the LLM gave (1, 2, 3...)
    text: string;
  }>;

  // Stats
  inputWordCount: number;
  sentenceCount: number;
  brandDiversity: number;   // count of distinct brand candidates
};
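
As a rough illustration of consuming this shape, the sketch below turns a result into a one-line report. `summarizeExtraction` and `ExtractionSummaryInput` are hypothetical names that are not part of the package, and the sample object is hand-written rather than real library output.

```typescript
// Local copy of the documented result shape, trimmed to the fields used here.
type BrandMention = { brand: string; count: number; firstSeenIndex: number; context: string };
type RankedItem = { position: number; text: string };
type ExtractionSummaryInput = {
  detectedPlatform: string;
  brandsNamed: BrandMention[];
  topBrand: string | null;
  rankedRecommendations: RankedItem[];
};

// Hypothetical helper (not exported by llm-quote-extractor): builds a one-line report.
function summarizeExtraction(result: ExtractionSummaryInput): string {
  const top = result.topBrand ?? "none";
  const ranked = result.rankedRecommendations
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.position - b.position)
    .map((item) => `${item.position}. ${item.text}`)
    .join("; ");
  return `platform=${result.detectedPlatform} top=${top} brands=${result.brandsNamed.length} ranked=[${ranked}]`;
}

// Hand-written sample data, shaped like the type above.
const sampleResult: ExtractionSummaryInput = {
  detectedPlatform: "claude",
  brandsNamed: [
    { brand: "Linear", count: 1, firstSeenIndex: 0, context: "Linear is fast" },
    { brand: "Notion", count: 1, firstSeenIndex: 20, context: "Notion is flexible" },
  ],
  topBrand: "Linear",
  rankedRecommendations: [
    { position: 2, text: "Notion" },
    { position: 1, text: "Linear" },
  ],
};

console.log(summarizeExtraction(sampleResult));
```

In real use, the object passed in would be the value returned by `extractFromLlmText`, which carries these fields plus the others listed above.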

What it does NOT do

  • No LLM call. This is pure regex + heuristics. If you want LLM-powered disambiguation (resolving "Apple the company" vs "apple the fruit"), pair this with a separate LLM step downstream.
  • No sentiment scoring. Returns mentions; doesn't classify them as positive / negative.
  • No full brand normalization. Case variants such as "GitHub" and "Github" are grouped case-insensitively as one brand, but a brand and its domain ("Linear" and "linear.app") are not yet merged.
  • No network requests. Doesn't fetch URLs to verify them, doesn't enrich domains.

These intentional scoping choices keep the library fast (<10ms for typical responses) and deterministic.
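
Since a brand and its domain are not merged by the library, a caller could do a naive merge downstream. The sketch below is one possible approach under stated assumptions: `mergeBrandsWithDomains` and `domainLabel` are hypothetical helpers, not package exports, and the match is a simple comparison between the lowercased brand and the first label of the cited domain.

```typescript
type BrandCount = { brand: string; count: number };
type CitedUrl = { url: string; domain: string };
type MergedBrand = { brand: string; count: number; domains: string[] };

// Hypothetical helper: "www.linear.app" -> "linear" (first label after an optional "www.").
function domainLabel(domain: string): string {
  const host = domain.toLowerCase().replace(/^www\./, "");
  const firstDot = host.indexOf(".");
  return firstDot === -1 ? host : host.slice(0, firstDot);
}

// Hypothetical helper: attaches cited domains whose first label equals the brand name.
function mergeBrandsWithDomains(brands: BrandCount[], urls: CitedUrl[]): MergedBrand[] {
  return brands.map((b) => {
    // Compare on lowercase letters and digits only, so "Linear" becomes "linear".
    const key = b.brand.toLowerCase().replace(/[^a-z0-9]/g, "");
    const domains: string[] = [];
    for (const u of urls) {
      if (domainLabel(u.domain) === key && domains.indexOf(u.domain) === -1) {
        domains.push(u.domain);
      }
    }
    return { brand: b.brand, count: b.count, domains };
  });
}

const merged = mergeBrandsWithDomains(
  [
    { brand: "Linear", count: 2 },
    { brand: "Asana", count: 1 },
  ],
  [
    { url: "https://linear.app", domain: "linear.app" },
    { url: "https://www.notion.so", domain: "www.notion.so" },
  ]
);
console.log(merged);
```

This deliberately misses cases such as subdomains ("docs.example.com") or brands whose domain differs from their name, so treat it as a starting point rather than a complete normalizer.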

Platform fingerprint heuristics

The detector looks for signature phrases that are characteristic of each platform's response style:

| Platform | Signature signals |
|---|---|
| chatgpt | "Certainly!", "Here's...", structured markdown with **bold headers** |
| claude | "I'd be happy to help", "Let me", measured first-person tone |
| gemini | "Here are some", strong Google-style enumeration |
| perplexity | [1] [2] numbered footnote citations |
| aio | "Generative AI is experimental", concise paragraph form |
| unknown | Returned when no signature matches |

The fingerprint is heuristic — false positives are possible (especially on short responses).
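
To make the idea concrete, here is a minimal sketch of a signature-phrase detector built only from the phrases in the table above. It is not the library's actual implementation: `guessPlatform`, the `SIGNATURES` list, the specific regexes, and the rule order are all assumptions, and the tone-based signals (such as "measured first-person tone") are left out.

```typescript
type Platform = "chatgpt" | "claude" | "gemini" | "perplexity" | "aio" | "unknown";

// Assumed rule order: more distinctive signals are checked first.
const SIGNATURES: Array<{ platform: Exclude<Platform, "unknown">; pattern: RegExp }> = [
  { platform: "aio", pattern: /generative ai is experimental/i },
  { platform: "perplexity", pattern: /\[\d+\]/ }, // numbered footnote citations like [1]
  { platform: "claude", pattern: /i['’]d be happy to help|\blet me\b/i },
  { platform: "gemini", pattern: /\bhere are some\b/i },
  { platform: "chatgpt", pattern: /\bcertainly!|\bhere['’]s\b/i },
];

// Hypothetical detector: returns the first platform whose signature matches.
function guessPlatform(text: string): Platform {
  for (const rule of SIGNATURES) {
    if (rule.pattern.test(text)) {
      return rule.platform;
    }
  }
  return "unknown";
}

console.log(guessPlatform("I'd be happy to help! Here are my top picks."));
console.log(guessPlatform("Linear is popular [1], and Notion is flexible [2]."));
console.log(guessPlatform("Plain text with no signature."));
```

A first-match rule list like this shows why short responses are prone to false positives: a single common phrase such as "Let me" is enough to decide the label.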

Brand candidate extraction

The parser extracts Title-Case tokens from the answer body and filters them against a stopword list of ~80 common non-brand words (Tuesday, January, Today, etc.). It's deliberately permissive — false positives are easier to filter downstream than false negatives are to recover.
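
The approach described above can be sketched roughly as follows. This is an illustration, not the package's source: `extractBrandCandidates`, the token regex, and the short `STOPWORDS` set (a stand-in for the real list of about 80 words) are assumptions.

```typescript
// Tiny stand-in for the library's stopword list (the real list is not reproduced here).
const STOPWORDS = new Set(["today", "tuesday", "january", "the", "this", "each", "for"]);

type Candidate = { brand: string; count: number; firstSeenIndex: number };

// Hypothetical extractor: capitalized tokens, stopword-filtered, grouped case-insensitively.
function extractBrandCandidates(text: string): Candidate[] {
  const byKey = new Map<string, Candidate>();
  // A capital letter followed by at least one letter or digit, e.g. "Linear" or "GitHub".
  const tokenPattern = /\b[A-Z][A-Za-z0-9]+\b/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(text)) !== null) {
    const token = match[0];
    const key = token.toLowerCase();
    if (STOPWORDS.has(key)) {
      continue;
    }
    const existing = byKey.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      // The first spelling seen becomes the display name for the group.
      byKey.set(key, { brand: token, count: 1, firstSeenIndex: match.index });
    }
  }
  const candidates: Candidate[] = [];
  byKey.forEach((candidate) => candidates.push(candidate));
  // Rank by mention count, breaking ties by earliest appearance.
  return candidates.sort((a, b) => b.count - a.count || a.firstSeenIndex - b.firstSeenIndex);
}

const candidates = extractBrandCandidates(
  "Today I compared GitHub and Linear. Github has more integrations; Linear is faster. Linear wins."
);
console.log(candidates.map((c) => `${c.brand}:${c.count}`).join(", "));
```

Being permissive here means ordinary sentence-initial words that are missing from the stopword set will also surface as candidates, which is the kind of false positive a later filtering step is meant to remove.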

For production brand attribution at scale, you'll want a downstream LLM disambiguation step that takes this candidate list and confirms which candidates are real brand names and which are incidental Title-Case usage. That's the trade-off of being LLM-free: speed and zero cost, at the expense of precision.

Where it came from

This library is the open-source extraction core of Citare, an AI search intelligence platform. The same parsing logic runs the free public tool at citare.ai/tools/llm-quote-extractor and the parsing layer inside Citare's Brand Radar (5-engine weekly brand visibility measurement).

If you find this useful and want richer measurement — disambiguated brand attribution, cross-platform 50-cell weekly dispatches, persona-anchored measurement — the Citare free tier covers one project with weekly dispatches at no cost.

Contributing

Issues and PRs welcome at github.com/ravirdp/llm-quote-extractor. The library is intentionally small and focused — major feature additions should ship as separate packages that build on this one.

License

MIT. See LICENSE.