@hameddk/jetbrains-usage-collector

v0.1.0

Published

23 days ago

Best-effort parser for JetBrains IDE local log files (Junie, AI Assistant). Heuristic — token data depends on what the IDE chose to log. Will be replaced by an API-based collector in v1.0.0 once JetBrains Central Console exposes a public usage API.


hameddk

jetbrains intellij junie ai-assistant usage log-parser

@hameddk/jetbrains-usage-collector

Best-effort parser for JetBrains IDE local log files (IntelliJ IDEA, WebStorm, PyCharm, RustRover, Junie, AI Assistant). Scans logs on the local machine and produces per-day token-usage rows.

⚠️ Read this before depending on this package
This is a log-scraping module. JetBrains does not currently expose a public usage API. The data this collector produces depends entirely on what the IDE chose to write to its local log files at the time it was running. Specifically:
Log line formats can change without notice between JetBrains releases. A version bump can silently break this parser.
Token counts are not always logged. Some completions emit them, others don't. The numbers you see are a lower bound, not a total.
Cost is always null. Local logs do not contain cost.
Identity is the OS username by default (override via identity).
Coverage is per-machine. This collector only sees logs on the machine it runs on.
JetBrains announced in February 2026 that the JetBrains Central Console will expose an analytics API. When that lands, this collector will publish v1.0.0 and switch to API-based collection. The row shape and identity semantics will change, so the version bump is intentionally semver-major. See CHANGELOG.md → "Planned for v1.0.0".

Why ship this anyway?

Until the API arrives, log scraping is the only programmatic option for a multi-IDE engineering team. Sparse-but-real data is more useful than no data, as long as the limitations are loud and visible (which is the purpose of this README and the warnings in every meta.warnings array).

Install

npm install @hameddk/jetbrains-usage-collector

Quick start

import { runCollector } from '@hameddk/jetbrains-usage-collector';

const result = await runCollector({
  from: '2026-04-01',
  to:   '2026-04-30',
  // identity: 'alice'   // override OS username if you want
  // logRoots: ['/custom/path']  // override platform defaults
});

if (!result.ok) {
  console.error(`[${result.errorType}] ${result.error}`);
  process.exit(1);
}

for (const w of result.meta.warnings) console.warn('[warning]', w);

for (const row of result.rows) {
  console.log(`${row.date}  ${row.identity}  ${row.tool}  ${row.tokens_input}+${row.tokens_output}`);
}

API

runCollector({
  from: 'YYYY-MM-DD',         // required — UTC inclusive
  to:   'YYYY-MM-DD',         // required — UTC inclusive
  identity?: string,          // override OS username
  logRoots?: string[],        // override platform-default log directories
})

Default `logRoots`

| Platform | Directories scanned |
|---|---|
| macOS | ~/Library/Logs/JetBrains, ~/Library/Application Support/JetBrains |
| Windows | %LOCALAPPDATA%\JetBrains |
| Linux | ~/.cache/JetBrains, ~/.config/JetBrains |
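
To make the platform defaults concrete, here is a minimal sketch of how they could be resolved. It is not the package's internal code: `resolveDefaultLogRoots` and its parameter names are hypothetical, and treating every non-macOS, non-Windows platform as Linux-like is an assumption.

```javascript
// Hypothetical helper that mirrors the documented defaults.
// It is a pure function so it can be exercised without touching the real filesystem.
function resolveDefaultLogRoots({ platform, home, localAppData }) {
  switch (platform) {
    case 'darwin':
      return [
        `${home}/Library/Logs/JetBrains`,
        `${home}/Library/Application Support/JetBrains`,
      ];
    case 'win32':
      // %LOCALAPPDATA% can be unset in unusual environments; return no roots in that case.
      return localAppData ? [`${localAppData}\\JetBrains`] : [];
    default:
      // Assumption: any other platform uses the Linux locations.
      return [`${home}/.cache/JetBrains`, `${home}/.config/JetBrains`];
  }
}
```

In a real script the arguments would typically come from `process.platform`, `os.homedir()`, and `process.env.LOCALAPPDATA`. The resulting array can be extended and passed as `logRoots` if you also keep logs in a non-default location.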

Success result

{
  ok: true,
  rows: Array<{
    date: 'YYYY-MM-DD',
    identity: string,                 // OS username or override
    identityType: 'os_username',
    tool: string,                     // comma-joined model names found in the day's lines
    tokens_input: number,
    tokens_output: number,
    cost_usd: null,                   // always null — logs lack cost
    session_minutes: 0,
    raw: { log_snippets: string[], files: string[] }
  }>,
  meta: {
    via: 'log_scrape',
    files_scanned: number,
    warnings: string[],               // always includes a fragility note
    roots: string[],
  }
}

Error result

{
  ok: false,
  error: string,
  errorType: 'config',                // log scraping has no auth/network/rate_limit failures
}

The only error mode is config (bad date format, missing arguments). Filesystem problems (missing directories, permission denied, oversized files) never surface as errors: the collector returns whatever data it could reach. If nothing was found, it returns empty rows plus a warning.
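
Because filesystem problems do not produce an error result, an empty `rows` array can mean either "no AI usage was logged" or "the logs could not be read". A caller may want to treat that case separately. The sketch below is a hypothetical caller-side helper (`summarizeResult` is not exported by the package) that relies only on the result shapes documented above.

```javascript
// Hypothetical helper: collapses a runCollector result into a small summary.
function summarizeResult(result) {
  if (!result.ok) {
    // 'config' is the only documented errorType.
    return { status: 'error', message: `[${result.errorType}] ${result.error}` };
  }

  // Sum token counts across all per-day rows.
  const totals = result.rows.reduce(
    (acc, row) => ({
      input: acc.input + row.tokens_input,
      output: acc.output + row.tokens_output,
    }),
    { input: 0, output: 0 },
  );

  return {
    // 'empty' flags the ambiguous case: nothing logged, or nothing readable.
    status: result.rows.length === 0 ? 'empty' : 'ok',
    totals,
    filesScanned: result.meta.files_scanned,
    warnings: result.meta.warnings,
  };
}
```

A status of `'empty'` combined with `filesScanned === 0` suggests the log roots were missing or unreadable, which is worth checking before reporting zero usage.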

Heuristics in `parseLine`

The parser treats a line as AI-related if it contains any of these signals:

Junie
JetBrains AI
AICompletion
inline completion
both ai and token substrings

If a line matches, the parser extracts token counts using these patterns:

input tokens: 1500 / output tokens: 400
prompt tokens: 1500 / completion tokens: 400
tokens_in: 1500 / tokens_out: 400
1500/400 tokens (pair fallback)

If neither input nor output tokens are detected, the line is skipped. Model name extraction looks for model:, gpt:, or claude: followed by an identifier, and falls back to jetbrains-ai when none is found.
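
The heuristics above can be sketched as a small function. This is an illustrative re-implementation, not the package's `parseLine`: the exact regexes, the case-insensitive matching, the `parseLineSketch` name, and the return shape are all assumptions.

```javascript
// Illustrative only: regexes and return shape are assumptions, not the package's code.
const SIGNALS = [/junie/i, /jetbrains ai/i, /aicompletion/i, /inline completion/i];
const INPUT_PATTERNS = [/input tokens:\s*(\d+)/i, /prompt tokens:\s*(\d+)/i, /tokens_in:\s*(\d+)/i];
const OUTPUT_PATTERNS = [/output tokens:\s*(\d+)/i, /completion tokens:\s*(\d+)/i, /tokens_out:\s*(\d+)/i];
const PAIR_PATTERN = /(\d+)\s*\/\s*(\d+)\s+tokens/i; // e.g. "1500/400 tokens"
const MODEL_PATTERN = /\b(?:model|gpt|claude):\s*([\w.-]+)/i;

// Returns the first captured number among the patterns, or null if none match.
function firstNumber(line, patterns) {
  for (const pattern of patterns) {
    const match = pattern.exec(line);
    if (match) return Number(match[1]);
  }
  return null;
}

// A line qualifies if it has a named signal, or both "ai" and "token" as substrings.
function looksAiRelated(line) {
  const lower = line.toLowerCase();
  return SIGNALS.some((re) => re.test(line)) || (lower.includes('ai') && lower.includes('token'));
}

function parseLineSketch(line) {
  if (!looksAiRelated(line)) return null;

  let input = firstNumber(line, INPUT_PATTERNS);
  let output = firstNumber(line, OUTPUT_PATTERNS);

  // Pair fallback: only tried when no labelled count was found.
  if (input === null && output === null) {
    const pair = PAIR_PATTERN.exec(line);
    if (!pair) return null; // no token counts at all: skip the line
    input = Number(pair[1]);
    output = Number(pair[2]);
  }

  const model = MODEL_PATTERN.exec(line);
  return {
    tokens_input: input ?? 0,
    tokens_output: output ?? 0,
    model: model ? model[1] : 'jetbrains-ai',
  };
}
```

Note that a plain substring check for "ai" also matches unrelated words such as "failed" or "main", so in a sketch like this the requirement that a token count be present is what keeps most unrelated lines out.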

Errors

import { JetBrainsUsageError, JetBrainsUsageConfigError } from '@hameddk/jetbrains-usage-collector';

Testing

The logRoots option lets you point at a temp directory containing fixture log files. See test/integration.test.js for examples.

What this library does not do

Doesn't write to a database — return value is rows; persist them yourself.
Doesn't compute cost — all returned cost_usd values are null.
Doesn't aggregate across machines — only sees logs where it runs.
Doesn't ship a JetBrains API client — wait for v1.0.0.
Doesn't gracefully handle a JetBrains log format change — it will simply return fewer or zero rows. There is no contract from JetBrains guaranteeing this format is stable.
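
Since persistence is left to the caller, one simple option is newline-delimited JSON. The helper below is a hypothetical sketch (`rowsToNdjson` is not part of the package); dropping the `raw` field by default is a design choice made here to keep log snippets out of stored data.

```javascript
// Hypothetical helper: serializes result.rows as newline-delimited JSON.
function rowsToNdjson(rows, { includeRaw = false } = {}) {
  return rows
    .map((row) => {
      if (includeRaw) return JSON.stringify(row);
      // Omit the raw log snippets and file paths unless explicitly requested.
      const { raw, ...rest } = row;
      return JSON.stringify(rest);
    })
    .join('\n');
}
```

The output can be appended to a file with Node's `fs.appendFileSync('usage.ndjson', rowsToNdjson(result.rows) + '\n')`, or handed to whatever storage layer you already use.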
