@evolvconsulting/cc-sf
v0.2.5
Local proxy that lets Claude Code use Claude models hosted in Snowflake Cortex.
cc-sf
Local proxy that lets Claude Code use Claude models hosted in Snowflake Cortex, authed with your existing key‑pair from your Snowflake connections.toml.
Cortex exposes an Anthropic‑Messages‑API‑compatible endpoint. Claude Code already honors ANTHROPIC_BASE_URL. cc-sf fills three gaps Claude Code can't handle on its own: minting & rotating the Snowflake RS256 keypair JWT, sending the X-Snowflake-Authorization-Token-Type: KEYPAIR_JWT header, and remapping model IDs (claude-opus-4-7, dated suffixes) to what Cortex currently serves (claude-opus-4-6, claude-haiku-4-5, …).
Install
npm i -g @evolvconsulting/cc-sf
Requires Node ≥ 20 and the claude CLI on PATH.
Use
cc-sf # start bridge, drop into an interactive `claude` session
cc-sf --coco # same, but route traffic through Cortex Code's agent:run
# endpoint so usage bills as `cortex_code_cli`
cc-sf --override # same, but force opus/sonnet/haiku picks to latest
# live Cortex version for this session; revert on exit
cc-sf --bridge-only # run proxy alone, no claude (for debugging or manual curl)
cc-sf --list-models # print models available to your account, then exit
cc-sf --refresh-models # re-probe Cortex and refresh the model cache
cc-sf --jwt # mint a JWT and print to stdout
cc-sf --decode-jwt # inspect header + payload of a minted JWT
cc-sf --help
Picking a model
cc-sf does not have its own --model flag. Anything after -- is forwarded straight to the claude binary, so use claude's mechanisms — the --model flag at launch, the ANTHROPIC_MODEL env var, or /model mid-session.
cc-sf # interactive, claude's default
cc-sf -- --model claude-opus-4-7 # interactive, opus 4.7
cc-sf -- --model claude-4-sonnet # Cortex-only ID (older family,
# not in claude's built-in picker)
cc-sf -- -p "summarize README.md" # one-shot print mode (claude's -p)
cc-sf -- -p "ping" --model claude-haiku-4-5 # one-shot with explicit model
ANTHROPIC_MODEL=claude-opus-4-7 cc-sf # set the default for the session
cc-sf --override # any opus/sonnet/haiku pick
# auto-resolves to latest live
Inside an interactive session, /model opens claude's picker. The picker only lists Claude Code's built-in aliases (Opus / Sonnet / Haiku); for Cortex-only IDs like claude-4-sonnet or claude-3-7-sonnet, use --model or ANTHROPIC_MODEL. Run cc-sf --list-models to see what your account can serve.
Passing other claude flags
Any claude flag works after --. cc-sf only consumes its own options; everything past -- is argv for the spawned claude process. Common examples:
cc-sf -- --dangerously-skip-permissions # skip permission prompts
cc-sf -- --continue # continue the most recent session
cc-sf -- --resume # show the resume picker
cc-sf -- --debug # claude debug logging
cc-sf -- --add-dir /path/to/extra/dir # extend the working dir set
cc-sf -- --chrome # any flag your claude build supports
cc-sf -- --dangerously-skip-permissions --model claude-opus-4-7 -p "do the thing"
The -- only matters when you're passing flags claude understands but cc-sf does not, and even then cc-sf parses its own flags first and stops at the first unknown token, so cc-sf -p "x" works too. When in doubt, use --; it's never wrong.
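The splitting rule described above can be sketched as a small pure function. This is a hypothetical illustration, not cc-sf's actual parser: the flag names are taken from the usage list, while `splitArgs`, the two flag sets, and the treatment of `--config` / `--connection` as value-taking flags are assumptions made for the example.

```typescript
// Hypothetical sketch of "parse own flags, stop at the first unknown token".
// Flag names come from the usage list; the parsing rules are assumed.
const BOOLEAN_FLAGS = new Set([
  "--coco", "--override", "--bridge-only", "--list-models",
  "--refresh-models", "--jwt", "--decode-jwt", "--help",
]);
const VALUE_FLAGS = new Set(["--config", "--connection"]);

interface SplitArgs {
  own: string[];       // options consumed by the wrapper itself
  forwarded: string[]; // argv handed to the spawned claude process
}

function splitArgs(argv: string[]): SplitArgs {
  const own: string[] = [];
  let i = 0;
  while (i < argv.length) {
    const arg = argv[i];
    if (arg === "--") {
      // Explicit separator: everything after it belongs to claude.
      return { own, forwarded: argv.slice(i + 1) };
    }
    if (BOOLEAN_FLAGS.has(arg)) {
      own.push(arg);
      i += 1;
    } else if (VALUE_FLAGS.has(arg) && i + 1 < argv.length) {
      own.push(arg, argv[i + 1]);
      i += 2;
    } else {
      // First unknown token: stop parsing and forward the rest untouched.
      break;
    }
  }
  return { own, forwarded: argv.slice(i) };
}
```

Under these assumptions, `splitArgs(["--override", "--", "--model", "claude-opus-4-7"])` keeps `--override` for the wrapper and forwards the two remaining tokens, while `splitArgs(["-p", "x"])` forwards everything because `-p` is not a wrapper flag.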
To combine cc-sf flags with claude flags:
cc-sf --override -- --dangerously-skip-permissions --model claude-opus-4-7
Prereqs
A Snowflake connections.toml. cc-sf discovers the file using the same precedence Snowflake CLI / the Python connector use — --config > $SF_CONNECTIONS_FILE (cc-sf override) > $SNOWFLAKE_HOME/connections.toml > ~/.snowflake/connections.toml > OS default (Windows %USERPROFILE%\AppData\Local\snowflake\, macOS ~/Library/Application Support/snowflake/, Linux $XDG_CONFIG_HOME/snowflake/). default_connection_name is also read from a sibling config.toml when it's not set inline. cc-sf supports two authenticator values:
SNOWFLAKE_JWT (default) — cc-sf mints and auto-rotates a keypair JWT:
default_connection_name = "ennovate"
[ennovate]
account = "XLB91549"
user = "[email protected]"
authenticator = "SNOWFLAKE_JWT"
private_key_file = "/Users/you/.snowflake/rsa_key.p8"
role = "SYSADMIN"
(private_key_path is also accepted as an alias; the two field names are interchangeable.)
Plus the matching public key uploaded to your Snowflake user (ALTER USER … SET RSA_PUBLIC_KEY='…'). SNOWFLAKE.CORTEX_USER is granted to PUBLIC by default, so most roles inherit it.
OAUTH — bring an external-OAuth bearer token yourself:
[ennovate-oauth]
account = "XLB91549"
authenticator = "OAUTH"
token_file_path = "/Users/you/.snowflake/oauth-access-token"
role = "SYSADMIN" # optional
Your IdP tooling is responsible for keeping the token file current. cc-sf reads the file lazily, caches it in memory, and re-reads it on HTTP 401, so after your refresh script writes a new token to the same path, the bridge picks it up automatically on the next request. cc-sf does not itself speak the OAuth refresh protocol.
PAT (PROGRAMMATIC_ACCESS_TOKEN) is not supported yet; it requires a per-user network policy (an account-admin operation) and has not been tested end-to-end against Cortex.
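For the SNOWFLAKE_JWT authenticator, the following is a minimal sketch of what minting a key-pair JWT can look like with Node's built-in crypto module. It follows Snowflake's documented claim layout (issuer `ACCOUNT.USER.SHA256:<fingerprint>`, subject `ACCOUNT.USER`), but `mintJwt`, `publicKeyFingerprint`, the one-hour default lifetime, and the `SVC_USER` placeholder are assumptions for illustration, not cc-sf's implementation.

```typescript
import {
  createHash, createPublicKey, createSign, createVerify,
  generateKeyPairSync, KeyObject,
} from "node:crypto";

// Fingerprint = base64(SHA-256(DER-encoded SPKI public key)), prefixed "SHA256:".
function publicKeyFingerprint(privateKey: KeyObject): string {
  const der = createPublicKey(privateKey).export({ type: "spki", format: "der" });
  return "SHA256:" + createHash("sha256").update(der).digest("base64");
}

// Hypothetical helper: builds and signs an RS256 JWT for key-pair auth.
function mintJwt(
  privateKey: KeyObject,
  account: string,
  user: string,
  nowSec: number,
  lifetimeSec = 3600, // assumed lifetime; pick whatever your setup requires
): string {
  const subject = `${account.toUpperCase()}.${user.toUpperCase()}`;
  const encode = (obj: object): string =>
    Buffer.from(JSON.stringify(obj), "utf8").toString("base64url");
  const header = { alg: "RS256", typ: "JWT" };
  const payload = {
    iss: `${subject}.${publicKeyFingerprint(privateKey)}`,
    sub: subject,
    iat: nowSec,
    exp: nowSec + lifetimeSec,
  };
  const signingInput = `${encode(header)}.${encode(payload)}`;
  const signature = createSign("RSA-SHA256").update(signingInput).sign(privateKey);
  return `${signingInput}.${signature.toString("base64url")}`;
}

// Usage with a throwaway key; a real setup would load private_key_file instead.
const { privateKey, publicKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
const token = mintJwt(privateKey, "xlb91549", "svc_user", 1_700_000_000);
const [head, body, sig] = token.split(".");
const claims = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
const verified = createVerify("RSA-SHA256")
  .update(`${head}.${body}`)
  .verify(publicKey, Buffer.from(sig, "base64url"));
```

The fingerprint in `iss` is what Snowflake compares against the key registered with ALTER USER … SET RSA_PUBLIC_KEY, so a mismatch there is the first thing to check when a token is rejected.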
Flags & environment
| Flag | Env | Default | Purpose |
| --- | --- | --- | --- |
| --config <path> | SF_CONNECTIONS_FILE | Snowflake discovery ladder (see Prereqs) | TOML file to read |
| — | SNOWFLAKE_HOME | — | Snowflake-standard override for the config directory; read as $SNOWFLAKE_HOME/connections.toml |
| --connection <name> | SF_CONNECTION | default_connection_name in connections.toml, then sibling config.toml | Which block to use |
| — | SF_BRIDGE_PORT | 8787 | Local bind port |
| — | SNOWFLAKE_PRIVATE_KEY_PASSPHRASE | — | Passphrase for encrypted .p8 keys |
| — | ANTHROPIC_MODEL | claude-sonnet-4-6 | Model Claude Code targets |
| — | ANTHROPIC_SMALL_FAST_MODEL | claude-haiku-4-5 | Small/fast model |
Nothing is hardcoded: account, user, key path, role, and JWT fingerprint all derive from the selected connection at runtime. If the file or a named connection is missing, cc-sf prints a clear error (including available connection names when the named one isn't found).
Platform support
macOS, Linux, and Windows. TOML discovery follows the Snowflake CLI / Python connector ladder on every OS, including %USERPROFILE%\AppData\Local\snowflake\connections.toml on Windows when ~/.snowflake/ isn't present. cc-sf spawns the claude shim with shell: true on Windows so that npm's claude.cmd resolves correctly.
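The discovery ladder can be pictured as an ordered candidate list. The sketch below is an illustration under stated assumptions: `candidatePaths`, `resolveConfig`, and the "first existing file wins" rule are hypothetical, `home` stands in for %USERPROFILE% on Windows, and the Linux entry is only added when XDG_CONFIG_HOME is set because the text does not say what happens otherwise.

```typescript
type Platform = "win32" | "darwin" | "linux";

interface DiscoveryInput {
  configFlag?: string;                     // value of --config, if given
  env: Record<string, string | undefined>; // e.g. process.env
  home: string;                            // e.g. os.homedir()
  platform: Platform;                      // e.g. process.platform
}

// Returns candidate connections.toml paths, highest precedence first.
function candidatePaths(input: DiscoveryInput): string[] {
  const { configFlag, env, home, platform } = input;
  const sep = platform === "win32" ? "\\" : "/";
  const join = (...parts: string[]): string => parts.join(sep);
  const file = "connections.toml";
  const out: string[] = [];

  if (configFlag) out.push(configFlag);
  if (env.SF_CONNECTIONS_FILE) out.push(env.SF_CONNECTIONS_FILE);
  if (env.SNOWFLAKE_HOME) out.push(join(env.SNOWFLAKE_HOME, file));
  out.push(join(home, ".snowflake", file));

  // OS default location comes last.
  if (platform === "win32") {
    out.push(join(home, "AppData", "Local", "snowflake", file));
  } else if (platform === "darwin") {
    out.push(join(home, "Library", "Application Support", "snowflake", file));
  } else if (env.XDG_CONFIG_HOME) {
    out.push(join(env.XDG_CONFIG_HOME, "snowflake", file));
  }
  return out;
}

// Assumed selection rule: the first candidate that exists on disk wins.
function resolveConfig(
  input: DiscoveryInput,
  exists: (path: string) => boolean,
): string | undefined {
  return candidatePaths(input).find(exists);
}
```

Passing `exists` in as a parameter (for example `fs.existsSync`) keeps the ordering logic testable without touching the filesystem.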
How it works
cc-sf loads your connection via smol-toml and builds an authenticator for it. For SNOWFLAKE_JWT it mints an RS256 JWT (cached with 55‑min TTL, refreshed 5 min before expiry). For OAUTH it reads token_file_path on demand and caches the bytes. Then it stands up a Hono HTTP server on 127.0.0.1:${SF_BRIDGE_PORT:-8787}.
- POST /v1/messages and /v1/messages/count_tokens are forwarded to https://<account>.snowflakecomputing.com/api/v2/cortex/v1/messages… with Authorization: Bearer <token>. JWT connections also attach X-Snowflake-Authorization-Token-Type: KEYPAIR_JWT; OAuth connections omit that header so Cortex auto-detects the token type.
- On startup the bridge probes a candidate list (claude-opus-4-7, -4-6, -4-5, -sonnet-*, -haiku-4-5, -3-7-sonnet, …) with max_tokens:1 requests in parallel. Results are cached at ~/.cache/cc-sf/models.json with a 24 h TTL; force with --refresh-models. (Cortex has no listing endpoint, so probing is the only reliable source of truth.)
- model in the request body is remapped against the live set: passthrough if available, else strip [1m] and dated -YYYYMMDD suffix, else downshift minor/major versions until a match (e.g. claude-haiku-4-5-20251001 → claude-haiku-4-5; claude-sonnet-4-7 → claude-sonnet-4-6).
- Response body (SSE or JSON) streams back untouched.
- On upstream 401 the authenticator invalidates its credential cache and the request retries once. For JWT this triggers a remint; for OAuth it re-reads token_file_path so a freshly-refreshed token written by your IdP tooling is picked up automatically.
Then cc-sf sets ANTHROPIC_BASE_URL=http://127.0.0.1:<port>, unsets ANTHROPIC_API_KEY, and execs claude "$@". SIGINT/SIGTERM forward to the child.
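The model remapping step above can be sketched as follows. This is one interpretation of the stated rules, not the bridge's actual code: `remapModel` and `VERSIONED_ID` are hypothetical names, and "downshift until a match" is read here as "pick the newest live model of the same family that is older than the requested version".

```typescript
// Matches IDs shaped like claude-<family>-<major>-<minor>.
const VERSIONED_ID = /^claude-(opus|sonnet|haiku)-(\d+)-(\d+)$/;

function remapModel(requested: string, live: readonly string[]): string {
  // 1. Passthrough when the exact ID is served.
  if (live.includes(requested)) return requested;

  // 2. Strip a trailing [1m] marker, then a dated -YYYYMMDD suffix.
  const base = requested.replace(/\[1m\]$/, "").replace(/-\d{8}$/, "");
  if (live.includes(base)) return base;

  // 3. Downshift within the family to the newest older version that is live.
  const want = VERSIONED_ID.exec(base);
  if (!want) return base; // unknown shape: hand it on and let upstream decide
  const wantMajor = Number(want[2]);
  const wantMinor = Number(want[3]);

  let best: { id: string; major: number; minor: number } | undefined;
  for (const id of live) {
    const m = VERSIONED_ID.exec(id);
    if (!m || m[1] !== want[1]) continue; // different family or shape
    const major = Number(m[2]);
    const minor = Number(m[3]);
    const older = major < wantMajor || (major === wantMajor && minor < wantMinor);
    const beatsBest =
      !best || major > best.major || (major === best.major && minor > best.minor);
    if (older && beatsBest) best = { id, major, minor };
  }
  return best ? best.id : base;
}
```

With a live set of `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-sonnet-4-5`, and `claude-haiku-4-5`, this reproduces both examples from the list: the dated Haiku ID loses its suffix, and `claude-sonnet-4-7` lands on `claude-sonnet-4-6`.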
Model availability
Availability is discovered at runtime per account; run cc-sf --list-models to see yours. (As of 2026‑04 a typical account sees claude-opus-4-7, claude-opus-4-6, claude-sonnet-4-6, claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5, claude-4-sonnet, claude-3-7-sonnet — but this drifts, which is why the bridge probes rather than hardcoding.)
--coco (Cortex Code mode)
By default cc-sf proxies to Snowflake's public Anthropic-compatible endpoint (/api/v2/cortex/v1/messages). With --coco, it instead routes traffic through the undocumented /api/v2/cortex/agent:run endpoint that Snowflake's own cortex CLI uses, so usage is attributed to Cortex Code (cortex_code_cli) for billing.
cc-sf --coco # interactive, coco mode
cc-sf --coco -- --model claude-opus-4-7 # pick a model, coco mode
What this changes under the hood:
- Login via /session/v1/login-request with CLIENT_ENVIRONMENT.APPLICATION = cortex_code_cli and a matching QUERY_TAG, yielding a short-lived session token (re-login when it expires).
- Upstream calls use Authorization: Snowflake Token="…", User-Agent: cortex_code_cli/1.0.0, body fields origin_application: coding_agent and experimental.CodingAgent.OriginApplication: snova.
- Anthropic Messages request/SSE is translated in both directions (Anthropic ↔ Cortex agent:run). Claude Code sees a normal SSE stream; Cortex sees a native-looking agent:run call.
Model discovery (separate from the default /v1/messages catalog):
- On startup, cc-sf --coco probes the full candidate list against agent:run and caches the result at ~/.cache/cc-sf/coco-models.json (24 h TTL, per-account). --refresh-models refreshes both caches.
- agent:run's catalog is a subset of what /v1/messages serves; as of Apr 2026 it lacks -4-7 and Haiku. Any Claude Code request for a model outside the coco catalog is downshifted within the same family first (e.g. claude-opus-4-7 → claude-opus-4-6). If no in-family match exists (e.g. claude-haiku-4-5, claude-3-7-sonnet), the model is rewritten to "auto" so Cortex picks a served option instead of erroring.
- If a request still gets rejected at runtime (e.g. catalog drift between probe and request), the bridge parses the Available models: ... list from the error, writes it to the cache, and auto-retries once with a freshly-remapped model. The user sees one coherent response.
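The coco-specific fallback differs from the default remap mainly in its last step: when nothing in the same family is served, the model becomes "auto". A minimal sketch of that rule, with `resolveForCoco` and `COCO_VERSIONED` as hypothetical names and the version comparison as an assumption:

```typescript
// Matches IDs shaped like claude-<family>-<major>-<minor>.
const COCO_VERSIONED = /^claude-(opus|sonnet|haiku)-(\d+)-(\d+)$/;

function resolveForCoco(requested: string, cocoCatalog: readonly string[]): string {
  if (cocoCatalog.includes(requested)) return requested;

  const want = COCO_VERSIONED.exec(requested);
  if (!want) return "auto"; // e.g. claude-3-7-sonnet: no family/version to downshift

  // Collapse major/minor into one sortable number (assumes minor < 1000).
  const rank = (m: RegExpExecArray): number => Number(m[2]) * 1000 + Number(m[3]);

  const olderInFamily = cocoCatalog
    .map((id) => COCO_VERSIONED.exec(id))
    .filter(
      (m): m is RegExpExecArray =>
        m !== null && m[1] === want[1] && rank(m) < rank(want),
    )
    .sort((a, b) => rank(b) - rank(a)); // newest first

  // No in-family match: let Cortex choose a served model.
  return olderInFamily.length > 0 ? olderInFamily[0][0] : "auto";
}
```

Given a catalog of `claude-opus-4-6`, `claude-opus-4-5`, and `claude-sonnet-4-6`, a request for `claude-opus-4-7` resolves to `claude-opus-4-6`, while `claude-haiku-4-5` and `claude-3-7-sonnet` both resolve to `"auto"`.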
Requirements & caveats:
- SNOWFLAKE_JWT connections only (OAUTH is rejected because session login needs the keypair).
- agent:run is undocumented; Snowflake can change the shape without notice.
- Client-side tools are passed through as client_mcp tool specs (generic pass-through), not mapped to Cortex's built-in bash/read/grep/… catalog. Tool use works; per-tool billing attribution may differ from the real Cortex Code client.
- cache_control, thinking, and other Anthropic-specific fields are not guaranteed to round-trip.
- /v1/messages/count_tokens still uses the public endpoint (no agent:run equivalent).
- Set CC_SF_DEBUG_SSE=1 to tee raw upstream SSE to stderr, which is useful when agent:run's shape drifts.
--override
Claude Code's /model picker shows its canonical entries (Opus/Sonnet/Haiku). If you'd rather have every pick resolve to the latest live version for that family on Cortex (e.g. so selecting any "Sonnet" uses claude-sonnet-4-6 even if Claude Code lists it as 4.5), pass --override:
cc-sf --override
On startup cc-sf reads ~/.claude/settings.json (or $CLAUDE_CONFIG_DIR/settings.json), merges its computed modelOverrides (old‑version IDs → latest of that family), and writes a sidecar backup at settings.json.cc-sf.bak. On exit (normal, SIGINT, SIGTERM), the original file is restored byte‑for‑byte. If cc-sf crashes before cleanup, the next invocation detects the sidecar and restores automatically. Existing keys outside modelOverrides are preserved; any pre‑existing entries in modelOverrides that don't collide are left in place.
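The merge semantics described above can be illustrated with a small pure function. This is a sketch under assumptions: `mergeOverrides` and the `Settings` type are hypothetical, the `theme` key is a made-up stand-in for unrelated settings, and colliding entries are assumed to be replaced by the computed value for the session. Backup and restore of the file itself are not shown.

```typescript
interface Settings {
  modelOverrides?: Record<string, string>;
  [key: string]: unknown; // any other settings keys are carried through as-is
}

// Returns a new settings object; the input is left untouched so the
// original can still be written back verbatim on exit.
function mergeOverrides(
  original: Settings,
  computed: Record<string, string>,
): Settings {
  return {
    ...original,
    modelOverrides: {
      ...(original.modelOverrides ?? {}), // non-colliding user entries survive
      ...computed,                        // computed entries win on collision (assumed)
    },
  };
}

// Usage: one unrelated key, one untouched override, one colliding override.
const before: Settings = {
  theme: "dark",
  modelOverrides: {
    "claude-opus-4-1": "my-custom-opus",
    "claude-sonnet-4-5": "claude-sonnet-4-5",
  },
};
const merged = mergeOverrides(before, { "claude-sonnet-4-5": "claude-sonnet-4-6" });
```

Returning a fresh object instead of mutating the input mirrors the byte-for-byte restore guarantee: the session works from the merged copy, and the original content never needs to be reconstructed.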
Known gaps
Not yet exercised (may work, not verified): cache_control, thinking blocks, /v1/messages/count_tokens.
--coco mode specifically has a narrower feature surface than the public /v1/messages shim — see the caveats under --coco.
License
MIT
