@keystrokehq/bigquery
v0.0.8
Published
Google BigQuery integration for Keystroke workflows. Provides typed actions for the BigQuery REST v2 surface, the BigQuery Connection / Reservation / Data Transfer / Analytics Hub / BI Engine APIs, and polling + push triggers for jobs, transfer runs, and
Readme
@keystrokehq/bigquery
Google BigQuery integration for Keystroke workflows. Provides typed actions for the BigQuery REST v2 surface, the BigQuery Connection / Reservation / Data Transfer / Analytics Hub / BI Engine APIs, and polling + push triggers for jobs, transfer runs, and table changes.
See PLAN.md for the full design spec and IMPLEMENTATION_NOTES.md for the
v1 decisions (auth model, gRPC scope, trigger primitive assignment).
Installation
Workspace-only; depends on @keystrokehq/core and
@keystrokehq/integration-authoring. The package is registered in
@keystrokehq/official-integration-catalog and discovered automatically by the
Keystroke runtime.
Credentials
BigQuery ships with a kind: 'manual' connection. Users paste a Google
service-account JSON (primary path) or, alternatively, OAuth2
client-id/secret/refresh-token (user-flow path) or Workload Identity
Federation configuration (WIF path). The package signs its own access tokens
internally — there is no Keystroke-managed OAuth flow for BigQuery.
Vault keys used by this integration:
| Key | Used for |
| ---------------------------------- | -------------------------------------------------------- |
| GCP_PROJECT_ID | Default billing / quota project. Required. |
| GCP_QUOTA_PROJECT_ID | Overrides x-goog-user-project header. Optional. |
| GOOGLE_SERVICE_ACCOUNT_JSON | Full SA JSON blob. Primary auth path. |
| GCP_OAUTH_CLIENT_ID + _SECRET + _REFRESH_TOKEN | OAuth user-flow path. Optional. |
| GCP_WIF_AUDIENCE + GCP_WIF_SUBJECT_TOKEN_SUPPLIER + GCP_WIF_SERVICE_ACCOUNT_EMAIL | Workload Identity Federation path. Optional. |
| GCP_SCOPES | Override the default scope bundle. Optional. |
At least one of the three auth paths must be populated. The client picks service-account first, falls back to OAuth refresh, falls back to WIF.
Quickstart
import { bigquery } from '@keystrokehq/bigquery/connection';
import { runQuery } from '@keystrokehq/bigquery/queries';
import { pollJobCompletion } from '@keystrokehq/bigquery/triggers';
const action = runQuery;
const trigger = pollJobCompletion({
projectId: 'my-project',
jobId: 'my-job-id',
location: 'US',
});Operations
Actions are exported from domain subpaths — never from the package root:
| Subpath | Contents |
| ---------------------------------------------------- | ----------------------------------------------------------------- |
| @keystrokehq/bigquery/projects | listProjects, getProjectServiceAccount |
| @keystrokehq/bigquery/datasets | listDatasets, getDataset, createDataset, updateDataset, patchDataset, deleteDataset, undeleteDataset, listDatasetAccess |
| @keystrokehq/bigquery/tables | listTables, getTable, createTable, updateTable, patchTable, deleteTable |
| @keystrokehq/bigquery/tabledata | listTableData (streaming inserts now canonical via streamRows in streaming-inserts) |
| @keystrokehq/bigquery/jobs | listJobs, getJob, cancelJob, deleteJob |
| @keystrokehq/bigquery/queries | runQuery, insertQueryJob, getQueryResults, dryRunQuery, cancelQueryJob, runQueryToCompletion |
| @keystrokehq/bigquery/load-jobs | insertLoadJob, uploadLoadJob, insertLoadJobResumable |
| @keystrokehq/bigquery/copy-jobs | insertCopyJob |
| @keystrokehq/bigquery/extract-jobs | insertExtractJob |
| @keystrokehq/bigquery/models | listModels, getModel, patchModel, deleteModel |
| @keystrokehq/bigquery/routines | listRoutines, getRoutine, insertRoutine, updateRoutine, deleteRoutine |
| @keystrokehq/bigquery/row-access-policies| listRowAccessPolicies, getRowAccessPolicy, createRowAccessPolicy, updateRowAccessPolicy, deleteRowAccessPolicy, batchDeleteRowAccessPolicies, getRowAccessPolicyIamPolicy, setRowAccessPolicyIamPolicy, testRowAccessPolicyIamPermissions |
| @keystrokehq/bigquery/iam | getDatasetIamPolicy, setDatasetIamPolicy, testDatasetIamPermissions, getTableIamPolicy, setTableIamPolicy, testTableIamPermissions |
| @keystrokehq/bigquery/streaming-inserts | streamRows (REST, backed by tabledata.insertAll); createWriteStream, getWriteStream, appendRows, finalizeWriteStream, flushRows, commitWriteStreams are typed and throw { kind: 'unavailable' } in v1 — gRPC transport lands in v1.1. |
| @keystrokehq/bigquery/storage-read | createReadSession, readRows, splitReadStream — typed stubs that throw { kind: 'unavailable' } in v1. |
| @keystrokehq/bigquery/connections | listConnections, getConnection, createConnection, updateConnection, deleteConnection, getConnectionIamPolicy, setConnectionIamPolicy, testConnectionIamPermissions |
| @keystrokehq/bigquery/reservations | 19 actions covering reservations, capacity commitments, assignments, and BI reservation. |
| @keystrokehq/bigquery/transfers | 13 actions covering the Data Transfer Service (data sources, configs, runs, logs). |
| @keystrokehq/bigquery/analytics-hub | 17 actions covering data exchanges, listings, subscriptions, and their IAM. |
| @keystrokehq/bigquery/bi-engine | getBiEngineReservation, updateBiEngineReservation |
Queries
import {
runQuery,
dryRunQuery,
runQueryToCompletion,
} from '@keystrokehq/bigquery/queries';
type _QueryActions = typeof runQuery | typeof dryRunQuery | typeof runQueryToCompletion;Every query action applies a maximumBytesBilled guardrail by default
(10 GB). Pass allowUnboundedCost: true to override with no limit.
Datasets & tables
import { listDatasets, createDataset } from '@keystrokehq/bigquery/datasets';
import { createTable, listTables } from '@keystrokehq/bigquery/tables';
type _DatasetActions = typeof listDatasets | typeof createDataset;
type _TableActions = typeof listTables | typeof createTable;Streaming inserts
import { streamRows } from '@keystrokehq/bigquery/streaming-inserts';
type _StreamingAction = typeof streamRows;streamRows is backed by tabledata.insertAll (REST) in v1. It provides
best-effort dedup via insertId for ~1 minute. The Storage Write gRPC path
(createWriteStream, appendRows, finalizeWriteStream, ...) is typed but
not runnable in v1 and throws { kind: 'unavailable' } at runtime.
Data Transfer Service
import {
listDataSources,
createTransferConfig,
startManualTransferRuns,
} from '@keystrokehq/bigquery/transfers';
type _TransferActions =
| typeof listDataSources
| typeof createTransferConfig
| typeof startManualTransferRuns;Analytics Hub
import {
listDataExchanges,
subscribeListing,
} from '@keystrokehq/bigquery/analytics-hub';
type _AnalyticsHubActions = typeof listDataExchanges | typeof subscribeListing;Triggers
import {
pollNewRows,
pollJobCompletion,
pollAnyJobCompleted,
pollTransferRunStatus,
pollScheduledQueryRun,
pollTableRowCountChange,
onPubSubJobNotification,
type StorageWriteCdcRowHandler,
} from '@keystrokehq/bigquery/triggers';
pollNewRows({
datasetId: 'analytics',
tableId: 'events',
watermarkColumn: 'ts',
});
pollJobCompletion({
jobId: 'my-job',
location: 'US',
});
pollAnyJobCompleted({});
pollTransferRunStatus({
location: 'us',
transferConfigId: 'abc',
});
pollScheduledQueryRun({
location: 'us',
transferConfigId: 'abc',
});
pollTableRowCountChange({
datasetId: 'analytics',
tableId: 'events',
});
onPubSubJobNotification({
idempotencyKey: (payload) => payload.message.messageId,
});onStorageWriteCdc (PLAN.md § 7 T7) is a gRPC-streaming trigger that
requires a primitive not present in @keystrokehq/core as of v1. In
its place, this package exports the StorageWriteCdcRowHandler type so
callers who run their own CDC subscriber can type their handler:
import type { StorageWriteCdcRowHandler } from '@keystrokehq/bigquery/triggers';
const handler: StorageWriteCdcRowHandler = {
handle: async (row) => {
// row.streamName, row.offset, row.row, row.changeType
// Forward into your workflow runtime however you like.
},
};Caveats
- Locations. BigQuery rejects cross-location copy / extract jobs. The
copy-jobsandextract-jobsactions fail fast on mismatched locations. - gRPC not shipped. Storage Write (
streaming-inserts.tsactions exceptstreamRows) and Storage Read (storage-read.ts) throw{ kind: 'unavailable' }in v1. - Quota project. Service accounts without billing attached must send
x-goog-user-project; setGCP_QUOTA_PROJECT_IDor the client falls back toGCP_PROJECT_ID. - Query cost guardrails. All query actions default
maximumBytesBilledto 10 GB. PassallowUnboundedCost: trueto remove the cap.
Support-surface imports
For lower-level use-cases:
import { bigquery, type BigQueryCredentials } from '@keystrokehq/bigquery/connection';
import { createBigQueryClient } from '@keystrokehq/bigquery/client';
import { datasetSchema, jobSchema } from '@keystrokehq/bigquery/schemas';
import { verifyGooglePubSubJwt } from '@keystrokehq/bigquery/verification';_official and _runtime subpaths are Keystroke-internal and
not part of the public contract.
