@quantum-scale/scaleset

v0.1.1

Published

5 months ago

Client library for GitHub Actions Runner Scale Set APIs


lindluni

github-actions runner scale-set autoscaling self-hosted-runner

GitHub Actions Runner Scale Set Client for Node.js (Public Preview)

Status: Public Preview – While the API is stable, interfaces and examples in this repository may change.

This package (@actions/scaleset) provides a standalone Node.js client for the GitHub Actions Runner Scale Set APIs. It is a JavaScript reimplementation of the Go github.com/actions/scaleset package, allowing platform teams, integrators, and infrastructure providers to build their own custom autoscaling solutions for GitHub Actions runners using Node.js.

You do not need to adopt the full controller (and Kubernetes) to take advantage of scale sets. This package contains all the primitives you need: create/update/delete scale sets, generate just-in-time (JIT) runner configs, and manage message sessions.

What is a Scale Set?

A runner scale set is a group of self-hosted runners that autoscales based on workflow demand. Here's how it works:

1. Registration: You create a scale set with a name, which also serves as the label workflows use to target it (e.g., runs-on: my-scale-set). Multiple labels can be assigned per scale set. Like regular self-hosted runners, scale sets can be registered at the repository, organization, or enterprise level.
2. Polling: Your scale set client continuously polls the API, reporting its maximum capacity (how many runners it can produce).
3. Job matching: GitHub matches jobs to your scale set based on the label and runner group policies, just like regular self-hosted runners.
4. Scaling signal: The API responds with how many runners your scale set needs online (statistics.totalAssignedJobs).
5. Runner provisioning: Your client creates or maintains enough runners to meet demand. Runners can be created just-in-time as jobs arrive, or pre-provisioned ahead of demand to reduce latency.
6. Job assignment: GitHub assigns a pending job to any idle runner in the scale set.

Runners in a scale set are ephemeral by default: each runner executes one job and is then removed. This ensures a clean environment for every job.

High-Level Flow

1. Create a Client with either a GitHub App credential (recommended) or a PAT.
2. Create a Runner Scale Set with a name.
3. Start a message session and poll for scaling events. The Listener class handles this for you.
4. When the API indicates runners are needed:
   - Call generateJitRunnerConfig to get a JIT config for a new runner.
   - Start your runner (process, container, VM, etc.) with the JIT config.
5. Idle runners are assigned jobs automatically by GitHub.

You can also pre-provision runners before jobs arrive to reduce startup latency. See examples/docker-scaleset for a complete example that supports both minRunners (pre-provisioned) and just-in-time scaling.

Autoscaling

Use statistics.totalAssignedJobs from each message response to determine how many runners your scale set needs online. This value represents the total number of jobs assigned to your scale set, including both jobs waiting for a runner and jobs already running (totalAssignedJobs >= totalRunningJobs).

Do not count individual job messages (JobAssigned, JobStarted, JobCompleted) in the response body to determine scaling:

- Responses contain at most 50 messages. Large backlogs will be truncated.
- The statistics field is always current and reflects the true state of your scale set.

When polling for messages, include your scale set's maximum capacity via the maxCapacity parameter (sent as the X-ScaleSetMaxCapacity header). This allows the backend to assign jobs accurately and avoid creating backlogs your scale set cannot fulfill.

Here's a simplified polling loop:

let lastMessageId = 0;

while (true) {
  const msg = await sessionClient.getMessage(lastMessageId, maxCapacity);

  if (msg === null) {
    // No messages available (202 response), poll again
    continue;
  }

  lastMessageId = msg.messageId;

  // Scale based on statistics, not message counts
  const desiredRunners = msg.statistics.totalAssignedJobs;
  await scaleToDesired(desiredRunners);

  // Acknowledge the message
  await sessionClient.deleteMessage(msg.messageId);
}

The Listener class provides a ready-to-use implementation of this pattern, handling session management, polling, and acknowledgment. See Listener.
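The scaleToDesired helper in the loop above is left to the caller. As a minimal sketch of one possible policy, the function below turns statistics.totalAssignedJobs into a target runner count. The name planScaling and the minRunners/maxRunners parameters are illustrative assumptions, not part of this package's API.

```javascript
// Hypothetical helper, not part of @actions/scaleset.
// Policy (an assumption): keep minRunners idle runners on top of current demand,
// but never exceed maxRunners.
function planScaling({ totalAssignedJobs, currentRunners, minRunners = 0, maxRunners }) {
  if (maxRunners < minRunners) {
    throw new RangeError('maxRunners must be >= minRunners');
  }
  const target = Math.min(maxRunners, minRunners + totalAssignedJobs);
  const delta = target - currentRunners;
  return {
    target,
    toStart: Math.max(0, delta), // runners to provision now
    toStop: Math.max(0, -delta), // surplus runners; stop only idle ones
  };
}

// Example: 3 assigned jobs, 1 runner online, keep 2 warm, cap at 10.
const plan = planScaling({ totalAssignedJobs: 3, currentRunners: 1, minRunners: 2, maxRunners: 10 });
console.log(plan);
```

Keeping the calculation in a pure function makes the policy easy to unit test separately from the code that actually starts and stops runners.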

Job lifecycle messages

Individual job messages (JobStarted, JobCompleted, etc.) are useful for purposes beyond scaling. For example, actions-runner-controller uses JobStarted to mark runner pods as busy, preventing premature cleanup during scale-down. These messages can also be used for metrics or logging.

See src/types.js for payload definitions.

How the Message API Works

Long Polling

getMessage uses long polling:

- If messages are available, they are returned immediately.
- Otherwise, the request blocks for up to ~50 seconds.
- If no messages arrive, a 202 response is returned (null in the Node.js client).

Poll again immediately after handling each response.

Message Acknowledgment

Call deleteMessage after processing a message. This acts as an acknowledgment:

- Unacknowledged messages are redelivered on the next poll.
- This prevents message loss if your client crashes mid-processing.

Message ID Tracking

Pass the ID of the last processed message to getMessage. Omitting this (or passing 0) returns the first available message, potentially causing reprocessing.
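The acknowledgment and ID-tracking rules can be combined into one step function. This is a minimal sketch that relies only on the getMessage and deleteMessage signatures listed in the API reference. The names pollOnce and handleMessage are hypothetical.

```javascript
// Hypothetical helper: performs one poll / process / acknowledge cycle.
// Returns the message ID to pass to the next getMessage call.
async function pollOnce(sessionClient, lastMessageId, maxCapacity, handleMessage) {
  const msg = await sessionClient.getMessage(lastMessageId, maxCapacity);
  if (msg === null) {
    // Long poll ended with no messages: keep the same ID and poll again.
    return lastMessageId;
  }
  // If handleMessage throws, deleteMessage is never reached, so the message
  // stays unacknowledged and can be redelivered on a later poll.
  await handleMessage(msg);
  await sessionClient.deleteMessage(msg.messageId);
  return msg.messageId;
}
```

A caller would keep the returned value in a variable and feed it back in on each iteration, so the ID only advances once a message has been processed and acknowledged.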

Job Reassignment

Jobs may appear multiple times as JobAssigned followed by JobCompleted (with result: "canceled"). This occurs when a job is assigned to your scale set but not acquired by a runner in time—GitHub cancels the assignment and requeues the job. This can happen up to 3 times with incremental delays.

Each attempt generates new messages, but they represent the same workflow job. This is why statistics.totalAssignedJobs is the correct scaling metric: it reflects the current state, not the message history.
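If you use job messages for metrics, repeated attempts should not be reported as separate jobs. The sketch below counts assignment attempts per job. It is a hypothetical helper: the caller supplies a stable job key taken from the payload (see src/types.js for the available fields), because no particular field name is assumed here.

```javascript
// Hypothetical metrics helper, not part of @actions/scaleset.
class JobAttemptTracker {
  constructor() {
    this.attempts = new Map(); // jobKey -> number of JobAssigned messages seen
    this.finished = new Set(); // jobKeys that completed with a non-canceled result
  }

  // type: 'JobAssigned' | 'JobStarted' | 'JobCompleted'
  // jobKey: stable identifier for the workflow job, chosen by the caller
  // result: only meaningful for 'JobCompleted'
  record(type, jobKey, result) {
    if (type === 'JobAssigned') {
      this.attempts.set(jobKey, (this.attempts.get(jobKey) ?? 0) + 1);
    } else if (type === 'JobCompleted' && result !== 'canceled') {
      // A 'canceled' completion may only mean the assignment was requeued,
      // so it is not treated as final in this sketch.
      this.finished.add(jobKey);
    }
  }

  get distinctJobs() {
    return this.attempts.size;
  }

  attemptsFor(jobKey) {
    return this.attempts.get(jobKey) ?? 0;
  }
}
```

This tracker is for reporting only. Scaling decisions should still come from statistics.totalAssignedJobs.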

Getting Started

npm install @actions/scaleset

Import (ESM only):

import { Client, GitHubAppAuth, Listener } from '@actions/scaleset';

Authentication

Two options:

- GitHub App (preferred): Stronger scoping and rotation. Provide clientId, installationId, and privateKey.
- PAT (personal access token): Simpler to set up, but more broadly scoped.

The client automatically exchanges credentials for a registration token + admin token behind the scenes and refreshes them before expiry.

You can find more details on required permissions in the GitHub Docs.

GitHub Enterprise Server (GHES) is supported out of the box—just use your GHES URL when creating the client.

GitHub App (recommended)

import { Client, GitHubAppAuth } from '@actions/scaleset';

const client = await Client.createWithGitHubApp({
  gitHubConfigURL: 'https://github.com/my-org/my-repo',
  gitHubAppAuth: new GitHubAppAuth({
    clientId: 'Iv1.abc123',
    installationId: 12345678,
    privateKey: '-----BEGIN RSA PRIVATE KEY-----\n...',
  }),
  systemInfo: { system: 'my-scaler', version: '1.0.0' },
});
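Credentials are usually injected through the environment rather than written into source. A minimal sketch, assuming three hypothetical environment variable names; only the returned field names (clientId, installationId, privateKey) come from the example above.

```javascript
// Hypothetical helper: reads GitHub App credentials from an env-like object.
// The variable names below are assumptions, not something this package defines.
function loadGitHubAppCredentials(env) {
  const required = ['GITHUB_APP_CLIENT_ID', 'GITHUB_APP_INSTALLATION_ID', 'GITHUB_APP_PRIVATE_KEY'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const installationId = Number(env.GITHUB_APP_INSTALLATION_ID);
  if (!Number.isInteger(installationId) || installationId <= 0) {
    throw new Error('GITHUB_APP_INSTALLATION_ID must be a positive integer');
  }
  return {
    clientId: env.GITHUB_APP_CLIENT_ID,
    installationId,
    // Keys stored on a single line often contain literal "\n"; restore real newlines.
    privateKey: env.GITHUB_APP_PRIVATE_KEY.replace(/\\n/g, '\n'),
  };
}
```

The result could then be passed as `new GitHubAppAuth(loadGitHubAppCredentials(process.env))`. Failing early with a clear message is easier to diagnose than an authentication error at the first API call.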

Personal Access Token

import { Client } from '@actions/scaleset';

const client = await Client.createWithPersonalAccessToken({
  gitHubConfigURL: 'https://github.com/my-org/my-repo',
  personalAccessToken: 'ghp_...',
});

Client Options

Both factory methods accept optional functional options:

import { Client, withLogger, withRetryMax, withRetryWaitMax, withoutTLSVerify } from '@actions/scaleset';

const client = await Client.createWithPersonalAccessToken(
  { gitHubConfigURL: 'https://github.com/my-org/my-repo', personalAccessToken: 'ghp_...' },
  withLogger(console),       // enable logging
  withRetryMax(5),           // max retry attempts (default: 4)
  withRetryWaitMax(60000),   // max retry wait in ms (default: 30000)
  withoutTLSVerify(),        // disable TLS verification (not recommended for production)
);

API Reference

Scale Set Operations

| Method | Description | Returns |
|--------|-------------|---------|
| client.getRunnerScaleSet(runnerGroupId, name) | Get a scale set by runner group ID and name | object \| null |
| client.getRunnerScaleSetById(id) | Get a scale set by its ID | object |
| client.createRunnerScaleSet(scaleSet) | Create a new scale set | object |
| client.updateRunnerScaleSet(id, updates) | Update an existing scale set | object |
| client.deleteRunnerScaleSet(id) | Delete a scale set | void |

Runner Group Operations

| Method | Description | Returns |
|--------|-------------|---------|
| client.getRunnerGroupByName(name) | Get a runner group by name | object |

Runner Operations

| Method | Description | Returns |
|--------|-------------|---------|
| client.generateJitRunnerConfig(settings, scaleSetId) | Generate a JIT runner config | { encodedJITConfig, runner } |
| client.getRunner(id) | Get a runner by ID | object |
| client.getRunnerByName(name) | Get a runner by name | object \| null |
| client.removeRunner(id) | Remove a runner | void |

Message Session

| Method | Description | Returns |
|--------|-------------|---------|
| client.createMessageSessionClient(scaleSetId, owner) | Create a message session client | MessageSessionClient |
| sessionClient.getMessage(lastMessageId, maxCapacity) | Poll for the next message | object \| null |
| sessionClient.deleteMessage(messageId) | Acknowledge and delete a message | void |
| sessionClient.close() | Close the message session | void |
| sessionClient.session() | Get current session info | object |

Example: Full Scale Set Lifecycle

import { Client, GitHubAppAuth, DEFAULT_RUNNER_GROUP } from '@actions/scaleset';

const client = await Client.createWithGitHubApp({
  gitHubConfigURL: 'https://github.com/my-org/my-repo',
  gitHubAppAuth: new GitHubAppAuth({
    clientId: 'Iv1.abc123',
    installationId: 12345678,
    privateKey: process.env.GITHUB_APP_PRIVATE_KEY,
  }),
});

// Get the default runner group
const runnerGroup = await client.getRunnerGroupByName(DEFAULT_RUNNER_GROUP);

// Create a scale set (or fetch existing)
let scaleSet = await client.getRunnerScaleSet(runnerGroup.id, 'my-scale-set');
if (!scaleSet) {
  scaleSet = await client.createRunnerScaleSet({
    name: 'my-scale-set',
    runnerGroupId: runnerGroup.id,
    labels: [{ name: 'my-scale-set' }],
    runnerSetting: { disableUpdate: true },
  });
}

// Generate a JIT runner config
const { encodedJITConfig, runner } = await client.generateJitRunnerConfig(
  { name: `runner-${Date.now()}` },
  scaleSet.id,
);

// Start your runner with the JIT config...
// The encodedJITConfig is passed to the runner binary via --jitconfig flag
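How the runner is started depends on your platform. As a minimal sketch for a local process, the helper below builds the arguments for launching a runner with the --jitconfig flag mentioned above. It assumes the runner package is unpacked in runnerDir and started through its run.sh script (Linux/macOS). The name buildRunnerCommand is hypothetical.

```javascript
// Hypothetical helper: describes how to launch one runner process.
// It does not start anything itself, which keeps it easy to test.
function buildRunnerCommand(runnerDir, encodedJITConfig) {
  if (!runnerDir) throw new Error('runnerDir is required');
  if (!encodedJITConfig) throw new Error('encodedJITConfig is required');
  return {
    command: './run.sh',
    // Passed as an argument array, so the config is never interpolated into a shell string.
    args: ['--jitconfig', encodedJITConfig],
    options: { cwd: runnerDir, stdio: 'inherit' },
  };
}

// Usage (not executed here):
//   const { spawn } = await import('node:child_process');
//   const { command, args, options } = buildRunnerCommand('/opt/actions-runner', encodedJITConfig);
//   const child = spawn(command, args, options);
```

Because the JIT config is a secret until consumed, avoid logging the returned args.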

Listener

The Listener class provides a high-level polling loop that handles session management, message polling, acknowledgment, and dispatching to your scaler.

import { Client, GitHubAppAuth, Listener } from '@actions/scaleset';

const client = await Client.createWithGitHubApp({ /* ... */ });
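// scaleSetId is the ID of an existing scale set (for example, scaleSet.id from the lifecycle example)
const scaleSet = await client.getRunnerScaleSetById(scaleSetId);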
const sessionClient = await client.createMessageSessionClient(scaleSet.id, 'my-host');

const listener = new Listener(sessionClient, {
  scaleSetId: scaleSet.id,
  maxRunners: 10,
  logger: console,
});

// Implement the Scaler interface
let currentRunnerCount = 0; // update this as your provisioner starts and stops runners
const scaler = {
  async handleDesiredRunnerCount(count) {
    console.log(`Scaling to ${count} runners`);
    // Provision or remove runners to match the desired count
    return currentRunnerCount;
  },
  async handleJobStarted(jobInfo) {
    console.log(`Job started: ${jobInfo.jobDisplayName} on runner ${jobInfo.runnerName}`);
    // Mark runner as busy
  },
  async handleJobCompleted(jobInfo) {
    console.log(`Job completed: ${jobInfo.jobDisplayName} with result ${jobInfo.result}`);
    // Clean up the runner
  },
};

// Run until stopped
const ac = new AbortController();
process.on('SIGINT', () => ac.abort());

try {
  await listener.run(scaler, { signal: ac.signal });
} catch (err) {
  if (err.message !== 'aborted') {
    console.error('Listener error:', err);
  }
}

await sessionClient.close();

Scaler Interface

Your scaler must implement three methods:

| Method | Description |
|--------|-------------|
| handleDesiredRunnerCount(count) | Called with the desired number of runners. Provision or remove runners to match. Returns the current count. |
| handleJobStarted(jobInfo) | Called when a job starts on a runner. Use for marking runners busy. |
| handleJobCompleted(jobInfo) | Called when a job completes. Use for runner cleanup. |
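As a minimal sketch of the three methods working together, the factory below tracks runners in memory and only stops idle ones when scaling down. The startRunner and stopRunner callbacks are hypothetical and stand in for your own provisioning code. The sketch also assumes jobInfo.runnerName is present on completed-job messages, which the examples above only show for started jobs.

```javascript
// Hypothetical factory, not part of @actions/scaleset.
// startRunner() must resolve to the new runner's name; stopRunner(name) removes it.
function createInMemoryScaler({ startRunner, stopRunner }) {
  const runners = new Map(); // runnerName -> { busy: boolean }

  return {
    async handleDesiredRunnerCount(count) {
      // Scale up until the desired count is reached.
      while (runners.size < count) {
        const name = await startRunner();
        runners.set(name, { busy: false });
      }
      // Scale down by stopping idle runners only; busy runners are left alone.
      for (const [name, state] of [...runners]) {
        if (runners.size <= count) break;
        if (!state.busy) {
          await stopRunner(name);
          runners.delete(name);
        }
      }
      return runners.size;
    },
    async handleJobStarted(jobInfo) {
      const state = runners.get(jobInfo.runnerName);
      if (state) state.busy = true;
    },
    async handleJobCompleted(jobInfo) {
      // Runners are ephemeral: after its job, drop the runner and clean it up.
      if (runners.delete(jobInfo.runnerName)) {
        await stopRunner(jobInfo.runnerName);
      }
    },
  };
}
```

The returned object can be passed to listener.run in place of a hand-written scaler. A production version would also need to handle failures in startRunner and runners that exit on their own.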

Error Handling

The package provides a structured error hierarchy:

ScalesetError (base class)
├── RunnerNotFoundError      – Runner does not exist
├── RunnerExistsError         – Runner already exists (name conflict)
├── JobStillRunningError      – Cannot remove runner with active job
└── MessageQueueTokenExpiredError – Session token expired (handled internally)

ActionsExceptionError         – Wraps API exception responses

Errors are automatically created from API responses. Use instanceof for error handling:

import {
  RunnerNotFoundError,
  RunnerExistsError,
  JobStillRunningError,
} from '@actions/scaleset';

try {
  await client.removeRunner(runnerId);
} catch (err) {
  if (err instanceof RunnerNotFoundError) {
    console.log('Runner already removed');
  } else if (err instanceof JobStillRunningError) {
    console.log('Runner is busy, will retry later');
  } else {
    throw err;
  }
}
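The "will retry later" branch above can be turned into an explicit retry. This is a minimal sketch of a generic helper with a fixed delay. The name retryWhile and its options are hypothetical, and the helper is independent of this package.

```javascript
// Hypothetical helper: retries an async operation while shouldRetry(err) returns true.
// Gives up and rethrows after `attempts` tries or on any non-retryable error.
async function retryWhile(operation, shouldRetry, { attempts = 5, delayMs = 1000 } = {}) {
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= attempts || !shouldRetry(err)) throw err;
      // Fixed delay between attempts.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage with this package (not executed here):
//   await retryWhile(
//     () => client.removeRunner(runnerId),
//     (err) => err instanceof JobStillRunningError,
//     { attempts: 10, delayMs: 30000 },
//   );
```

Passing the retry condition as a predicate keeps other errors, such as RunnerNotFoundError, on the normal error path.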

Config Utilities

| Export | Description |
|--------|-------------|
| parseGitHubConfigFromURL(url) | Parse a GitHub URL into a config object with scope, org, repo, etc. |
| gitHubAPIURL(config, path) | Build a GitHub API URL (handles github.com vs GHES) |
| isHostedGitHubURL(url) | Check if a URL is hosted GitHub (not GHES) |
| GitHubScope | Enum: Unknown, Enterprise, Organization, Repository |
| InvalidGitHubConfigURLError | Thrown for malformed config URLs |

Security Notes

- Always prefer GitHub App credentials; rotate PATs if you must use them.
- Treat JIT configs as secrets until consumed.

Requirements

- Node.js 18 or later
- ESM only ("type": "module" in your package.json)

The only external dependency is jsonwebtoken. HTTP requests use the Node.js built-in fetch API.

License

This project is licensed under the terms of the MIT open source license. Please refer to LICENSE for the full terms.