@dotcom-reliability-kit/client-metrics-web

v1.0.0

Published

20 days ago

A client for sending operational metrics from the web to the Client Metrics Server

A lightweight client for sending operational metrics events from the browser to the Customer Products Client Metrics Server. This module is part of FT.com Reliability Kit.

[!WARNING] This Reliability Kit module is intended for use in client-side JavaScript. Importing it into a Node.js app will result in errors being thrown.

[!IMPORTANT] Remember that this library is intended for sending operational metrics - metrics that help us understand whether your system is operating as expected. For analytics data you should still be using Spoor (or another solution common to your team).

Usage (systems)

Systems that want to send metrics from the client should import and construct a metrics client as part of their client-side JavaScript. If you're writing code that's shared across multiple systems, do not import and construct a metrics client. See the usage guide for shared libraries.

Install @dotcom-reliability-kit/client-metrics-web as a dependency:

npm install --save @dotcom-reliability-kit/client-metrics-web

Include in your client-side code:

import { MetricsClient } from '@dotcom-reliability-kit/client-metrics-web';
// or
const { MetricsClient } = require('@dotcom-reliability-kit/client-metrics-web');

`MetricsClient`

The MetricsClient sends events to the Client Metrics Server by making a POST request to the /api/v1/ingest endpoint.

Events are added to a batch. When the batch reaches its maximum size (20 events by default, see options.batchSize), the client sends the events to the server. If the batch does not reach that size, the client sends the queued events every 10 seconds. This retentionPeriod is configurable (see options.retentionPeriod). If no events have been added to the batch, nothing is sent.
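The batching behaviour described above can be sketched roughly as follows. This is not the library's implementation: `BatchSketch` and its `send` callback are hypothetical names used only for illustration, and the periodic timer is replaced by an explicit `flush()` call so that the snippet runs anywhere.

```javascript
// Illustrative sketch only: BatchSketch and `send` are hypothetical,
// not part of @dotcom-reliability-kit/client-metrics-web.
class BatchSketch {
    constructor({ batchSize = 20, send }) {
        this.batchSize = batchSize;
        this.send = send;
        this.events = [];
    }

    // Add an event and send straight away once the batch is full
    add(event) {
        this.events.push(event);
        if (this.events.length >= this.batchSize) {
            this.flush();
        }
    }

    // Send whatever has accumulated; an empty batch sends nothing
    flush() {
        if (this.events.length === 0) {
            return;
        }
        const batch = this.events;
        this.events = [];
        this.send(batch);
    }
}

// Collect "sent" batches in an array instead of POSTing them
const sentBatches = [];
const batch = new BatchSketch({
    batchSize: 3,
    send: (events) => sentBatches.push(events)
});

batch.flush(); // nothing queued, so nothing is sent
batch.add({ namespace: 'a.one' });
batch.add({ namespace: 'a.two' });
batch.add({ namespace: 'a.three' }); // batch is full, triggers a send
batch.add({ namespace: 'a.four' });
batch.flush(); // stands in for the timer firing after the retention period
```

In a browser, the final `flush()` would typically be driven by a timer, for example `setInterval(() => batch.flush(), retentionPeriod * 1000)`.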

The correct environment (production vs test) is automatically selected based on the browser hostname. If your hostname contains test, staging or local, the events will be sent to our test server. Otherwise, the metrics are sent to the production server.
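As a rough illustration of the hostname rule, here is a minimal sketch. `guessEnvironment` is a hypothetical helper, and the plain substring check is an assumption about how the hostname is matched; the library's actual logic may differ.

```javascript
// Hypothetical helper, not exported by the library. Assumes a simple
// substring match on the hostname.
function guessEnvironment(hostname) {
    const testMarkers = ['test', 'staging', 'local'];
    const looksLikeTest = testMarkers.some((marker) => hostname.includes(marker));
    return looksLikeTest ? 'test' : 'production';
}

// In a browser you might call: guessEnvironment(window.location.hostname)
```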

The class should only ever be constructed once or you'll end up sending duplicate metrics. You should construct the metrics client as early as possible in the loading of the page. For the required options, see configuration options.

const client = new MetricsClient({
    // Options go here
});

This will bind some global event handlers:

ft.clientMetric - records a custom metric based on details found in the event (see documentation below)

`client.recordEvent()`

Send a metrics event:

client.recordEvent('namespace.event', {
    // Any event details you want to send can go here as key/value pairs
});

The event namespace must consist of alphanumeric characters, underscores, and hyphens, optionally separated by periods. Beyond that, the event namespace is free-form for now. A later major version of the client may lock down the top-level namespace further.
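If you want to check namespaces in your own code before recording an event, a pattern along these lines matches the rule as described. The regular expression and `isValidNamespace` are assumptions written for this example, not the validation the client itself runs.

```javascript
// Assumed pattern: segments of letters, digits, underscores and hyphens,
// optionally separated by single periods.
const NAMESPACE_PATTERN = /^[a-z0-9_-]+(?:\.[a-z0-9_-]+)*$/i;

// Hypothetical helper, not part of the library
function isValidNamespace(namespace) {
    return typeof namespace === 'string' && NAMESPACE_PATTERN.test(namespace);
}
```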

`client.enable()`

Enable the client. This is called by default during instantiation but you may need to call this if the client is ever disabled.

client.enable();

`client.disable()`

Disable the client, preventing any metrics from being sent. Global event handlers are also unbound.

client.disable();

`client.isEnabled`

A boolean indicating whether the client is currently enabled.

`client.isAvailable`

A boolean indicating whether the client was correctly configured and set up. If this is false then it's not possible for events to be sent to the Client Metrics Server.

Event-based API

Passing around a single client instance may not be easy or preferable in your system, depending on complexity. For this we also bind a listener on window for the ft.clientMetric event.

This allows you to send metric events from anywhere in your system, even if you don't have access to the client. You need to emit a custom event, either on window:

window.dispatchEvent(
    new CustomEvent('ft.clientMetric', {
        detail: {
            namespace: 'namespace.event',
            // Any event details you want to send can go here as key/value pairs
        }
    })
);

or from another element on the page, as long as you bubble the event:

const element = document.getElementById('my-component');
element.dispatchEvent(
    new CustomEvent('ft.clientMetric', {
        bubbles: true,
        detail: {
            namespace: 'namespace.event',
            // Any event details you want to send can go here as key/value pairs
        }
    })
);

When sending metrics in this way, the detail object can have any number of properties. These properties will be extracted into the final metrics event and sent to the client metrics server. For example:

window.dispatchEvent(
    new CustomEvent('ft.clientMetric', {
        detail: { namespace: 'snack.consumed', fruit: 'banana' }
    })
);

// is equivalent to:

client.recordEvent('snack.consumed', { fruit: 'banana' });
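One way to picture the equivalence above is a small function that splits the `detail` object into the two arguments of `recordEvent`. `toRecordEventArgs` is a hypothetical name and this is only a sketch of the mapping, not the library's internal handler.

```javascript
// Hypothetical helper: separates the namespace from the remaining details
function toRecordEventArgs(detail) {
    const { namespace, ...eventDetails } = detail;
    return [namespace, eventDetails];
}

const args = toRecordEventArgs({ namespace: 'snack.consumed', fruit: 'banana' });
// args could then be spread into a call such as client.recordEvent(...args)
```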

Error handling

If something goes wrong with the configuration of the metrics client or the sending of an event, we don't throw an error - this could result in an infinite loop where we try to record the thrown error and that throws another error.

Instead of throwing an error, we log a warning to the console. If you're not seeing metrics in AWS then check the browser console for these warnings. It's also a good idea to check for them in local development before pushing any changes.
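The warn-instead-of-throw approach can be sketched like this. `recordSafely` and the message prefix are made up for illustration; the real warnings logged by the client will be worded differently.

```javascript
// Hypothetical wrapper: runs a callback and converts any thrown error
// into a console warning rather than letting it propagate.
function recordSafely(record, warn = console.warn) {
    try {
        record();
        return true;
    } catch (error) {
        warn(`Metrics warning: ${error.message}`);
        return false;
    }
}
```

Because nothing is thrown, a global error handler that records errors as metrics cannot be re-triggered by the metrics code itself.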

Configuration options

Config options can be passed into the MetricsClient constructor as an object with any of the keys below.

new MetricsClient({
    // Config options go here
});

`options.systemCode`

Required String. The Biz Ops system code you're sending metrics for.

new MetricsClient({ systemCode: 'my-system' });

The system code must be a combination of alphanumeric characters, optionally separated by hyphens.

[!IMPORTANT] If the systemCode is not set properly, the client will fail to be constructed and it will not be possible to send any events. Attempting to use recordEvent will fail and log the warning "Client not initialised properly, cannot record an event".

`options.systemVersion`

Optional String. The version number of the currently running system, which helps us to spot issues in new versions. This could be a version number or a git commit hash. Defaults to 0.0.0.

new MetricsClient({ systemVersion: '1.2.3' });

`options.environment`

Optional Enum. Can be production or test. This is used to define which server to send the metrics to. If not set, the client will use the hostname to make a best guess at where to send the metrics. If your hostname contains test, staging or local, the test server will be used. Otherwise, the metrics are sent to production. If this does not work for your use case, use the environment option to override that behaviour.

new MetricsClient({ environment: 'production' });

`options.batchSize`

Optional number. This is how many events can accumulate in the queue before the client sends them to the server. The minimum size is 20 events. If you try to set batchSize to a lower number, it will be ignored and the client will use 20 events.

`options.retentionPeriod`

Optional number. This is the number of seconds the client will wait before sending a batch of events. The minimum wait time is 10 seconds. If you try to set retentionPeriod to a lower number, it will be ignored and the client will use 10 seconds.

new MetricsClient({ retentionPeriod: 15 });

[!NOTE] If the batch of events reaches its maximum size, the events are sent before the end of the retentionPeriod.
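The minimums for batchSize and retentionPeriod behave like a lower bound on whatever you pass in. A minimal sketch of that rule, assuming a hypothetical `resolveBatchOptions` helper (the client's real option handling is not shown in this README):

```javascript
// Minimums taken from the option descriptions above
const MIN_BATCH_SIZE = 20;
const MIN_RETENTION_PERIOD = 10; // seconds

// Hypothetical helper: missing or too-low values fall back to the minimum
function resolveBatchOptions({ batchSize, retentionPeriod } = {}) {
    return {
        batchSize: Math.max(MIN_BATCH_SIZE, Number(batchSize) || MIN_BATCH_SIZE),
        retentionPeriod: Math.max(
            MIN_RETENTION_PERIOD,
            Number(retentionPeriod) || MIN_RETENTION_PERIOD
        )
    };
}

const resolved = resolveBatchOptions({ batchSize: 5, retentionPeriod: 15 });
// batchSize 5 is below the minimum, retentionPeriod 15 is kept as-is
```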

`options.queue`

Optional Queue. Events are batched into a queue before being sent. By default, the client will use an InMemoryQueue. You can provide your own queue implementation (as long as it extends the base Queue class).

import { InMemoryQueue, MetricsClient } from '@dotcom-reliability-kit/client-metrics-web';

const myQueue = new InMemoryQueue({ capacity: 11 });
new MetricsClient({ queue: myQueue });

Queue interface

The Queue class defines the contract that all queue implementations must follow.

import { Queue } from '@dotcom-reliability-kit/client-metrics-web';

new Queue({ capacity: 11 })

Options

capacity (number, optional): Maximum number of items the queue can hold. The default is 10,000.

[!NOTE] The default of 10,000 events comes from the following: we want a batch of events to be 10MB at most, and a heavy event is about 750 bytes, so 10,000 heavy events come to roughly 7.5MB, which stays under that limit.

Required methods

Subclasses must implement the following methods:

add(item): Add an item to the queue.
drop(count): Drop the oldest item(s) from the queue. count defaults to 1.
pull(count): Remove and return the first count items.
size (getter): Current number of items in the queue.
capacity (getter): Maximum queue capacity (implemented in base class).
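To make the contract concrete, here is a sketch of an array-backed queue with those methods. So that it runs on its own, it extends a stand-in `BaseQueueSketch` class rather than the package's Queue; in real code you would extend the exported Queue instead. Dropping the oldest item when the queue is full is an assumption for this example, not documented behaviour.

```javascript
// Stand-in for the package's Queue base class (hypothetical)
class BaseQueueSketch {
    #capacity;

    constructor({ capacity = 10000 } = {}) {
        this.#capacity = capacity;
    }

    get capacity() {
        return this.#capacity;
    }
}

// Hypothetical array-backed implementation of the required methods
class ArrayQueueSketch extends BaseQueueSketch {
    #items = [];

    add(item) {
        // Assumption: when full, make room by dropping the oldest item
        if (this.#items.length >= this.capacity) {
            this.drop();
        }
        this.#items.push(item);
    }

    drop(count = 1) {
        this.#items.splice(0, count);
    }

    pull(count) {
        // Removes and returns the first `count` items
        return this.#items.splice(0, count);
    }

    get size() {
        return this.#items.length;
    }
}

const queue = new ArrayQueueSketch({ capacity: 3 });
['a', 'b', 'c', 'd'].forEach((item) => queue.add(item));
const pulled = queue.pull(2);
```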

Usage (shared libraries)

If you want to record operational metrics from another library that's shared between systems, do not install this module as a dependency. Instead you should rely solely on the event-based API.

This ensures that:

We don't end up with duplicate global event handlers
The system installing your library does not have to inject the metrics client and manage the instance
We don't end up with systems and libraries using different incompatible versions of the metrics client (dependency hell)

We recommend that your library decides on a top-level namespace that all other events live under. E.g.

my-library.success
my-library.failure
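A shared library could wrap the event-based API in a small helper so that every event lands under its namespace. The names `LIBRARY_NAMESPACE`, `buildMetricDetail` and `emitMetric` are hypothetical, and the sketch only dispatches when a browser-like `window` is present.

```javascript
// Hypothetical top-level namespace for this library's events
const LIBRARY_NAMESPACE = 'my-library';

// Build the detail object for the ft.clientMetric event
function buildMetricDetail(eventName, details = {}) {
    return { ...details, namespace: `${LIBRARY_NAMESPACE}.${eventName}` };
}

// Dispatch the event on window; does nothing outside a browser
function emitMetric(eventName, details) {
    if (typeof window === 'undefined' || typeof CustomEvent === 'undefined') {
        return false;
    }
    window.dispatchEvent(
        new CustomEvent('ft.clientMetric', {
            detail: buildMetricDetail(eventName, details)
        })
    );
    return true;
}

const exampleDetail = buildMetricDetail('success', { durationMs: 12 });
```

With this in place the library never imports the metrics client. If the host system has constructed one, it picks the events up; if not, the events are simply ignored.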

Usage (infrastructure)

As well as a client, you'll need to add a namespace on the Client Metrics Server to send events to. You can set this up yourself by following the script in the repo. This should set you up to send logs to Splunk and metrics to CloudWatch. You can then use your CloudWatch metrics to create graphs and alerts in Grafana.

Contributing

See the central contributing guide for Reliability Kit.
