fpt-battery

v0.6.1

Published

5 days ago

Forecasting Proficiency Test - A jspsych-based experiment battery


jspsych forecasting psychology experiment

This repository grew out of the large-scale Forecasting Proficiency Test project, run by the Forecasting Research Institute.

One of the central goals of the project was to find the cognitive skills that underpin forecasting ability. To that end, we implemented a set of cognitive tasks using jsPsych (v7). This repository shares those tasks in a ready-to-use form so that others can reuse them in a plug-and-play manner.

Quick start

This package and its API are inspired by jsPsych. The jsPsych documentation is a helpful starting point.

Option 1: CDN

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>My Experiment</title>
        <script src="https://unpkg.com/fpt-battery@latest"></script>
        <link rel="stylesheet" href="https://unpkg.com/fpt-battery@latest/dist/fpt-battery.css" />
    </head>
    <body>
        <script>
            const config = {
                tasks: [{ task_name: 'number_series', custom_task_settings: {} }]
            };
            const fpt_battery = initFPTBattery(config);
            fpt_battery.run();
        </script>
    </body>
</html>

Option 2: NPM

Run npm install fpt-battery.

As with jsPsych, the only markup required is a body element in the HTML document. For the CSS, either import 'fpt-battery/css' in your JavaScript or add a link tag to the HTML document's head: <link rel="stylesheet" href="https://unpkg.com/fpt-battery@latest/dist/fpt-battery.css" />

import initFPTBattery from 'fpt-battery';
import 'fpt-battery/css'; // or a <link> tag in your HTML document's head

const config = {
    tasks: [{ task_name: 'number_series', custom_task_settings: {} }]
};
const fpt_battery = initFPTBattery(config);
fpt_battery.run();

Hosting online

This package is frontend-only, so you need a backend and somewhere to store your data.

Jatos

Jatos (Just Another Tool for Online Studies) is a free and open-source tool for managing your experiments. You deploy it on your own server (a cheap VPS, a university server, etc.) so that it complies with the data security policies you need. It is sponsored by the European Society for Cognitive Psychology (ESCoP).

Jatos has a steep learning curve at first, but its documentation is extensive. Integrating fpt-battery is straightforward.

The quickest way to get started is to download jatos/fpt-battery.jzip from this repository and use Import Study in Jatos. The .jzip is regenerated from sources in this repo on every release and pins the bundled index.html to the matching fpt-battery version on unpkg; see Building the JATOS .jzip for how it's produced.

To quickly demo this setup, use Jatos' demo server. Please log in with your own account rather than the generic ones provided, as we've found that uploading a study with the generic accounts is problematic. Once you are logged in, import the .jzip.

Github Pages + DataPipe + OSF

Host a small static site on GitHub Pages (free) with an index.html that loads fpt-battery from a CDN (as in the example above) plus any custom JS/CSS you need. Follow GitHub's instructions and see this example repository.

NB Make sure you also download the assets folder from this repository and set media_basepath in the config accordingly (usually your GitHub Pages site followed by assets, e.g. media_basepath: <github_username>.github.io/assets/). If you do not, our own hosted server will serve these files, which might be slower depending on traffic, rate limits and the geographical location of end users.
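As a rough sketch of the note above, a config that points media_basepath at a self-hosted copy of assets might look like the following. The username placeholder comes from the example above; the https:// scheme is an assumption, added to mirror the shape of the default URL, and ASSETS_URL is a hypothetical name.

```javascript
// Hypothetical GitHub Pages location of your copied assets folder.
// Replace <github_username>; the https:// prefix is an assumption.
const ASSETS_URL = 'https://<github_username>.github.io/assets/';

const config = {
    // Serve task media from your own site instead of the default hosted server.
    media_basepath: ASSETS_URL,
    tasks: [{ task_name: 'number_series', custom_task_settings: {} }],
};

// In the browser, you would then start the battery as in the quick start:
// const fpt_battery = initFPTBattery(config);
// fpt_battery.run();
```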

If you want backendless data collection, DataPipe is a good fit: it sends participant data to OSF, but it does not host your experiment. The usual flow is:

Create an OSF project.
Link your OSF account in DataPipe.
Create a DataPipe experiment pointing at that OSF project.
Host your experiment separately on GitHub Pages, Netlify, university hosting, etc.
Add the generated DataPipe code snippet to your study for saving data.

Precise instructions from DataPipe can be found here.

To connect DataPipe to fpt-battery, adjust the data_saving hooks in the config (in particular onSaveData). For an example, see here.

NB fpt-battery saves data at checkpoints rather than as a single file at the end, which is what most jsPsych-based backend solutions assume. Checkpointing makes data saving more reliable and lets you verify partial completions (for example, paid participants who run into problems). Your backend, in this case OSF, has to accommodate this, which OSF does. You will, however, need to recombine the data at the end to get all trials from all checkpoints for each participant. Here you can find a Python utility script to help with that, and your favourite LLM can translate it to other languages if needed.
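A minimal sketch of such a hook follows. It assumes DataPipe's HTTP endpoint at https://pipe.jspsych.org/api/data/ and a JSON body with experimentID, filename and data; check both against the snippet DataPipe generates for your experiment. makeDataPipeSaver, participantId, fetchImpl and the filename scheme are hypothetical names chosen for illustration.

```javascript
// Sketch only: builds an onSaveData hook that uploads each chunk to DataPipe.
// Assumptions (verify against the code DataPipe generates for your experiment):
//   - the endpoint URL and the { experimentID, filename, data } body shape
//   - one uniquely named file per (participant, checkpoint, chunk)
function makeDataPipeSaver(experimentID, participantId, fetchImpl = fetch) {
    return async function onSaveData(chunks, checkpoint, checkpointIndex) {
        for (let i = 0; i < chunks.length; i++) {
            // Hypothetical filename scheme; it keeps checkpoint files distinct.
            const filename = `${participantId}__${checkpointIndex}__${checkpoint}__chunk${i}.json`;
            const response = await fetchImpl('https://pipe.jspsych.org/api/data/', {
                method: 'POST',
                headers: { 'Content-Type': 'application/json', Accept: '*/*' },
                body: JSON.stringify({
                    experimentID,
                    filename,
                    data: JSON.stringify(chunks[i]), // file content is sent as a string
                }),
            });
            if (!response.ok) {
                throw new Error(`DataPipe upload failed for ${filename}`);
            }
        }
    };
}
```

You would then pass the result as the hook, e.g. data_saving: { onSaveData: makeDataPipeSaver('YOUR_EXPERIMENT_ID', participantId) }, where both arguments are placeholders.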
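For readers who prefer to stay in JavaScript, the recombination step could be sketched as below. This is not a port of the Python utility; combineCheckpoints and the record shape { checkpoint, checkpointIndex, data } are assumptions, so adapt them to however your backend stores each upload.

```javascript
// Sketch: merge the per-checkpoint uploads of one participant into one trial list.
// Assumes each stored record has the shape { checkpoint, checkpointIndex, data },
// where data is an array of trials (hypothetical shape; adapt to your store).
function combineCheckpoints(records) {
    return [...records]
        .sort((a, b) => a.checkpointIndex - b.checkpointIndex) // restore save order
        .flatMap((record) => record.data);                     // concatenate trials
}

// Example with two out-of-order records for the same participant.
const allTrials = combineCheckpoints([
    { checkpoint: 'experiment__welcome', checkpointIndex: 1, data: [{ trial: 'b' }] },
    { checkpoint: 'experiment__save_session_params', checkpointIndex: 0, data: [{ trial: 'a' }] },
]);
```

Array.prototype.sort is stable, so records that share a checkpointIndex (for example several chunks of one checkpoint) keep the order in which they were passed in.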

Practical limits and caveats:

DataPipe has a 32 MB limit per request. This is usually plenty for jsPsych-style JSON/CSV data, but large binary uploads are more likely to hit it.
OSF Storage caps are 5 GB total for private projects/components and 50 GB total for public ones, with 5 GB max per individual file.
OSF visibility follows the receiving project's/component's privacy settings: private means only collaborators can see the data; public means anyone can.
Each OSF component gets its own storage cap, so using a dedicated data component can help keep things organized.
DataPipe normally does not store your data, but if OSF upload fails it will temporarily cache encrypted files and retry for about a week.
Data validation and session limits in DataPipe are worth enabling during data collection to reduce spam or malformed uploads.
If you care about storage region or compliance, choose the OSF storage location when creating the project; OSF says this cannot be changed later.

For current details, see the DataPipe FAQ, the DataPipe getting started guide, and OSF's docs on project storage limits and file uploads.

Configuration/Customization

Core battery settings

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| root_element | any | undefined | Where to render the experiment: a DOM element id (string), an Element instance, or undefined to use <body>. |
| jsPsychOptions | object | {} | Forwarded to jsPsych's initJsPsych. display_element is always overridden by FPTBattery. |
| session | object | see source | Persisted-session bag. Pass back the previous session's saved_progress to allow restarts; otherwise leave at the default. |
| tasks | array | [] | Ordered list of { task_name, custom_task_settings } entries (see the table below for each task). Must be non-empty. |
| data_saving | object | see source | Persistence hooks. Provide onSaveData(chunks, checkpoint, checkpointIndex) to upload trial data at checkpoints. |
| disable_progressbar | boolean | false | Disable the custom FPT progress bar. |
| show_inactivity_warning | boolean | true | Show the inactivity warning (fires after 5 consecutive trials ended by the custom FPT trial timer). |
| ask_for_task_feedback | boolean | false | Show a free-text feedback prompt after each task. |
| media_basepath | string | "https://fpt.quorumapp.com/static/fpt/img/" | Base URL for media assets. The default points at our hosted server; consider hosting assets/ yourself for performance. |
| minutes_between_task_breaks | number ≥ 0 | null | If set, offer an untimed break between tasks once this many minutes have elapsed since the last break. null disables breaks. |
| skip_intro_trials | boolean | false | Skip the browser-check and welcome trials. |

Constraints:

tasks must contain at least one task
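To make the core settings concrete, here is a sketch of a config that combines several of them. Every key is taken from the table above; the values, including the element id 'experiment-root', are illustrative assumptions.

```javascript
// Sketch of a battery config using core settings from the table above.
// All values are examples; 'experiment-root' is a hypothetical element id.
const config = {
    root_element: 'experiment-root',   // render into <div id="experiment-root">
    tasks: [
        // Override one documented number_series setting; keep raven_matrix defaults.
        { task_name: 'number_series', custom_task_settings: { test_trial_time_limit: 45 } },
        { task_name: 'raven_matrix', custom_task_settings: {} },
    ],
    ask_for_task_feedback: true,       // free-text prompt after each task
    minutes_between_task_breaks: 15,   // offer a break once 15 minutes have elapsed
    skip_intro_trials: false,          // keep browser check and welcome trials
};

// In the browser: initFPTBattery(config).run();
```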

Tasks

You define a series of tasks by specifying each task's name and any custom settings. The full list of available task names is:

[
  { task_name: 'number_series', custom_task_settings: {} },
  { task_name: 'leapfrog', custom_task_settings: {} },
  { task_name: 'cognitive_reflection', custom_task_settings: {} },
  { task_name: 'admc_framing', custom_task_settings: {} },
  { task_name: 'admc_decision_rules', custom_task_settings: {} },
  { task_name: 'admc_risk_perception', custom_task_settings: {} },
  { task_name: 'bayesian_update', custom_task_settings: {} },
  { task_name: 'coherence_forecasting', custom_task_settings: {} },
  { task_name: 'denominator_neglect', custom_task_settings: {} },
  { task_name: 'graph_literacy', custom_task_settings: {} },
  { task_name: 'impossible_question', custom_task_settings: {} },
  { task_name: 'raven_matrix', custom_task_settings: {} },
  { task_name: 'time_series', custom_task_settings: {} },
  { task_name: 'forecasting_questions', custom_task_settings: {} },
]

How overrides are applied. A task's custom_task_settings are deep-merged onto the defaults below: scalars and arrays replace the default, and object-valued settings merge key-by-key — except those marked replaces, not merged, where a supplied value replaces the default object wholesale (so a partial override defines the exact set instead of re-inheriting the default's keys).
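The merge rules can be illustrated with a small re-implementation. This is not the library's code: mergeSettings and its replaceKeys argument are hypothetical, and the defaults are borrowed from the denominator_neglect, time_series and admc_framing settings and mixed into one object purely for brevity.

```javascript
// Illustrative re-implementation of the merge rules described above.
// NOT the library's code; mergeSettings and replaceKeys are hypothetical.
function isPlainObject(value) {
    return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function mergeSettings(defaults, overrides, replaceKeys = []) {
    const result = { ...defaults };
    for (const [key, value] of Object.entries(overrides)) {
        const mergeable = isPlainObject(value) && isPlainObject(defaults[key]);
        if (mergeable && !replaceKeys.includes(key)) {
            // Object-valued setting: merge key by key.
            result[key] = mergeSettings(defaults[key], value);
        } else {
            // Scalars, arrays, and "replaces, not merged" objects: replace wholesale.
            result[key] = value;
        }
    }
    return result;
}

const merged = mergeSettings(
    {
        lottery_total_coins: { small: 10, large: 400 },
        NOISE: { low: 1.9, high: 6.2 },
        subtasks_to_include: ['a1', 'a2', 'rc1', 'rc2'],
    },
    {
        lottery_total_coins: { large: 200 },  // merged: small stays 10
        NOISE: { low: 3 },                    // replaced: high is gone
        subtasks_to_include: ['a1', 'rc1'],   // replaced: arrays never merge
    },
    ['NOISE']
);
```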

`number_series`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| ns_items | array | see source | List of number-series items. Each item: { id: string, prompt: string, simulation_range: [min, max] }. |
| ignore_validation | boolean | false | Disable input regex validation on the response field (used for simulation / admin flows). |
| general_instructions_time_limit | integer ≥ 0 | 120 | Seconds allowed for the general instructions screen. |
| test_trial_time_limit | integer ≥ 0 | 30 | Seconds allowed for each test trial. |

`leapfrog`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| general_instructions_time_limit | integer ≥ 0 | 180 | Seconds allowed for the general instructions screen. |
| pt_trials_instructions_time_limit | integer ≥ 0 | 60 | Seconds allowed for the practice-trials instructions. |
| pt_trial_interblock_time_limit | integer ≥ 0 | 20 | Seconds allowed for the inter-block prediction question during practice. |
| pt_trial_after_interblock_time_limit | integer ≥ 0 | 20 | Seconds allowed for the spacebar-to-continue page after an inter-block prediction. |
| test_trials_instructions_time_limit | integer ≥ 0 | 90 | Seconds allowed for the test-trials instructions. |
| volatility | number ≥ 0, ≤ 1 | 0.125 | Probability that the option values 'leapfrog' on each trial (Bernoulli per trial). |
| trial_time_limit | integer ≥ 0 | 1500 | Ms allowed to respond on each binary-choice trial. |
| points_counter | integer | 0 | Starting value of the accumulated-points counter (runtime-mutated; override only if you know why). |
| pt_trials_n | integer ≥ 1 | 80 | Total practice trials. Must be even and divisible by pt_trials_per_block; the protocol assumes exactly 4 practice blocks. |
| pt_trials_per_block | integer ≥ 1 | 20 | Practice trials per block. |
| test_trials_n | integer ≥ 1 | 200 | Total test trials. Must be divisible by test_trials_per_block. |
| test_trials_per_block | integer ≥ 1 | 40 | Test trials per block. |

Constraints:

pt_trials_n must be evenly divisible by pt_trials_per_block
pt_trials_n must be even (the practice protocol splits trials in half)
pt_trials_n / pt_trials_per_block must equal 4 (instructions and inter-block logic assume exactly 4 practice blocks)
test_trials_n must be evenly divisible by test_trials_per_block
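If you override the leapfrog trial counts, a pre-flight check along these lines can catch a bad combination before the battery runs. checkLeapfrogTrialCounts is a hypothetical helper that mirrors the four constraints above; it is not part of fpt-battery, which performs its own validation.

```javascript
// Sketch of a pre-flight check mirroring the leapfrog constraints listed above.
// checkLeapfrogTrialCounts is a hypothetical helper, not part of fpt-battery.
function checkLeapfrogTrialCounts(s) {
    const errors = [];
    if (s.pt_trials_n % s.pt_trials_per_block !== 0) {
        errors.push('pt_trials_n must be evenly divisible by pt_trials_per_block');
    }
    if (s.pt_trials_n % 2 !== 0) {
        errors.push('pt_trials_n must be even');
    }
    if (s.pt_trials_n / s.pt_trials_per_block !== 4) {
        errors.push('pt_trials_n / pt_trials_per_block must equal 4');
    }
    if (s.test_trials_n % s.test_trials_per_block !== 0) {
        errors.push('test_trials_n must be evenly divisible by test_trials_per_block');
    }
    return errors;
}

// Defaults from the table: 80 / 20 = 4 practice blocks, 200 / 40 = 5 test blocks.
const leapfrogDefaults = { pt_trials_n: 80, pt_trials_per_block: 20, test_trials_n: 200, test_trials_per_block: 40 };
const defaultErrors = checkLeapfrogTrialCounts(leapfrogDefaults);

// Halving pt_trials_n alone leaves only 2 practice blocks, which breaks the third rule.
const halvedErrors = checkLeapfrogTrialCounts({ ...leapfrogDefaults, pt_trials_n: 40 });
```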

`cognitive_reflection`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| crt_items | array | see source | Cognitive Reflection Test items. Each item: { id: string, prompt: string, options: string[] }. |
| ignore_validation | boolean | false | If true, marks every question as not required (used for simulation / admin flows). |
| test_trial_time_limit | integer ≥ 0 | 280 | Seconds allowed for the (single) multi-question survey trial. |

`admc_framing`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| resistance_to_framing_items | array | see source | ADMC framing items grouped by subtask key (a1, a2, rc1, rc2). |
| subtasks_to_include | array, length ≥ 1 | ["a1","a2","rc1","rc2"] | Which framing subtasks to run, in order. Valid values: 'a1', 'a2', 'rc1', 'rc2'. |
| ignore_validation | boolean | true | If true (the default), responses are not required (used for simulation / admin flows). |
| subtask_instructions_time_limit | integer ≥ 0 | 30 | Seconds allowed for each subtask's instructions screen. |
| trial_time_limit | integer ≥ 0 | 60 | Seconds allowed for each individual rating trial. |

`admc_decision_rules`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| dr_items | array | see source | Decision Rules items (TV-comparison tables). |
| ignore_validation | boolean | false | If true, marks every question as not required (used for simulation / admin flows). |
| instructions_time_limit | integer ≥ 0 | 180 | Seconds allowed for the instructions screen. |
| trial_time_limit | integer ≥ 0 | 45 | Seconds allowed for each individual decision-rules trial. |

`admc_risk_perception`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| rp_items | array | see source | Risk Perception items. Each item asks the participant to estimate a probability via slider. |
| ignore_validation | boolean | false | If true, skip the require-all-sliders-moved gate (used for simulation / admin flows). |
| instructions_time_limit | integer ≥ 0 | 60 | Seconds allowed for the instructions screen. |
| trial_time_limit | integer ≥ 0 | 210 | Seconds allowed for the (single) survey trial. |

`bayesian_update`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| general_instructions_time_limit | integer ≥ 0 | 120 | Seconds allowed for the general instructions screen. |
| pt_trials_instructions_time_limit | integer ≥ 0 | 30 | Seconds allowed for the practice-trials instructions. |
| test_trials_instructions_time_limit | integer ≥ 0 | 60 | Seconds allowed for the test-trials instructions. |
| between_unique_trials_time_limit | integer ≥ 0 | 30 | Seconds allowed for the between-set explanation shown after each unique trial. |
| test_trial_time_limit | integer ≥ 0 | 20 | Seconds allowed for each individual test trial. |
| task_version | enum, one of: "easy", "hard" | "easy" | Difficulty mode: 'easy' asks which urn the draw came from; 'hard' asks for the probability the next ball is blue. |
| draw_with_replacement | boolean | true | Whether sampling within a unique trial is with replacement. |
| unique_test_trials_n | integer ≥ 1 | 8 | Number of unique test trials per task version. Must be even and divisible by ball_splits.length. |
| draws_per_unique_trial | integer ≥ 1 | 5 | Number of draws shown per unique trial. |
| balls_per_draw | integer ≥ 1 | 1 | Number of balls drawn per draw event. |
| ball_splits | array, length ≥ 1 | [[40,60],[30,70]] | Pairs of urn compositions, e.g. [[40, 60], [30, 70]]. Each pair is a (minority, majority) split out of 100. |
| task_version_parameters | object | see source | Per-version UI strings: { easy: { labels, slider_initial_text, question }, hard: {...} }. |
| fade_in_animation_duration | integer ≥ 0 | 1000 | Ms for the fade-in animation. Change the matching CSS if this changes. Also used as the box-opening animation duration. |
| ball_move_duration | integer ≥ 0 | 200 | Ms per ball when balls move between containers. |
| wait_after_box_opening_to_draw_balls | integer ≥ 0 | 500 | Ms to wait after the box opens before drawing balls. |
| unique_pt_trials_n | integer ≥ 1 | 2 | Number of unique practice trials. Must be even and divisible by ball_splits.length. |
| pt_trials_n | integer ≥ 1 | 10 | Total practice trials (unique_pt_trials_n * draws_per_unique_trial in normal use). Must be divisible by pt_trials_per_block. |
| pt_trials_per_block | integer ≥ 1 | 10 | Practice trials per block (for between-block breaks). |
| test_trials_per_block | integer ≥ 1 | 40 | Test trials per block (for between-block breaks). |

Constraints:

pt_trials_n must be evenly divisible by pt_trials_per_block
unique_test_trials_n * draws_per_unique_trial (the derived test_trials_n) must be evenly divisible by test_trials_per_block
unique_pt_trials_n must be even
unique_pt_trials_n must be evenly divisible by ball_splits.length
unique_test_trials_n must be even
unique_test_trials_n must be evenly divisible by ball_splits.length
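Because test_trials_n is derived rather than set directly, it can help to compute it when changing these settings. The sketch below mirrors the constraints above; checkBayesianUpdateCounts is a hypothetical helper, not the library's validator.

```javascript
// Sketch: derive test_trials_n and check the bayesian_update constraints above.
// checkBayesianUpdateCounts is a hypothetical helper, not part of fpt-battery.
function checkBayesianUpdateCounts(s) {
    const test_trials_n = s.unique_test_trials_n * s.draws_per_unique_trial; // derived value
    const splits = s.ball_splits.length;
    const errors = [];
    if (s.pt_trials_n % s.pt_trials_per_block !== 0) {
        errors.push('pt_trials_n must be evenly divisible by pt_trials_per_block');
    }
    if (test_trials_n % s.test_trials_per_block !== 0) {
        errors.push('derived test_trials_n must be evenly divisible by test_trials_per_block');
    }
    for (const key of ['unique_pt_trials_n', 'unique_test_trials_n']) {
        if (s[key] % 2 !== 0) errors.push(`${key} must be even`);
        if (s[key] % splits !== 0) errors.push(`${key} must be evenly divisible by ball_splits.length`);
    }
    return { test_trials_n, errors };
}

// Defaults from the table: 8 unique trials x 5 draws = 40 test trials, one block of 40.
const bayesianDefaults = {
    unique_test_trials_n: 8, draws_per_unique_trial: 5, ball_splits: [[40, 60], [30, 70]],
    unique_pt_trials_n: 2, pt_trials_n: 10, pt_trials_per_block: 10, test_trials_per_block: 40,
};
const bayesianDefaultCheck = checkBayesianUpdateCounts(bayesianDefaults);

// Dropping to 6 unique trials gives 30 test trials, which no longer fills a block of 40.
const bayesianReducedCheck = checkBayesianUpdateCounts({ ...bayesianDefaults, unique_test_trials_n: 6 });
```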

`coherence_forecasting`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| cfs_items | array | see source | Coherence-forecasting statements. Each item: { id: string, block: string, statement: string }. Items are grouped into blocks (one trial per block). |
| ignore_validation | boolean | false | If true, skip the require-all-sliders-moved gate (used for simulation / admin flows). |
| general_instructions_time_limit | integer ≥ 0 | 60 | Seconds allowed for the general instructions screen. |
| block_timer_per_item | integer ≥ 0 | 20 | Per-item time budget in seconds; each block's timer is items-in-block × this value. |
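The block timer rule (items-in-block × block_timer_per_item) can be sketched as follows. blockTimers is a hypothetical helper and the items are made-up placeholders in the documented { id, block, statement } shape; the real items live in the task source.

```javascript
// Sketch: compute each block's timer as items-in-block x block_timer_per_item.
// blockTimers is hypothetical; the sample items below are placeholders.
function blockTimers(cfs_items, block_timer_per_item = 20) {
    const counts = {};
    for (const item of cfs_items) {
        counts[item.block] = (counts[item.block] || 0) + 1; // count items per block
    }
    const timers = {};
    for (const [block, n] of Object.entries(counts)) {
        timers[block] = n * block_timer_per_item; // seconds for this block's trial
    }
    return timers;
}

const sampleItems = [
    { id: 's1', block: 'A', statement: 'placeholder' },
    { id: 's2', block: 'A', statement: 'placeholder' },
    { id: 's3', block: 'A', statement: 'placeholder' },
    { id: 's4', block: 'B', statement: 'placeholder' },
    { id: 's5', block: 'B', statement: 'placeholder' },
];
const timers = blockTimers(sampleItems); // block A: 3 x 20 s, block B: 2 x 20 s
```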

`denominator_neglect`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| general_instructions_time_limit | integer ≥ 0 | 120 | Seconds allowed for the general instructions screen. |
| pt_trials_instructions_time_limit | integer ≥ 0 | 60 | Seconds allowed for the practice-trials instructions. |
| test_trials_instructions_time_limit | integer ≥ 0 | 45 | Seconds allowed for the test-trials instructions. |
| main_trial_time_limit | integer ≥ 0 | 15 | Seconds allowed per main (post-instruction) trial timer (separate from trial_time_limit). |
| task_version | enum, one of: "A", "B" | "A" | Variant: 'A' uses four small/large display combinations; 'B' uses a single 'both' combination. |
| trial_choice_types | array, length ≥ 1 | ["conflict","harmony"] | Which choice-type categories to include. Both 'conflict' and 'harmony' are expected by the protocol. |
| small_lottery_gold_coin_props | array, length ≥ 1 | [0.1,0.2,0.3] | Possible gold-coin proportions for the small lottery (values in [0, 1]). |
| large_lottery_gold_coin_prop_range_diff | array, length ≥ 1 | [0.01,0.02,0.03,0.04,0.05,0.06,0.07,0.08] | Possible differences between small and large lottery proportions (values in [0, 1]). |
| denominator_display_types | object | see source | Display modes keyed by task_version. Each entry is an array of { small, large } combinations. |
| lottery_total_coins | object | {"small":10,"large":400} | Total coin counts for the small and large lotteries. |
| trial_time_limit | integer ≥ 0 | 5000 | Ms allowed to respond on each individual trial. |
| gold_coins_counter | integer | 0 | Starting value of the gold-coins counter (runtime-mutated). |
| pt_trials_n | integer ≥ 1 | 8 | Total practice trials. Must be divisible by pt_trials_per_block, be even, and (when task_version='A') be divisible by 8. |
| pt_trials_per_block | integer ≥ 1 | 8 | Practice trials per block. |
| test_trials_n | integer ≥ 1 | 48 | Total test trials. Must equal trial_choice_types.length × small_lottery_gold_coin_props.length × large_lottery_gold_coin_prop_range_diff.length, be divisible by test_trials_per_block, be even, and (when task_version='A') be divisible by 8. |
| test_trials_per_block | integer ≥ 1 | 48 | Test trials per block. |

Constraints:

pt_trials_n must be evenly divisible by pt_trials_per_block
pt_trials_n must be even
pt_trials_n must be divisible by 8 when task_version='A' (four display combinations × two choice types)
test_trials_n must be evenly divisible by test_trials_per_block
test_trials_n must be even
test_trials_n must be divisible by 8 when task_version='A'
test_trials_n must equal trial_choice_types.length × small_lottery_gold_coin_props.length × large_lottery_gold_coin_prop_range_diff.length (the cartesian product the protocol enumerates)
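The last constraint is easy to get wrong when trimming one of the arrays, so here is the arithmetic as a sketch. expectedTestTrials is a hypothetical helper; with the defaults the product is 2 × 3 × 8 = 48, matching the default test_trials_n.

```javascript
// Sketch: the cartesian-product size that test_trials_n must equal.
// expectedTestTrials is a hypothetical helper, not part of fpt-battery.
function expectedTestTrials(s) {
    return s.trial_choice_types.length
        * s.small_lottery_gold_coin_props.length
        * s.large_lottery_gold_coin_prop_range_diff.length;
}

const dnDefaults = {
    trial_choice_types: ['conflict', 'harmony'],
    small_lottery_gold_coin_props: [0.1, 0.2, 0.3],
    large_lottery_gold_coin_prop_range_diff: [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08],
};
const dnDefaultCount = expectedTestTrials(dnDefaults); // 2 x 3 x 8

// Keeping only two small-lottery proportions changes the required test_trials_n to 2 x 2 x 8.
const dnTrimmedCount = expectedTestTrials({ ...dnDefaults, small_lottery_gold_coin_props: [0.1, 0.2] });
```

Any new value still has to satisfy the other constraints: even, divisible by test_trials_per_block, and divisible by 8 when task_version='A'.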

`graph_literacy`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| gl_content | array | undefined | List of graph-literacy items, either { content: 'info', html: string } or { content: 'question', ... }. Defaults to the bundled 13-item battery (see the task source); no static default because the bundled items bake in media_basepath. |
| ignore_validation | boolean | false | If true, disable client-side form-required validation (used for simulation / admin flows). |
| test_trial_time_limit | integer ≥ 0 | 300 | Seconds allowed for the (single) survey trial. |

`impossible_question`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| general_instructions_time_limit | integer ≥ 0 | 60 | Seconds allowed for the general instructions screen. |
| test_trial_time_limit | integer ≥ 0 | 20 | Seconds allowed for each test trial. |
| test_trial_feedback_time_limit | integer ≥ 0 | 10 | Seconds allowed for the per-trial feedback screen. |
| use_anchor_version | boolean | false | If true, the participant always sees the 'anchor' form; otherwise a non-anchor form is sampled. |
| gk_forms | object; replaces, not merged | see source | Map from form id to general-knowledge question id list. |
| iq_forms | object; replaces, not merged | see source | Map from form id to impossible-question id list. |
| general_knowledge_questions | array | see source | General-knowledge questions. Each item has at least { id, prompt, response_opts }. |
| impossible_questions | array | see source | Impossible questions. Each item has at least { id, prompt, response_opts }. |
| test_trials_n | integer ≥ 1 | 30 | Total test trials. Must be divisible by test_trials_per_block. |
| test_trials_per_block | integer ≥ 1 | 30 | Test trials per block. |
| ignore_validation | boolean | false | If true, marks every question as not required (used for simulation / admin flows). |

Constraints:

test_trials_n must be evenly divisible by test_trials_per_block

`raven_matrix`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| pt_trials_n | integer ≥ 1 | 5 | Total practice trials. Must be divisible by pt_trials_per_block. |
| pt_trials_per_block | integer ≥ 1 | 5 | Practice trials per block. |
| test_trials_n | integer ≥ 1 | 42 | Total test trials. Must be divisible by test_trials_per_block. |
| test_trials_per_block | integer ≥ 1 | 42 | Test trials per block. |
| ignore_validation | boolean | false | If true, marks every question as not required (used for simulation / admin flows). |
| general_instructions_time_limit | integer ≥ 0 | 120 | Seconds allowed for the general instructions screen. |
| pt_trials_instructions_time_limit | integer ≥ 0 | 120 | Seconds allowed for the practice-trials instructions. |
| test_trials_instructions_time_limit | integer ≥ 0 | 120 | Seconds allowed for the test-trials instructions. |
| test_trial_time_limit | integer ≥ 0 | 30 | Seconds allowed for each test trial. |
| stimuli_lists | array | see source | Per-participant stimuli lists (one is sampled at runtime). See the task source for the bundled lists. |
| correct_answers | object | see source | Map from stimulus id to correct answer letter (used for scoring). |

Constraints:

pt_trials_n must be evenly divisible by pt_trials_per_block
test_trials_n must be evenly divisible by test_trials_per_block

`time_series`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| general_instructions_time_limit | integer ≥ 0 | 180 | Seconds allowed for the general instructions screen. |
| pt_trials_instructions_time_limit | integer ≥ 0 | 20 | Seconds allowed for the practice-trials instructions. |
| test_trials_instructions_time_limit | integer ≥ 0 | 30 | Seconds allowed for the test-trials instructions. |
| test_trial_time_limit | integer ≥ 0 | 20 | Seconds allowed for each test trial. |
| ignore_validation | boolean | false | If true, marks every question as not required (used for simulation / admin flows). |
| SHOW_CONDITIONS_DURING_TRIAL | boolean | false | Debug flag: render the trial's condition labels (function/direction/noise/datapoints) on screen. |
| FUNCTIONS | array, length ≥ 1 | ["linear","exponential"] | Functional forms used to generate the time series. Each entry is one condition factor. |
| DIRECTIONS | array, length ≥ 1 | ["positive","negative"] | Trend directions to include as a condition factor. |
| datapoints | array, length ≥ 1 | ["datapoints_10","datapoints_30"] | Data-point-count categories to include as a condition factor. |
| NOISE | object; replaces, not merged | {"low":1.9,"high":6.2} | Noise levels as a percentage of chart height (e.g. 650px chart → 1.9 ≈ 12px). Object keys index a condition factor. |
| graphs_per_condition | integer ≥ 1 | 1 | How many graphs to generate per condition (FUNCTIONS × DIRECTIONS × NOISE × datapoints combination). |
| pt_trials_n | integer ≥ 1 | 4 | Total practice trials. Must be divisible by pt_trials_per_block. |
| pt_trials_per_block | integer ≥ 1 | 4 | Practice trials per block. |

Constraints:

pt_trials_n must be evenly divisible by pt_trials_per_block
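To see how the four condition factors combine, the sketch below enumerates the FUNCTIONS × DIRECTIONS × NOISE × datapoints grid for the default settings. enumerateConditions and the field names of each condition object are hypothetical; the task's own generator may structure conditions differently.

```javascript
// Sketch: enumerate the condition grid implied by the time_series settings.
// enumerateConditions and the condition field names are hypothetical.
function enumerateConditions({ FUNCTIONS, DIRECTIONS, NOISE, datapoints }) {
    const conditions = [];
    for (const functional_form of FUNCTIONS) {
        for (const direction of DIRECTIONS) {
            for (const noise of Object.keys(NOISE)) { // NOISE keys index the factor
                for (const points of datapoints) {
                    conditions.push({ functional_form, direction, noise, datapoints: points });
                }
            }
        }
    }
    return conditions;
}

const tsDefaults = {
    FUNCTIONS: ['linear', 'exponential'],
    DIRECTIONS: ['positive', 'negative'],
    NOISE: { low: 1.9, high: 6.2 },
    datapoints: ['datapoints_10', 'datapoints_30'],
    graphs_per_condition: 1,
};
const tsConditions = enumerateConditions(tsDefaults);                    // 2 x 2 x 2 x 2 conditions
const tsGraphCount = tsConditions.length * tsDefaults.graphs_per_condition;
```

Because NOISE is replaced rather than merged, overriding it with a single key halves the grid.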

`forecasting_questions`

| Setting | Type / constraints | Default | Description |
|---|---|---|---|
| skip_instructions | boolean | false | If true, skip the introductory instructions screen. |
| question_ids_to_show | array | [] | Ordered list of question ids (as strings) to present. Empty by default; populate with ids from get_all_questions(). |
| date_start | string | "01 June 2025" | Start date shown in instructions and prompts. |
| date_end | string | "30 June 2025" | End date shown in instructions and prompts. |
| completed_checkpoints | array | [] | Injected by FPTBattery from session.saved_progress.data_checkpoints; skips questions already completed. |
| task_order_index | integer ≥ 0 | 0 | Injected by FPTBattery as the position of this task in the battery's tasks array; used to build checkpoint keys. |

Data saving

Provide your own onSaveData in the data_saving config to persist trial data. It runs asynchronously in the background at checkpoints (e.g. after welcome, after each task).

data_saving: {
  disable_chunking: false,   // if true, data is passed as a single chunk
  maxChunkSize: 1000000,     // max bytes per chunk when chunking (default 1MB)
  onSaveData: async (chunks, checkpoint, checkpointIndex) => {
    // chunks: array of trial-data arrays (chunked by maxChunkSize)
    // checkpoint: string (e.g. 'experiment__welcome', 'task_name__block_0')
    // checkpointIndex: number
    for (const chunk of chunks) {
      await fetch('/your-save-endpoint', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' }, // tell the server the body is JSON
        body: JSON.stringify({ checkpoint, checkpointIndex, data: chunk }),
      });
    }
  },
  onError: (err) => console.error('Save failed:', err),
}

Session restart

To resume a returning participant, pass the prior session's checkpoint list back via session.saved_progress. The battery emits a named checkpoint at each save point — your onSaveData is what persists them; on a fresh page load, you reconstruct the list from your store and hand it back.

const config = {
    session: {
        saved_progress: {
            data_checkpoints: ['experiment__save_session_params', 'experiment__welcome', /* ... */],
            last_data_checkpoint_ind: 7,  // highest checkpointIndex seen for this participant
        }
    },
    tasks: [/* same shape and order as the original */],
    data_saving: { onSaveData: /* ... */ },
};
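Reconstructing saved_progress from your store might look like the sketch below. It assumes your backend can return one record per saved checkpoint in the shape { checkpoint, checkpointIndex }, mirroring the arguments onSaveData receives; buildSavedProgress is a hypothetical helper, and returning an empty list with index 0 for a fresh participant is an assumption about the default.

```javascript
// Sketch: rebuild session.saved_progress from what onSaveData previously stored.
// Assumes records shaped { checkpoint, checkpointIndex } (hypothetical store format).
function buildSavedProgress(records) {
    if (records.length === 0) {
        // Fresh participant: no checkpoints yet, index left at 0 (assumed default).
        return { data_checkpoints: [], last_data_checkpoint_ind: 0 };
    }
    const sorted = [...records].sort((a, b) => a.checkpointIndex - b.checkpointIndex);
    return {
        // De-duplicate: a chunked checkpoint may have been stored more than once.
        data_checkpoints: [...new Set(sorted.map((r) => r.checkpoint))],
        // Highest checkpointIndex seen for this participant.
        last_data_checkpoint_ind: sorted[sorted.length - 1].checkpointIndex,
    };
}

const restored = buildSavedProgress([
    { checkpoint: 'experiment__welcome', checkpointIndex: 1 },
    { checkpoint: 'experiment__save_session_params', checkpointIndex: 0 },
    { checkpoint: 'experiment__welcome', checkpointIndex: 1 }, // second chunk of the same checkpoint
]);
// Pass it back as: session: { saved_progress: restored }
```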

Checkpoint names emitted by the battery:

| Checkpoint | When |
|---|---|
| experiment__save_session_params | First save; snapshots the resolved config. |
| experiment__welcome | After welcome instructions. |
| task_{N}_{taskName}__completed | After the task at position N in tasks finishes. |
| experiment__completed | End of the battery. |
| browser_exclusion | Only if the browser check fails. |

forecasting_questions additionally emits task_{N}_forecasting_questions_start, task_{N}_forecasting_questions_instructions_completed, and one task_{N}_forecasting_questions_question_{questionId}_completed per question.
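On the analysis or backend side, it can be handy to pull N and the task name out of the task-completion checkpoints. The sketch below parses only the task_{N}_{taskName}__completed pattern from the table; parseTaskCompleted is a hypothetical helper, and the finer-grained forecasting_questions checkpoints are deliberately left unmatched.

```javascript
// Sketch: parse 'task_{N}_{taskName}__completed' checkpoint names.
// parseTaskCompleted is a hypothetical helper; other checkpoint names return null.
function parseTaskCompleted(checkpoint) {
    const match = /^task_(\d+)_(.+)__completed$/.exec(checkpoint);
    if (!match) return null;
    return { task_order_index: Number(match[1]), task_name: match[2] };
}

const parsedTask = parseTaskCompleted('task_2_bayesian_update__completed');
const parsedWelcome = parseTaskCompleted('experiment__welcome');
const parsedStart = parseTaskCompleted('task_3_forecasting_questions_start');
```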

Granularity. Every built-in task except forecasting_questions is checkpointed only at the task boundary — a mid-task refresh re-runs the full task. To opt a custom task into finer granularity, accept completed_checkpoints + task_order_index in its settings schema and interleave the async_data_save_trial_generator it receives in get_timeline(hide_progress_bar, async_data_save_trial_generator). See src/tasks/forecasting_questions.js for the pattern.

Caveats.

Keep tasks in the same order across resumes — checkpoint names embed task_order_index, so reordering or inserting a task invalidates every later task's checkpoint.
last_data_checkpoint_ind is the highest checkpointIndex your onSaveData has seen; it lets the battery keep checkpoint indices monotonic across sessions. Leave it at 0 for a fresh participant.
The browser check re-runs on every resume; the welcome screen does not.
Data uploads are incremental — each checkpoint sends only the trials since the previous save. Recombine on the analysis side (see combine_datapipe_sessions.py for an OSF example).

Customize welcome/debrief trials (as well as other trial generators)

The FPTBattery class initializes trial generators, which are functions that return a jsPsych trial. The respective trial generators can be accessed and overridden in your code like so:

const config = { /* your battery config */ };
const fpt_battery = initFPTBattery(config);
fpt_battery.trial_generators.welcome = function() {
    return {
        // jspsych trial
    }
}
fpt_battery.trial_generators.debrief = function() {
    return {
        // jspsych trial
    }
}
// whenever you are ready:
fpt_battery.run()

Note that how you specify a jsPsych trial's type parameter depends on your import method. See jsPsych docs for more information. NB If you're loading from a CDN or local files, make sure your script tags load both jsPsych and the plugin you're going to use.

You can also override other trial generators. The class source code has a get_trial_generators method that initializes all of them.
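As a concrete sketch, a welcome generator built on jsPsych's html-button-response plugin could look like this. The plugin is passed in so the same code works for both import methods: with the CDN build it is the global jsPsychHtmlButtonResponse, and with npm it is the default export of '@jspsych/plugin-html-button-response'. makeWelcomeGenerator and the wording of the stimulus are hypothetical.

```javascript
// Sketch: a welcome trial generator built around a jsPsych plugin supplied by the caller.
// makeWelcomeGenerator is a hypothetical helper; the text is a placeholder.
function makeWelcomeGenerator(htmlButtonResponsePlugin) {
    return function welcome() {
        return {
            type: htmlButtonResponsePlugin, // plugin class, not a string, in jsPsych v7
            stimulus: '<h1>Welcome</h1><p>Thank you for taking part in this study.</p>',
            choices: ['Begin'],
        };
    };
}

// Usage in the browser with CDN script tags for jsPsych and the plugin:
// fpt_battery.trial_generators.welcome = makeWelcomeGenerator(jsPsychHtmlButtonResponse);
```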

Extending or replacing a task

Each built-in task is a class. You can subclass it to tweak behavior, or replace it entirely with your own implementation.

The classes are exposed statically on initFPTBattery.tasks (same access from npm and CDN). The live registry the battery reads from is exposed on the instance as task_registry — mutate it between initFPTBattery(config) and .run():

import initFPTBattery from 'fpt-battery';

class TrimmedBayesianUpdate extends initFPTBattery.tasks.BayesianUpdate {
    constructor(name, media_basepath, jsPsych, custom_task_settings) {
        super(name, media_basepath, jsPsych, custom_task_settings);
        // your customizations — mutate this.task_data, this.settings, etc.
    }
}

const config = { tasks: [{ task_name: 'bayesian_update', custom_task_settings: {} }] };
const fpt_battery = initFPTBattery(config);
fpt_battery.task_registry.bayesian_update = TrimmedBayesianUpdate;
fpt_battery.run();

If your subclass needs to accept new settings, extend the parent's schema. Task settings are validated against a static get_settings_schema() on each task class, and strict validation rejects any unknown key in custom_task_settings. Spread the parent's schema (and invariants, if you need cross-field checks) and add your own descriptors:

class TrimmedBayesianUpdate extends initFPTBattery.tasks.BayesianUpdate {
    static get_settings_schema() {
        return {
            ...super.get_settings_schema(),
            drop_unique_trials_with_ball_splits: {
                type: 'array',
                default: [],
                description: 'Pairs of ball_split values; one matching unique trial is dropped per spec.',
            },
        };
    }

    // Optionally extend cross-field invariants the same way:
    static get_settings_invariants() {
        return [
            ...super.get_settings_invariants(),
            // { check: (s) => ..., message: '...', fields: [...] },
        ];
    }

    constructor(name, media_basepath, jsPsych, custom_task_settings) {
        super(name, media_basepath, jsPsych, custom_task_settings);
        // safe to read this.settings.drop_unique_trials_with_ball_splits here
    }
}

Both get_settings_schema() and get_settings_invariants() are static. The parent's constructor calls this.constructor.get_settings_schema() (not the parent class directly), so JS's static-method dispatch picks up the subclass's override.
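The dispatch can be seen in isolation with a minimal sketch. BaseTask and TrimmedTask are hypothetical stand-ins rather than fpt-battery classes, and the validation loop is only an assumption about what strict validation might look like, not the library's actual code:

```javascript
// Hypothetical stand-in for a built-in task class (NOT fpt-battery's real code).
const has_own = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

class BaseTask {
    static get_settings_schema() {
        return { n_trials: { type: 'number', default: 10 } };
    }

    constructor(custom_task_settings = {}) {
        // this.constructor is the class used with `new`, so a subclass's
        // static override is the one that runs here.
        const schema = this.constructor.get_settings_schema();

        // Strict validation (assumed behaviour): reject keys the schema does not declare.
        for (const key of Object.keys(custom_task_settings)) {
            if (!has_own(schema, key)) {
                throw new Error(`Unknown setting: ${key}`);
            }
        }

        // Start from the schema defaults, then let the caller's values win.
        this.settings = {};
        for (const [key, descriptor] of Object.entries(schema)) {
            this.settings[key] = has_own(custom_task_settings, key)
                ? custom_task_settings[key]
                : descriptor.default;
        }
    }
}

class TrimmedTask extends BaseTask {
    static get_settings_schema() {
        // Inside a static method, `super` refers to the parent class.
        return { ...super.get_settings_schema(), drop_specs: { type: 'array', default: [] } };
    }
}

const trimmed = new TrimmedTask({ drop_specs: [[1, 2]] });
console.log(Object.keys(trimmed.settings)); // [ 'n_trials', 'drop_specs' ]
```

Passing drop_specs to BaseTask directly would throw in this sketch, because only the subclass's schema declares that key.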

From a CDN, initFPTBattery.tasks hangs off the global initFPTBattery the same way.

Class names are PascalCase versions of the task_name strings: bayesian_update → BayesianUpdate, number_series → NumberSeries, and so on. See src/tasks for each class's source.
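As a rough illustration of the naming rule, the conversion could be written as below. to_class_name is a hypothetical helper, not something the package exports, and acronym-style names (for example the admc_* tasks) may be cased differently in the real classes, so check src/tasks before relying on it:

```javascript
// Hypothetical helper: derive a PascalCase class name from a task_name string.
function to_class_name(task_name) {
    return task_name
        .split('_')
        .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
        .join('');
}

console.log(to_class_name('bayesian_update')); // BayesianUpdate
console.log(to_class_name('number_series'));   // NumberSeries
```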

Simulation

fpt-battery can run an entire battery without a participant, auto-generating a response for every trial. This powers quick local debugging and the project's automated tests. It is built directly on jsPsych's simulation mode, so the interface mirrors jsPsych's own simulate().

`battery.simulate(mode, options)`

simulate() is the simulation counterpart of run() — build the battery the same way, then call simulate() instead of run():

const battery = initFPTBattery(config);

await battery.simulate('data-only');            // headless: fabricate data, render nothing
await battery.simulate('visual');               // render the UI and auto-drive it (the default)
await battery.simulate('visual', { /* … */ });  // …with per-trial / per-task overrides

| Param | Default | Meaning |
|---|---|---|
| mode | "visual" | "data-only" fabricates each trial's data with no rendering (fast and headless). "visual" renders the UI and auto-acts on it (canvas tasks need a real browser). |
| options | {} | Per-trial / per-task simulation overrides (see below). Mirrors jsPsych's native simulation_options map. |

Overriding individual trials and tasks

options is keyed like jsPsych's simulation_options, with one fpt-battery addition (task-name keys). A key is one of three kinds:

A trial-name key — e.g. "number_series_test_trial" — is a free-form jsPsych simulation-options entry: { mode, simulate, data } (the response time lives in data.rt). It is merged over the task's built-in default for that trial. See the catalog below for every valid key.
A task-name key — e.g. "number_series" — is a directive applied to every trial of that task. Only simulate and rt are read (rt is sugar for data.rt); any other field is ignored.
default is passed straight through to jsPsych, which deep-merges it into every trial (the trial's own values win). It is the slot for a global rt.
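The "merged over" and "trial's own values win" behaviour can be pictured with a generic deep merge. This is a sketch of the general technique only; deep_merge and is_plain_object are illustrative names and not jsPsych's or fpt-battery's actual implementation:

```javascript
// Generic deep merge: values from `override` win over `base`.
// Plain objects are merged recursively; everything else is replaced.
function is_plain_object(value) {
    return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function deep_merge(base, override) {
    const result = { ...base };
    for (const [key, value] of Object.entries(override)) {
        result[key] = is_plain_object(value) && is_plain_object(result[key])
            ? deep_merge(result[key], value)
            : value;
    }
    return result;
}

const global_default = { data: { rt: 500 } };
const trial_entry = { data: { response: 'A' } };
const fast_trial_entry = { data: { rt: 800 } };

console.log(deep_merge(global_default, trial_entry));      // { data: { rt: 500, response: 'A' } }
console.log(deep_merge(global_default, fast_trial_entry)); // { data: { rt: 800 } }
```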

simulate / rt resolve by precedence, highest first: per-trial → per-task → default.

await battery.simulate('data-only', {
  default: { data: { rt: 500 } },                   // every trial responds in 500 ms…
  leapfrog: { rt: 1200 },                            // …except leapfrog's trials (1200 ms)…
  number_series_test_trial: { data: { rt: 800 } },   // …except this one trial (800 ms).
});

Set simulate: false on a trial-name or task-name key to run that part for real instead of simulating it. This is honoured in visual mode; data-only always simulates (forcing simulate: true and warning), because a real trial would hang headless.

Global-rt gotcha. A top-level rt on default is ignored — jsPsych only reads data from a simulation-options entry. Use default: { data: { rt: 500 } }, not default: { rt: 500 }. (The task-name rt key is fpt-battery's own sugar, so { leapfrog: { rt: 500 } } works directly.)
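The precedence rule and the two gotchas above can be summarised in a small resolver. This is a sketch of the documented behaviour, not the battery's internal code; resolve_rt and resolve_simulate are hypothetical names, and the fallback of simulating when nothing is specified is an assumption:

```javascript
// Documented precedence: per-trial, then per-task, then default.
function resolve_rt(options, task_name, trial_key) {
    const candidates = [
        options[trial_key]?.data?.rt, // trial-name key: rt lives in data.rt
        options[task_name]?.rt,       // task-name key: rt is sugar for data.rt
        options.default?.data?.rt,    // default: only data.rt is read, a top-level rt is ignored
    ];
    return candidates.find((rt) => rt !== undefined);
}

function resolve_simulate(options, task_name, trial_key, mode) {
    // data-only always simulates (the real battery also logs a warning).
    if (mode === 'data-only') return true;
    const candidates = [
        options[trial_key]?.simulate,
        options[task_name]?.simulate,
        options.default?.simulate,
    ];
    const requested = candidates.find((flag) => flag !== undefined);
    return requested ?? true; // assumption: simulate unless told otherwise
}

const options = {
    default: { data: { rt: 500 } },
    leapfrog: { rt: 1200, simulate: false },
    number_series_test_trial: { data: { rt: 800 } },
};

console.log(resolve_rt(options, 'number_series', 'number_series_test_trial'));          // 800
console.log(resolve_rt(options, 'leapfrog', 'leapfrog_test_trial'));                    // 1200
console.log(resolve_rt(options, 'raven_matrix', 'raven_matrix_test_trial'));            // 500
console.log(resolve_rt({ default: { rt: 500 } }, 'leapfrog', 'leapfrog_test_trial'));   // undefined
console.log(resolve_simulate(options, 'leapfrog', 'leapfrog_test_trial', 'visual'));    // false
console.log(resolve_simulate(options, 'leapfrog', 'leapfrog_test_trial', 'data-only')); // true
```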

Available `simulation_options` keys

Every simulatable trial carries a simulation_options identifier — these are the exact trial-name keys you can target in options. A key marked with a built-in simulated data payload already fabricates that field (you can still override it); an unmarked key falls back to the underlying plugin's own default simulation. Every task name is additionally valid as a directive key ({ simulate, rt }); the battery's own framework trials (welcome, debrief, data-save, etc.) carry battery_-prefixed keys.

Generated by npm run docs:sim from each task's built timeline and get_default_simulation_options() — do not edit by hand.

`number_series`

| simulation_options key | Built-in simulated data |
|---|---|
| number_series_general_instructions | — |
| number_series_test_trial | response (function) |

`leapfrog`

| simulation_options key | Built-in simulated data |
|---|---|
| leapfrog_general_instructions | — |
| leapfrog_pt_trial | — |
| leapfrog_pt_trial_after_interblock | — |
| leapfrog_pt_trial_interblock_question | responses (function) |
| leapfrog_pt_trials_instructions | — |
| leapfrog_test_trial | — |
| leapfrog_test_trials_instructions | — |
| leapfrog_trial_feedback | — |

`cognitive_reflection`

| simulation_options key | Built-in simulated data |
|---|---|
| cognitive_reflection_test_trial | — |

`admc_framing`

| simulation_options key | Built-in simulated data |
|---|---|
| admc_framing_a1_instructions | — |
| admc_framing_a2_instructions | — |
| admc_framing_rc1_instructions | — |
| admc_framing_rc2_instructions | — |
| admc_framing_test_trial | — |

`admc_decision_rules`

| simulation_options key | Built-in simulated data |
|---|---|
| admc_decision_rules_general_instructions | — |
| admc_decision_rules_test_trial | — |

`admc_risk_perception`

| simulation_options key | Built-in simulated data |
|---|---|
| admc_risk_perception_general_instructions | — |
| admc_risk_perception_test_trial_a | valid_responses_n (function) |
| admc_risk_perception_test_trial_b | valid_responses_n (function) |

`bayesian_update`

| simulation_options key | Built-in simulated data |
|---|---|
| bayesian_update_between_unique_trials_trial | — |
| bayesian_update_general_instructions | — |
| bayesian_update_pt_trials_instructions | — |
| bayesian_update_test_trial | — |
| bayesian_update_test_trials_instructions | — |

`coherence_forecasting`

| simulation_options key | Built-in simulated data |
|---|---|
| coherence_forecasting_general_instructions | — |
| coherence_forecasting_test_trial | — |

`denominator_neglect`

| simulation_options key | Built-in simulated data |
|---|---|
| denominator_neglect_fixation_cross | — |
| denominator_neglect_general_instructions | — |
| denominator_neglect_pt_trials_instructions | — |
| denominator_neglect_test_trial | — |
| denominator_neglect_test_trial_feedback | — |
| denominator_neglect_test_trials_instructions | — |

`graph_literacy`

| simulation_options key | Built-in simulated data |
|---|---|
| graph_literacy_test_trial | valid_text_responses_n (number), valid_radio_responses_n (number) |

`impossible_question`

| simulation_options key | Built-in simulated data |
|---|---|
| impossible_question_general_instructions | — |
| impossible_question_test_trial | valid_responses_prop (number) |
| impossible_question_test_trial_feedback | — |

`raven_matrix`

| simulation_options key | Built-in simulated data |
|---|---|
| raven_matrix_general_instructions | — |
| raven_matrix_pt_trials_instructions | — |
| raven_matrix_test_trial | — |
| raven_matrix_test_trials_instructions | — |

`time_series`

| simulation_options key | Built-in simulated data |
|---|---|
| time_series_general_instructions | — |
| time_series_pt_trials_instructions | — |
| time_series_test_trial | — |
| time_series_test_trials_instructions | — |

`forecasting_questions`

| simulation_options key | Built-in simulated data |
|---|---|
| forecasting_questions_instructions_confirmation | response (number) |
| forecasting_questions_test_trial | responses (object) |

Simulating a custom task

If you add or replace a task and want it to join the data-only pass, follow the one invariant and the small contract the built-in tasks use:

No DOM in data-only hooks. jsPsych calls on_start / on_load / on_finish in both simulation modes, but data-only renders nothing, so any document.querySelector(...) in a hook throws. Guard every DOM-touching hook with an early return when this.jsPsych?.simulation_mode === "data-only", and gate visual-only autofill on this.jsPsych?.simulation_mode != null. Under run() the mode is null, so production is unaffected.
Tag each simulatable trial with a simulation_options string of the form <task.name>_<suffix> (e.g. simulation_options: `${this.name}_test_trial`) so the battery — and your overrides — can address it.
Declare per-trial defaults with an optional get_default_simulation_options() returning a map of <trial-name key> → { data: { … } }. Put response content there (never rt), and make a value a function when it must be re-evaluated per trial:
```
get_default_simulation_options() {
  return {
    [`${this.name}_test_trial`]: {
      data: { response: () => ({ [this.jsPsych.timelineVariable('id')]: '42' }) },
    },
  };
}
```
Bespoke widgets (canvas, heavy animation) that a stock plugin can't drive get their own jsPsych plugin that owns both simulation paths in one file: a trial() that renders for real, and a simulate(trial, mode, options, on_load) that fabricates the data with no DOM under data-only and drives the rendered widget under visual. This is how time_series (the time-series-chart plugin) and bayesian_update (bayesian-ball) work; both classes are exposed on initFPTBattery.plugins (mirroring initFPTBattery.tasks) so you can reuse or subclass them.
Add a shape contract in the data-only test harness (see tests/) so a future regression fails CI.
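The first two points of the contract above (guarded hooks and tagged trials) might look like this in a custom task. MyTask and build_test_trial are hypothetical names, the jsPsych argument is a stub so the sketch runs without a browser, and the dom_touched flag merely stands in for real DOM work:

```javascript
// Hypothetical custom task; the jsPsych object passed in is a stub.
class MyTask {
    constructor(name, jsPsych) {
        this.name = name;
        this.jsPsych = jsPsych;
        this.dom_touched = false;
    }

    build_test_trial() {
        return {
            // Tag so the battery and your overrides can address this trial: <task.name>_<suffix>.
            simulation_options: `${this.name}_test_trial`,
            on_load: () => {
                // data-only renders nothing, so skip any DOM work.
                if (this.jsPsych?.simulation_mode === 'data-only') return;
                this.dom_touched = true; // stands in for document.querySelector(...) work
            },
        };
    }
}

const headless = new MyTask('my_task', { simulation_mode: 'data-only' });
headless.build_test_trial().on_load();
console.log(headless.dom_touched); // false

const live = new MyTask('my_task', { simulation_mode: null });
live.build_test_trial().on_load();
console.log(live.dom_touched); // true
```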

Development

Local dev

npm install
Create a demo/ directory with an index.html and index.js:

demo/index.html

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>FPT Battery Demo</title>
    </head>
    <body></body>
    <script type="module" src="index.js"></script>
</html>

demo/index.js

import initFPTBattery from '../src/index.js';

const config = {
    tasks: [
        { task_name: 'number_series', custom_task_settings: {} },
    ],
    media_basepath: './assets/',
    skip_intro_trials: true,
};

const fpt_battery = initFPTBattery(config);
fpt_battery.run();

npm run dev — symlinks assets into demo/ and starts a Vite dev server on localhost:3001 with hot reload
Edit demo/index.js to test different tasks and configurations

Testing

NB: The tests and testing-related functionality were developed almost entirely by AI.

The suite combines Vitest (source/API tests under JSDOM) with Playwright (one browser smoke against the built UMD bundle). Tests live under tests/.

npm test — fast source-only Vitest run. Covers the public API surface, customization hooks (task_registry, trial_generators, subclassing), every task's constructor, deepMerge semantics, and the DataSaver callback contract. No build required.
npm run test:dist — builds the package, then runs:
- Dist exports check (tests/dist-exports.test.js) — every path declared in package.json exports exists in dist/, and the built ES bundle exposes the same public shape as the source.
- Playwright smoke (tests/browser/smoke.spec.js) — loads a fixture HTML that consumes the UMD bundle via <script> and the CSS via <link> (the same way unpkg/CDN consumers do), asserts the initFPTBattery global is exposed, the stylesheet returns 200, and a minimal cognitive_reflection battery renders its first trial.

Playwright needs its Chromium binary; the first run installs it automatically via npx playwright install --with-deps chromium. CI does this explicitly.

Both pipelines run on every PR and push to main via .github/workflows/test.yml. The publish workflow also runs them before npm publish, so a broken build or regression cannot reach npm.

Linting & formatting

We use Biome for both linting and formatting. Config lives in biome.json and scope is limited to src/.

npm run format — apply formatting fixes
npm run lint — run the linter (no fixes applied)
npm run check — run lint + format checks together; exits non-zero on any error

npm run check is the gate. It runs locally via prepublishOnly before npm publish, and in CI before the build step — see below.

Versioning & commit messages

We follow Semantic Versioning for releases and Conventional Commits for commit messages.

Semantic versioning

The version in package.json is MAJOR.MINOR.PATCH:

MAJOR — incompatible API changes (e.g. removing a task, renaming a config field, changing the shape of onSaveData arguments).
MINOR — new functionality in a backwards-compatible way (e.g. adding a new task, adding a new optional config field).
PATCH — backwards-compatible bug fixes and internal-only changes (e.g. fixing a trial bug, dependency bumps that do not change behaviour, doc-only changes).

While the package is pre-1.0.0, MINOR bumps may occasionally include breaking changes — but try to avoid this. When in doubt, bump higher.

Conventional commits

Each commit message starts with a type, a colon, and a short imperative description. No scope. Example:

feat: add denominator_neglect task
fix: handle empty response in bayesian_update

Breaking changes are marked with a ! after the type, and explained in the body:

feat!: rename onSaveData argument order

BREAKING CHANGE: chunks and checkpoint arguments are swapped.

Commonly used types:

| Type | When to use | Typical bump |
|---|---|---|
| feat | A new feature (new task, new config option, new public API) | MINOR |
| fix | A bug fix in existing behaviour | PATCH |
| docs | Documentation-only changes (README, comments, JSDoc) | PATCH or none |
| style | Formatting, whitespace, biome auto-fixes (no logic changes) | none |
| refactor | Code change that neither fixes a bug nor adds a feature | PATCH |
| perf | Performance improvement | PATCH |
| test | Adding or fixing tests | none |
| build | Build system or dependency changes (Vite, package.json deps) | PATCH |
| ci | CI configuration changes (.github/workflows/*) | none |
| chore | Other maintenance that doesn't fit above (e.g. version bumps) | none |

Any commit with ! or a BREAKING CHANGE: footer implies a MAJOR bump (or a MINOR bump pre-1.0).
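To make the mapping concrete, here is a small sketch that suggests a bump from a commit message, following the table and the pre-1.0 rule above. suggest_bump is a hypothetical helper and not a script shipped in this repository:

```javascript
// Typical bump per commit type, taken from the table above.
const BUMP_BY_TYPE = {
    feat: 'minor', fix: 'patch', docs: 'patch or none', style: 'none', refactor: 'patch',
    perf: 'patch', test: 'none', build: 'patch', ci: 'none', chore: 'none',
};

function suggest_bump(commit_message, current_version) {
    const [header] = commit_message.split('\n');
    const match = /^([a-z]+)(!)?: .+/.exec(header);
    if (!match) throw new Error(`Not a conventional commit header: ${header}`);

    const [, type, bang] = match;
    const is_breaking = Boolean(bang) || /^BREAKING CHANGE: /m.test(commit_message);
    if (is_breaking) {
        const major = Number(current_version.split('.')[0]);
        return major === 0 ? 'minor' : 'major'; // pre-1.0: breaking changes bump MINOR
    }
    if (!Object.prototype.hasOwnProperty.call(BUMP_BY_TYPE, type)) {
        throw new Error(`Unknown commit type: ${type}`);
    }
    return BUMP_BY_TYPE[type];
}

console.log(suggest_bump('feat: add denominator_neglect task', '0.6.1'));            // minor
console.log(suggest_bump('fix: handle empty response in bayesian_update', '0.6.1')); // patch
console.log(suggest_bump('feat!: rename onSaveData argument order', '1.2.0'));       // major
console.log(suggest_bump('feat!: rename onSaveData argument order', '0.6.1'));       // minor
```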

Publishing a new version

Run npm run check to verify lint and formatting pass
Run npm test to verify the source-level tests pass
Run npm run test:dist to verify the build + dist exports + browser smoke pass
Run scripts/release.sh <version> (e.g. scripts/release.sh 0.3.0). It runs npm version --no-git-tag-version to bump both package.json and package-lock.json in lockstep, then commits, tags, and pushes the commit and tag separately.

Pushing a v* tag triggers the GitHub Actions workflow, which runs npm run check, npm test, builds, runs npm run test:dist, and publishes to npm automatically. If any of those steps fails, the workflow stops before publishing, so a tag push on a dirty or regressed branch will not reach npm. The tag must be pushed separately from the commit, otherwise GitHub may not trigger the workflow; scripts/release.sh handles this.

Note: The release commit should contain only the version bump, not other code changes — stage and commit anything else before running the script.

Assets

If you add a new asset, you must also add it to the hosted path (see the media_basepath config).

Building the JATOS .jzip

jatos/fpt-battery.jzip is the JATOS-importable study bundle. It is generated, not hand-edited, and is rebuilt deterministically by scripts/build-jzip.mjs from three sources:

jatos/study.json — stable study/component/batch UUIDs and titles. These match the UUIDs in the currently shipped .jzip, so re-importing newer builds updates the same JATOS study rather than creating a duplicate. Bump a UUID only if you mean to ship a brand-new study.
jatos/index.html.template — the runtime index.html that the JATOS component loads. Edit this when the public fpt-battery API, the data_saving chunk callback shape, or the jatos.appendResultData / jatos.endStudy wiring changes. The placeholder __FPT_BATTERY_VERSION__ is substituted with package.json version at build time so the bundled page loads https://unpkg.com/fpt-battery@<that-version> instead of @latest (downloaded .jzips won't silently break on a future breaking npm release).
assets/ — bundled under <study-uuid>/assets/fpt/img/ inside the zip, matching the template's media_basepath: './assets/fpt/img/'.
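The version-pinning substitution described for jatos/index.html.template amounts to a plain string replacement. The sketch below is illustrative only: the real logic lives in scripts/build-jzip.mjs, pin_version is a hypothetical name, and the template line is an assumed example rather than the file's actual content:

```javascript
// Hypothetical sketch of the placeholder substitution (not the real build script).
function pin_version(template, version) {
    // Replace every occurrence of the placeholder with the package version.
    return template.split('__FPT_BATTERY_VERSION__').join(version);
}

// Assumed template line, for illustration only.
const template = '<script src="https://unpkg.com/fpt-battery@__FPT_BATTERY_VERSION__"></script>';
console.log(pin_version(template, '0.6.1'));
// <script src="https://unpkg.com/fpt-battery@0.6.1"></script>
```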

Commands:

npm run build:jzip — regenerate jatos/fpt-battery.jzip from the sources above. Output is byte-deterministic (fixed mtimes, sorted entries) so successive builds with the same inputs hash identically.
scripts/release.sh <version> runs npm run build:jzip automatically after the version bump, so every released .jzip is version-pinned in lockstep with the published npm package.

Patching jsPsych

We depend on a small set of modifications to jsPsych and (eventually) its plugins. These live as version-tagged diff files under patches/, managed by patch-package and applied automatically by the prepare script.

To add or modify a patch:

Edit the file under node_modules/<pkg>/... directly.
Run npx patch-package <pkg> — this writes/updates patches/<pkg>+<version>.patch.
Commit the .patch file and add an entry to patches/README.md explaining what and why.

prepare runs patch-package --error-on-fail, so a version bump that the patch can't apply to fails npm install outright — version drift is a hard error, not a silent warning. Bump jsPsych / plugin versions and patches in the same commit, regenerating patches against the new version.

We use the prepare lifecycle (not postinstall) so that patch-package only runs when npm install is invoked inside this repo. Consumers installing fpt-battery from npm or a tarball never execute it — patch-package is a devDependency and would not be on their PATH.

Scope of the patches: patch-package only runs at install-time in this repo, so the patches are baked into dist/ when we build. Consumers installing fpt-battery get the patched behaviour everywhere it's used inside the bundle. If a consumer also imports jspsych directly in their own code (e.g. to write a custom plugin), that import resolves to their own unpatched node_modules/jspsych — the patches do not propagate.

TODO

[ ] add "write_event_to_interaction_data" patch on jsPsych core
[ ] check whether task durations can also be inferred from the timer settings
[ ] Add better assets management - autopublish/link on the hosted path or create a new server.
[ ] conditional_forecasting_task

Future potential upgrades

TypeScript
Zod validation (rather than the current hand-rolled one)
package release flow
- move jatos .jzip to that release bundle
use PRs rather than direct commits to main
