Horizon Framework (`@iucteam/horizon`) — Developer Documentation
Version: 2.2.x • Audience: Developers / Contributors • Licence: Apache‑2.0
Contents
- Overview
- Quick start (run and test locally)
- Think → Reason → Evaluate lifecycle (canonical phases)
- Agent reference (Director, Sensor, Recorder, RemoteAgent)
- EvaluationBrain — detailed behaviour
- Workflows, Tasks, Observations — domain model
- Models, Prompts and Memory
- Service primitives (TaskProcessor, Discovery, EventBus)
- Extending Horizon (Probes, Repositories, Models, Agents)
- Examples (end‑to‑end) + PlantUML diagrams
- Tests, CI and contributing
1. Overview
Horizon is a TypeScript framework for building hierarchical multi‑agent systems that combine programmatic workflows and LLM‑assisted reasoning. It is explicitly designed for developers: small, composable primitives (Agents, Brains, Probes, Repositories, Models and Workflows) that you can extend and compose into production systems or research prototypes.
Design principles:
- Single responsibility: each agent type has a narrow role.
- Observations as the lingua franca: agents exchange state as
`Observation` objects (named or unnamed).
- Pluggable reasoning: Brains mediate between requirements and actionable plans using models and memory.
- Deterministic + AI workflows: support both hand‑coded `Workflow` graphs and LLM‑generated plans.
- Testability: NoOp implementations and local discovery are available to write deterministic tests.
2. Quick start
Prerequisites: Node >= 18. Clone the repo, install, build and run tests.
```bash
git clone <your-repo>
cd packages/core
npm ci
npm run build
npm test
```

Run the integration tests in `packages/integration-test` to see realistic scenarios (DynamoDB team, workflow directors).
3. Think → Reason → Evaluate (canonical phases)
Every agent in Horizon follows the same high‑level lifecycle. This lifecycle is the organising principle for how Brains, Models and Memory cooperate, and it is worth understanding because unit tests, instrumentation and extension points all align to these phases.
Phase 1 — THINK
- Purpose: translate a requirement (natural language or structured) into an action plan.
- Who: primarily Directors (planning) and SensorBrains (parameter extraction).
- Input:
`PlanningSubject` or `RequirementSubject` (includes requirement string, available agents, errors, and contextual observations).
- Output: `Plan`, `Workflow`, or `ParameterThought` — structured enough to be executed.
Typical responsibilities of THINK:
- Use a thinking model (e.g.
`LangChainThinkingModel`, `WorkflowThinkingModel`) to decompose a goal into tasks.
- Consult `Memory` (if available) using a `minMemoryConfidence` threshold — reuse cached plans when confidence is high.
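A minimal sketch of the memory‑aware THINK gating described above. The shapes here (`recall`, `RememberedPlan`, the brain/memory interfaces) are assumptions made for illustration, not the real Horizon API; only the `minMemoryConfidence` threshold idea comes from this section.

```ts
// Hypothetical shapes — illustrative only, not the actual Horizon classes.
interface RememberedPlan { plan: unknown; confidence: number }
interface MemoryLike { recall(key: string): Promise<RememberedPlan | undefined> }
interface BrainLike<S, P> { think(subject: S): Promise<P> }

async function thinkWithMemory<S extends { requirement: string }, P>(
  brain: BrainLike<S, P>,
  memory: MemoryLike,
  subject: S,
  runtimeConfig: { minMemoryConfidence: number }
): Promise<P> {
  // Reuse a cached plan when its confidence clears the configured threshold.
  const remembered = await memory.recall(subject.requirement)
  if (remembered && remembered.confidence >= runtimeConfig.minMemoryConfidence) {
    return remembered.plan as P
  }
  // Otherwise decompose the requirement with the thinking model behind the brain.
  return brain.think(subject)
}
```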
Phase 2 — REASON
- Purpose: process raw observations produced by actuation and produce candidate answers or intermediate inferences.
- Who: Director (aggregating results), Sensor (optionally post‑processing), Recorder (validation of persisted result).
- Input:
`ObservingSubject` (observations, parameters, requirement).
- Output: `Candidate` (an answer/value with confidence and provenance).
Typical responsibilities of REASON:
- Run a reasoning model (e.g.
`LangChainReasoningModel`) to synthesise observations.
- Produce `Candidate` objects with confidence scores and optional provenance.
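A hedged sketch of the REASON step. The `Candidate` fields shown (`value`, `confidence`, `provenance`) mirror the description above, but the types and the toy scoring heuristic are assumptions; a real brain would delegate this to a reasoning model such as `LangChainReasoningModel`.

```ts
// Illustrative shapes only — not the actual Horizon classes.
interface ObservationLike { name?: string; value: string }
interface CandidateLike { value: string; confidence: number; provenance?: string[] }

// Synthesise a Candidate from raw observations.
function reason(observations: ObservationLike[]): CandidateLike {
  const value = observations.map(o => o.value).join('\n')
  return {
    value,
    confidence: observations.length > 0 ? 0.9 : 0.0, // toy heuristic, not the real scoring
    provenance: observations.map(o => o.name ?? 'unnamed'),
  }
}
```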
Phase 3 — EVALUATE
- Purpose: determine whether a Candidate satisfies the original requirement and decide to accept, retry, or escalate.
- Who: Director or Evaluation brain responsible for final acceptance gating.
- Input:
`EvaluatingSubject` (requirement, candidate, observations, additional params).
- Output: `Evaluation` (accepted: boolean, score, optional feedback).
Typical responsibilities of EVALUATE:
- Apply heuristics, policies or an evaluating LLM to accept or reject a Candidate.
- Provide explicit feedback to be stored or used in subsequent planning iterations.
4. Agents (reference)
Base Agent
All agents implement a common interface and lifecycle. Key fields:
- `metadata: AgentMetadata` — { agentId, agentType, name, description, endpoint }
- `brain: Brain` — pluggable decision maker
- `runtimeConfig: RuntimeConfig` — execution limits (maxSteps, minMemoryConfidence)
Core methods:
- `accept(task: Task): Promise<Future>` — entry point to accept and process a task
- `doExecute(task: Task, sync?: boolean)` — internal executor performing THINK→ACT→REASON→EVALUATE
Agents emit lifecycle events via the global EventBus (onTaskStarted, onTaskCompleted, onTaskFailed).
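A hedged usage sketch of this base Agent surface. The `accept()` entry point and the event names come from this section; the `on(...)` subscription call, the stub types, and the task literal shape are assumptions for illustration.

```ts
// Illustrative stubs — the real Agent, EventBus and Task types may differ.
declare const agent: { accept(task: object): Promise<unknown> }
declare const eventBus: { on(event: string, handler: (payload: unknown) => void): void }

eventBus.on('onTaskCompleted', payload => console.log('task completed', payload))
eventBus.on('onTaskFailed', payload => console.error('task failed', payload))

// Hand a Task to the agent and await its Future.
const future = await agent.accept({
  id: 'task-1',
  taskType: 'requirement',               // illustrative value
  agentType: 'director',
  parameters: { requirement: 'Summarise the latest build report' },
})
```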
Director
Role: plan and coordinate.
Behaviour:
- `think()` produces a `Workflow`.
- Executes each planned Task by discovering collaborators (local or remote) and delegating execution.
- Runs REASON on aggregated observations and EVALUATE on returned Candidates.
Important: Directors use DirectorBrain which wires thinking, observing and memory models.
Sensor
Role: collect data.
Behaviour:
- Use
`SensorBrain` THINK to extract parameters according to a probe schema.
- Call its `Probe.generateObservations()` (async iterator) to produce observations.
- Optionally run REASON to post‑process observations.
Probes include RestApiProbe, WebSearchProbe, WikipediaProbe and a NoOpProbe for tests.
Recorder
Role: persist data.
Behaviour:
- THINK to extract persistence parameters (e.g. which fields to save)
- `Repository.execute()` to perform save/delete
- Generate an Observation representing storage success/failure
Repository interface is intentionally small so you can implement connectors for DynamoDB, filesystem, SQL, etc.
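To show how small the contract is, here is a hedged sketch of an in‑memory repository following the `RecorderRepository` shape listed in section 9. The `ParameterThought`/`Observation` stand‑ins and the `get(...)` accessor are assumptions; swap in the real Horizon types when writing a connector.

```ts
// Stand-in shapes for illustration — use the real Horizon types in practice.
type ParameterThoughtLike = { get(name: string): string | undefined }
type ObservationLike = { name?: string; value: string }

class InMemoryRepository /* shaped like RecorderRepository */ {
  private readonly store = new Map<string, string>()

  async defineParameterSchema(): Promise<Record<string, string>> {
    // Tells the Recorder's THINK phase which parameters to extract.
    return { key: 'unique key to store the value under', value: 'the value to persist' }
  }

  async execute(parameters: ParameterThoughtLike, _context: ObservationLike[]): Promise<ObservationLike> {
    const key = parameters.get('key')
    const value = parameters.get('value')
    if (!key || value === undefined) {
      return { name: 'storage', value: 'failed: missing key or value' }
    }
    this.store.set(key, value)
    // The returned Observation reports storage success back to the Recorder.
    return { name: 'storage', value: `saved ${key}` }
  }
}
```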
RemoteAgent
Role: forward tasks to remote Horizon instances over HTTP.
Behaviour: wrap remote endpoints behind the same Agent API so Directors can be location‑agnostic.
5. EvaluationBrain (detailed)
The EvaluationBrain (referred to in code as the RecorderBrain or evaluation component depending on agent) is responsible for the EVALUATE phase: deciding if the produced Candidate satisfies the requirement.
Key responsibilities:
- Take an `EvaluatingSubject` which contains:
  - `requirement` (Stringifiable)
  - `candidate` (Candidate)
  - `observations` (Observation[])
  - `additionalParameters`
- Use a configured evaluating model (via `ModelFactory.get(agentId, 'evaluating')`) to produce an `Evaluation` object.
- Return an `Evaluation` with:
  - `accepted: boolean` — whether the candidate is acceptable
  - `score: number` — numeric metric useful for ranking / thresholding
  - `feedback?: Stringifiable` — optional textual guidance for retries or a human readable reason
Implementation notes:
- The default
`LangChainEvaluatingModel` uses a system + human prompt split and yields a short evaluated string that is then wrapped by `DefaultStringifiable`.
- Evaluators should be idempotent and deterministic where possible — otherwise tests will be flaky.
- The Director will check `Evaluation.accepted` to either finish or continue to the next attempt (based on `runtimeConfig.maxSteps`).
Example: accept only if score >= 0.8 and no critical feedback.
```ts
const evaluation = await evaluationBrain.evaluate(
  new EvaluatingSubject(requirement, candidate, observations, { agentId })
)
// Gate on both the boolean verdict and the numeric score; inspect evaluation.feedback for critical issues.
if (!evaluation.accepted || evaluation.score < 0.8) {
  // feed back into planning or raise a manual review event
}
```

6. Workflows, Tasks, Observations — domain model
Workflow
A Workflow is a directed graph of Nodes. Nodes can be:
- Simple nodes that contain a list of
`Step` objects
- End nodes
- Default/start nodes
Workflows support parameterised tasks: nodes can set requirement and observationName and pass contextual observations forward.
Task
A Task contains:
- `id`
- `taskType`
- `agentType` (director, sensor, recorder)
- `parameters` (including `requirement`)
- `deliveryMetadata` (transport hints)
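For orientation, a hedged example of what a Task with those fields might look like as a plain object; the concrete values and the `deliveryMetadata` keys are illustrative, not taken from the real codebase.

```ts
// Illustrative Task shape based on the fields listed above; exact types may differ.
const task = {
  id: 'task-42',
  taskType: 'requirement',
  agentType: 'sensor' as const,
  parameters: {
    requirement: 'Fetch the current weather for Berlin',
  },
  deliveryMetadata: {
    sync: true, // transport hint, e.g. synchronous execution (assumed key)
  },
}
```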
Observation
Observations can be:
- `SimpleObservation` (string)
- `ObjectObservation` (structured JSON)
- `DocumentObservation` (long text)
Observations include parameters (metadata) and name (optional named observation). Use `aggregateNamedObservations()` helpers to collect named results.
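A hedged sketch of working with named observations. The observation type and the local `aggregateNamedObservations` stand‑in below are assumptions made only to show the naming/aggregation idea; the real classes and helper may have different signatures.

```ts
// Stand-in observation type and helper for illustration only.
type NamedObservation = { name?: string; value: string }

function aggregateNamedObservations(observations: NamedObservation[]): Record<string, string[]> {
  const byName: Record<string, string[]> = {}
  for (const o of observations) {
    if (!o.name) continue // unnamed observations are skipped here
    ;(byName[o.name] ??= []).push(o.value)
  }
  return byName
}

const observations: NamedObservation[] = [
  { name: 'diagram', value: '@startuml ... @enduml' },
  { name: 'summary', value: 'Three services, one event bus.' },
  { value: 'unnamed intermediate note' },
]

console.log(aggregateNamedObservations(observations))
// => { diagram: [...], summary: [...] }
```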
7. Models, Prompts and Memory
Horizon decouples Model implementations from Brains. Models implement process(subject) and return types appropriate to the phase.
ModelFactory
- register models with
`ModelFactory.registerModel(agentId, modelType, model)`
- `ModelType` is `'thinking' | 'reasoning' | 'evaluating'`
- `ModelFactory.loose()` returns a factory that yields `NoOpModel` when missing (handy for tests)
PromptBuilderFactory
- register prompts per agent + role (human/system + thinking/reasoning/evaluating)
- default behaviour returns a
`StringPromptLoader('{{requirement}}')` when `allowMissing` is true
Memory
- `DecayingHashMemory` and `NoOpMemory` are provided
- Brains consult Memory during THINK to reuse previous plans when confidence is high
8. Service primitives
TaskProcessor
The TaskProcessor is the orchestration entry point. It performs:
- Agent discovery (local or remote)
- Agent selection (via
`AgentSelector`)
- Delegation and synchronous/asynchronous execution
Factory helpers:
- `TaskProcessor.withLocalAgents(agents)`
- `TaskProcessor.withAgentDiscovery(agentDiscovery)`

AgentDiscovery
Local (in‑process) and remote discovery implementations are available. Register your agents with the discovery service to make them discoverable by Directors.
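A hedged end‑to‑end wiring sketch. `TaskProcessor.withLocalAgents` comes from the factory helpers above; everything else (the `process()` call, the agent stubs, the task shape) is an assumption made to illustrate the flow.

```ts
// Sketch only — structural stubs stand in for the real Horizon types.
declare const TaskProcessor: {
  withLocalAgents(agents: unknown[]): { process(task: object): Promise<unknown> }
}
declare const director: unknown   // a configured Director agent
declare const webSensor: unknown  // a configured Sensor agent

const processor = TaskProcessor.withLocalAgents([director, webSensor])

// Delegate a task; the processor discovers and selects a suitable agent.
const result = await processor.process({
  id: 'doc-1',
  agentType: 'director',
  parameters: { requirement: 'Generate the architecture documentation' },
})
```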
EventBus
Event stream for lifecycle events: onTaskStarted, onTaskCompleted, onTaskFailed — useful for metrics and observability.
9. Extending Horizon
New Probe
Implement the Probe interface:
```ts
interface Probe {
  defineParameterSchema(): Promise<Record<string, string>>
  generateObservations(parameters: ParameterThought, context: Observation[]): AsyncIteratorObject<Observation, Observation, void>
}
```

Return observations using an async iterator so sensors can stream results without blocking.
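A hedged sketch of a probe that streams observations from a list of URLs via an async generator. The stand‑in parameter/observation types are illustrative rather than the real Horizon types, but the streaming pattern mirrors the interface above.

```ts
// Stand-in types for illustration; use the real ParameterThought / Observation in practice.
type ParameterThoughtLike = { get(name: string): string | undefined }
type ObservationLike = { name?: string; value: string }

class UrlProbe {
  async defineParameterSchema(): Promise<Record<string, string>> {
    return { urls: 'comma-separated list of URLs to fetch' }
  }

  // Async generator: the Sensor can consume observations as they arrive.
  async *generateObservations(
    parameters: ParameterThoughtLike,
    _context: ObservationLike[]
  ): AsyncGenerator<ObservationLike, void, void> {
    const urls = (parameters.get('urls') ?? '').split(',').filter(Boolean)
    for (const url of urls) {
      const response = await fetch(url.trim()) // global fetch is available on Node >= 18
      yield { name: 'page', value: await response.text() }
    }
  }
}
```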
New Repository
Implement RecorderRepository:
```ts
interface RecorderRepository {
  defineParameterSchema(): Promise<Record<string, string>>
  execute(parameters: ParameterThought, context: Observation[]): Promise<Observation>
}
```

New Model
Extend `Model` and implement `process(subject)` returning a `Workflow | Stringifiable | Evaluation` depending on phase.
Register via `ModelFactory.registerModel(agentId, modelType, model)`.
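A hedged sketch of a custom evaluating model. The `Evaluation` literal follows the fields described in section 5, but the subject shape and the base‑class contract are assumptions; treat this as a pattern, not the literal API.

```ts
// Illustrative shapes only; extend the real Model base class and use the real
// EvaluatingSubject / Evaluation types in practice.
type EvaluatingSubjectLike = { requirement: string; candidate: { value: string } }
type EvaluationLike = { accepted: boolean; score: number; feedback?: string }

class KeywordEvaluatingModel {
  // Accept the candidate if it mentions enough of the requirement's keywords.
  async process(subject: EvaluatingSubjectLike): Promise<EvaluationLike> {
    const keywords = subject.requirement.toLowerCase().split(/\s+/).filter(w => w.length > 4)
    const text = subject.candidate.value.toLowerCase()
    const hits = keywords.filter(k => text.includes(k)).length
    const score = keywords.length === 0 ? 1 : hits / keywords.length
    return {
      accepted: score >= 0.8,
      score,
      feedback: score >= 0.8 ? undefined : `missing keywords: ${keywords.filter(k => !text.includes(k)).join(', ')}`,
    }
  }
}

// Registration follows section 7 / the appendix:
// ModelFactory.loose().registerModel('myAgent', 'evaluating', new KeywordEvaluatingModel())
```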
New Agent Type
Extend `Agent` and implement `doExecute()` and `getPlanningSubject()` appropriately. Use EventBus to emit lifecycle events.
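A hedged skeleton of what a custom agent might look like, based only on the method names above; the base‑class contract sketched here, the task shape and the return types are assumptions, not the real Horizon `Agent` class.

```ts
// Skeleton only — all shapes are assumed for illustration.
type TaskLike = { id: string; parameters?: { requirement?: string } }

abstract class AgentBase {
  protected abstract getPlanningSubject(task: TaskLike): object
  protected abstract doExecute(task: TaskLike, sync?: boolean): Promise<unknown>
  async accept(task: TaskLike): Promise<unknown> {
    return this.doExecute(task, true)
  }
}

class AuditAgent extends AgentBase {
  protected getPlanningSubject(task: TaskLike): object {
    // Subject handed to the brain's THINK phase.
    return { requirement: task.parameters?.requirement ?? '' }
  }

  protected async doExecute(task: TaskLike, sync = true): Promise<unknown> {
    const subject = this.getPlanningSubject(task)
    // THINK → ACT → REASON → EVALUATE would be driven here via the agent's brain;
    // emit lifecycle events (onTaskStarted / onTaskCompleted) on the EventBus as you go.
    return { taskId: task.id, subject, sync }
  }
}
```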
10. Examples + PlantUML diagrams
Below are canonical PlantUML diagrams you can render in your docs site or CI pipeline. Replace names and ports to match your deployment.
High level architecture
```plantuml
@startuml
node "Director" as Director
node "Sensors" as Sensors
node "Recorders" as Recorders
node "EventBus" as EventBus
Director --> Sensors : plan / delegate
Sensors --> EventBus : observations
EventBus --> Director : publish
Director --> Recorders : persist
@enduml
```

Agent lifecycle (Think → Act → Reason → Evaluate)
```plantuml
@startuml
start
:Receive Task;
:THINK (Brain.think) -> Plan/Parameters;
if (plan contains steps?) then (yes)
  :ACT (Agent.act) -> call Probe/Repository/Remote;
  :COLLECT Observations;
  :REASON (Brain.reason) -> Candidate;
  :EVALUATE (Brain.evaluate) -> Evaluation;
  if (Evaluation.accepted?) then (accepted)
    :Emit onTaskCompleted;
    stop
  else (retry)
    if (steps < maxSteps?) then (yes)
      :Loop back to THINK;
    else (no)
      :Emit onTaskFailed;
    endif
  endif
else (no)
  :Emit onTaskFailed;
endif
stop
@enduml
```

Sample workflow (documentation generator)
```plantuml
@startuml
title Documentation generation workflow
start
:translate_requirement -> create_diagram;
:create_diagram -> format_document;
:format_document -> write_document;
stop
@enduml
```

11. Tests, CI and contributing
- Linting and formatting follow
`.eslintrc.json` and `.prettierrc`.
- Test suites are under `packages/*/src/test/ts`.
- CI pipeline in `.gitlab-ci.yml` runs `install`, `test`, `lint`, `audit` and `publish` steps.
- Use `changesets` to propose releases.
Contributing checklist:
- Create a branch from
`main`.
- Add tests for new behaviour.
- Run `npm run lint && npm run build && npm test`.
- Open MR and request review.
12. Appendix — Useful snippets
Register a model (example)
```ts
const mf = ModelFactory.loose()
mf.registerModel('myAgent', 'thinking', new WorkflowThinkingModel(...))
```

Create a sensor (example)
```ts
const sensor = new Sensor(
  { agentId: 'webSensor', agentType: 'sensor', name: 'Web sensor' },
  new SensorBrain(...),
  new RestApiProbe(httpClient),
  { maxSteps: 3, minMemoryConfidence: 0.6 }
)
```