Ultravisor
Like a supervisor, only instead of super it's ULTRA. The hidden sixth hero of Voltron. This tool lets you run commands on a schedule and process their output with LLM models, providing user-friendly summaries along with links back to the complete output of each run.
Use Cases
- Periodic data pulls from a REST API with local storage of the results
- Data integrations between one system and another
- Automatic execution of image generation models with randomized word choices
- Parsing massive sets of files for meaning, leveraging AI
Primary Concepts
Ultravisor operates on a few key concepts (a sketch of how they fit together follows this list):
- operation - one or many tasks that run in sequence and/or parallel
- task - this is the verb... "what to do"
- node - a single running instance of Ultravisor (often just one, but they can form a cluster)
- global state - data shared across all tasks
- node state - data accessible locally only (when running with node affinity)
- metaoutput - the output from tasks and operations
- operation staging - temporary data from an operation
- output file store - final file output from an operation
- output data store - flexible database of final outputs from operations
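To make that vocabulary concrete, here is a minimal sketch of how the pieces might fit together in a single operation definition. Every field name here (schedule, node_affinity, global_state_keys, output, and so on) is a hypothetical illustration, not Ultravisor's actual configuration format.

```json
{
  "operation": "pull-sensor-readings",
  "schedule": "every 10 seconds",
  "node_affinity": false,
  "tasks": [
    { "name": "read-livingroom", "run": "./read_sensor.sh livingroom" },
    { "name": "read-bedroom", "run": "./read_sensor.sh bedroom" }
  ],
  "global_state_keys": ["temps.livingroom", "temps.bedroom"],
  "output": {
    "file_store": "sensor-logs/",
    "data_store": "sensor_readings"
  }
}
```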
Operations
An operation is, generally speaking, a set of tasks. Tasks can be composed to react to certain data/state situations, and with clever use of global state they can even perform complicated sequences of actions across operations.
For instance, I could have one operation that runs a number of tasks on a schedule to pull temperature readings from various locations in my house. Each task can store the values in some kind of API and keep track of the latest value and its timestamp in global state. These tasks might run every 5-10 seconds or so; where they store the full readings is up to each task.
Then, a second operation could run every minute, inspecting the global state and adjusting a local thermostat based on the current temperatures.
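As a rough sketch of how those two operations might be wired together through global state (again, the field names are hypothetical, not Ultravisor's real syntax):

```json
[
  {
    "operation": "collect-temperatures",
    "schedule": "every 10 seconds",
    "tasks": [
      { "name": "read-livingroom", "run": "./read_sensor.sh livingroom",
        "writes_global_state": ["temps.livingroom"] }
    ]
  },
  {
    "operation": "adjust-thermostat",
    "schedule": "every 1 minute",
    "tasks": [
      { "name": "set-thermostat", "run": "./set_thermostat.sh",
        "reads_global_state": ["temps.livingroom"] }
    ]
  }
]
```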
Or maybe I have a NAS full of video files that I would like to run a series of machine learning algorithms on, to generate metadata about where scene cuts occur in each film. I could set up an operation with a task that watches the CPU, IOPS and memory load of a machine and runs parallel tasks that saturate the machine's IOPS/CPU. This would be a scheduled, parameterized, parallel set of tasks.
Tasks
Tasks can be any program that is executable from a local shell, or within a browser environment. The browser part is especially interesting since you can script sets of actions within web pages (e.g. downloading all the files listed on a page, and/or enumerating all the pages and running a recursive download task for each one).
Tasks can be executed based on a number of scenarios (see the sketch after this list):
- On a schedule (5pm every day, every 5 minutes, up to 5 times a minute)
- After a condition is met (a condition in a browser, a disk drops below 5 GB)
- After another task is completed (e.g. ffmpeg finishes a transcode successfully)
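A rough sketch of what those three trigger styles might look like in task definitions follows; the trigger keys are hypothetical and only meant to illustrate the idea:

```json
[
  { "name": "nightly-report", "run": "./report.sh",
    "trigger": { "schedule": "17:00 daily" } },
  { "name": "prune-downloads", "run": "./prune.sh",
    "trigger": { "condition": "free_disk_below_gb: 5" } },
  { "name": "upload-video", "run": "./upload.sh",
    "trigger": { "after_task": "transcode-video", "on": "success" } }
]
```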
Nodes
A node is any running instance of Ultravisor. Nodes can be distributed across multiple machines.
When running with multiple nodes, they form a star topology, and nodes can promote themselves to become the central node if the current central node goes away.
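As an illustration, a worker node in such a cluster might be described with something like the following; these keys are invented for the example and are not Ultravisor's real settings:

```json
{
  "node_name": "worker-2",
  "central_node": "ultravisor-central.local:4000",
  "allow_self_promotion": true
}
```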
Global State
Global state is a plain JSON object that is synchronized across nodes. It is meant for storing small facts, not process-management state (e.g. it isn't an appropriate place for a mutex lock, but it is great for the most recent temperature readings from a set of sensors).
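For example, the global state for the temperature scenario above might look something like this (the shape and values are purely illustrative):

```json
{
  "temps": {
    "livingroom": { "value": 21.4, "timestamp": "2024-06-01T17:02:10Z" },
    "bedroom": { "value": 19.8, "timestamp": "2024-06-01T17:02:12Z" }
  }
}
```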
Node State
Node state is similar to global state, but is only accessible to the local node the task is running on. This is useful for storing configuration, paths or other node-specific information.
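For example, a node's state might hold values like these (illustrative only):

```json
{
  "scratch_dir": "/mnt/fast-ssd/ultravisor",
  "gpu_available": true,
  "ffmpeg_path": "/usr/local/bin/ffmpeg"
}
```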
Metaoutput
Metaoutput is the output from running tasks and operations. It is stored as a simple JSON manifest with a consistent set of fields (a sketch follows the list):
- Summary of Output
- Start Time, Stop Time, Status, Success/Failure for the Operation
- Start Time, Stop Time, Status, Success/Failure for each Task
- Output Data (text, json, binary, files, etc.)
- Run Log(s) for Tasks
- Links to Output
- Anything Else the Task/Operation Wants to Report
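A manifest carrying those fields might look roughly like the following; the exact field names and values are illustrative rather than a fixed schema:

```json
{
  "operation": "collect-temperatures",
  "summary": "Read 2 sensors; both succeeded.",
  "start_time": "2024-06-01T17:02:00Z",
  "stop_time": "2024-06-01T17:02:14Z",
  "status": "completed",
  "success": true,
  "tasks": [
    {
      "name": "read-livingroom",
      "start_time": "2024-06-01T17:02:00Z",
      "stop_time": "2024-06-01T17:02:06Z",
      "status": "completed",
      "success": true,
      "output": { "value": 21.4 },
      "run_log": "logs/read-livingroom.log",
      "links": ["sensor-logs/livingroom-2024-06-01.json"]
    }
  ]
}
```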
Operation Staging
Operation staging is temporary storage for data during the execution of an operation. This is useful for storing intermediate files, logs, or other data that is not needed after the operation completes.
Operations can be flagged to have node affinity, or not. If affinity is set, all tasks for the operation will run on the same node, allowing for enormous local file stores for the operation itself. This is great for machine learning operations that need to store large source files, intermediate learning data, and final output files.
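As an illustrative sketch, an operation that needs large local staging, like the scene-cut example earlier, might be flagged like this (keys are hypothetical):

```json
{
  "operation": "detect-scene-cuts",
  "node_affinity": true,
  "staging": { "path": "staging/detect-scene-cuts", "cleanup": "on_success" },
  "tasks": [
    { "name": "extract-frames", "run": "./extract_frames.sh" },
    { "name": "classify-cuts", "run": "./classify_cuts.py" }
  ]
}
```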
Output File Store
The centralized location for all final output files from operations. This is meant to be a persistent storage location that is accessible across nodes.
Output Data Store
The output data store is a flexible database for storing final output data from operations. This can be used to store structured data, metadata, or any other information that needs to be queried or analyzed later.
