melete-ai

v0.157.0

Melete — the Self-Driving Discovery Brain. A closed-loop active-experiment engine with a pluggable oracle and a signed, offline-verifiable discovery trace. Mneme remembers; Melete discovers.

Downloads

11,441

Melete

The Sovereign Verifiable AI Analyst & Optimizer

Find the best — and most robust — settings for any system you can measure, in the fewest experiments — then hand over a signed verdict anyone can re-verify offline.

🌐 Live demo → melete.mneme-ai.space

MIT · zero runtime dependencies · runs on your machine


What it is (in one line)

You have a system you can measure — an ML pipeline, a server/DB/network config, a recipe, a simulation. Melete proposes the next setting to try, you measure it (or give a formula), and it converges to the best stable answer — then explains why in plain language and signs a verdict you (or an auditor) can re-check offline. Your data never leaves your machine.

Use it in 60 seconds — 3 ways

1) Through the website (no code). Open the live demo, pick your field (Pharma · Semiconductor · AI/ML · …), press Watch to see it discover, or use guided mode: it proposes → you measure in real life → you type the score → repeat.

2) CLI / npm (on your own machine):

npm i -g melete-ai
melete bench            # measured: beats random / grid search
melete gauntlet         # every engine's correctness check (must be 100)
melete poopt cert.json  # verify a signed certificate offline

3) API — connect your real process (air-gapped). First, get an endpoint to call — two options:

# A) hosted demo (quick try):        base URL = https://melete.mneme-ai.space
# B) self-host (sovereign — data never leaves your machine):
npm i -g melete-ai
melete-server                         # → serves on http://localhost:8790

Then POST to that base URL:

POST /next             { space, observations }              → the next setting to try
POST /aegis            { space, objective, budget }         → the best ROBUST setting (survives wobble)
POST /discover         { space, objective, budget }         → full run + signed Sovereign Verdict + Replay Token
POST /sovereign/verify { …verdict }                         → re-verify provenance OFFLINE
POST /replay/verify    { …token }                           → re-derive the decision step-by-step OFFLINE

…or skip HTTP entirely and call the library in-process: import { sovereignAnalyze, aegisDiscover, proposeNext } from "melete-ai".
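
As a rough illustration of the POST /next call listed above, the sketch below only builds the request; it does not send it. The knob shape is taken from this README's proposeNext example. The `Observation` shape (`setting` plus `score`) is an assumption, because the README does not show what one observation looks like, and `buildNextRequest` is a hypothetical helper name, not part of the package.

```typescript
// Knob shape copied from the README's proposeNext example.
type Knob = { name: string; type: "real"; min: number; max: number };

// ASSUMPTION: the README does not document the observation shape.
// This { setting, score } layout is hypothetical.
type Observation = { setting: Record<string, number>; score: number };

type HttpRequest = {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
};

// Build (but do not send) a POST /next request for a given base URL.
function buildNextRequest(
  baseUrl: string,
  space: Knob[],
  observations: Observation[],
): HttpRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: baseUrl.replace(/\/+$/, "") + "/next",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ space, observations }),
    },
  };
}

const req = buildNextRequest(
  "http://localhost:8790/", // the self-hosted base URL from option B
  [{ name: "pH", type: "real", min: 3, max: 9 }],
  [{ setting: { pH: 5 }, score: 0.42 }],
);
console.log(req.url);
```

On a runtime with a global fetch, the request could then be sent with `await fetch(req.url, req.init)`. Keeping the builder separate from the network call makes the payload easy to inspect before anything leaves the machine.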

✦ What's inside — by category

68 independently-verified modules. Every claim below is a check you can re-run: npx melete-ai gauntlet.

🔍 Optimize — the best setting in the fewest experiments

| capability | what it does |
|---|---|
| Adaptive discovery | a portfolio of search strategies reaches 99% of the optimum in ≈12 experiments — ≈8× fewer than random (measured over 300 seeds: avg 12.2, 300/300 reached; melete bench) |
| Mixed spaces | real · integer · categorical · conditional knobs, not just dials |
| Multi-objective | the Pareto front of best trade-offs (yield and cost) |
| Noise-robust | the value you can trust under measurement noise, not a lucky spike |

import { proposeNext } from "melete-ai";          // loop: propose → you measure → repeat
const obs = [];                                   // measurements so far; assumed to start empty
const { next } = proposeNext({ space:[{name:"pH",type:"real",min:3,max:9}], observations:obs, goal:"maximize" });

Hosted, no install: POST https://melete.mneme-ai.space/next

🛡 Trust & verify — the honesty stack (no other optimizer ships this)

| certificate | the question it answers — signed, offline-verifiable |
|---|---|
| 🏅 Trustworthy Discovery | is it REAL (not noise) · CAUSAL (not confounded) · ROBUST (survives wobble)? |
| 🏔 Stability | is the optimum reproducible, or a lucky one-off? (STABLE ⇒ reproduced ≥97.5%, measured) |
| 💎 Honest-Search Proof | is this a GENUINE search or a FAKED one? Re-derive the trace offline (no oracle) — a forgery is rejected. (360/360 forgeries caught; something an LLM cannot do) |
| 🛡 Tolerance Certificate | the certified ±tolerance that still keeps ≥90% of the optimum — a worst-case Lipschitz guarantee, not an average. (8400/8400 off-grid adversarial samples held the floor) |
| 📜 Proof of Improvement | switching from setting A to recipe B is a proven gain of ≥Δ — noise-aware 97.5% lower bound; refuses within noise. Common-random-numbers pairing certifies the same gain from ~8× fewer measurements; sequential early-stopping (Bonferroni α-split) stops the moment the gain is certified — ~1.9× fewer on average (41.9 vs 80). (Δ valid ≥97.5%, false-cert ≤2.5%) |
| 🔐 Pre-Registration | commit the objective, space, budget & decision rule before running, then prove the result obeyed it — no goalpost-moving, no cherry-picking. (6 deviation classes all rejected; the scientific-integrity layer) |
| 🪨 Decision-Breakdown | how many measurements would an adversary (fraud, a glitchy sensor) have to corrupt to flip your "B beats A" verdict? The exact tamper-distance — a strong clean call survives many corruptions, a marginal one flips on one. The cert ships the explicit minimal attack (a witness you re-apply), takes an arbitrary adversary range (real sensor/physical bounds), and a stronger adversary provably never raises the count. (witness truly flips 100%; monotone 100%; an inflated claim caught 100%) |
| 📉 Winner's Curse | you searched N settings and reported the best — but that number is inflated (it's the max of N noisy trials, partly luck). The signed selection correction: the winner's TRUE value is ≥ this de-biased lower bound, the discount grows with N, and it works with σ unknown (estimated from replicates, studentized). (valid bound ≥97.5%, measured 99.5%; with σ estimated a plain plug-in breaks at 94.9% — studentized holds 99.3%; naive overstates 90%) |
| 🧭 Extrapolation-Guard | is the recommended setting inside the data you measured, or a blind extrapolation? It's flagged with an exact separating-hyperplane witness — proof it's outside the convex hull of your evidence, in any direction (not just out-of-box; it catches an in-box point that's off a correlated-knob manifold, which an axis test misses) — plus a density signal for interior voids. (out-of-box & in-box-off-hull → flagged 100% with a valid, re-verifiable witness; never false-flags an in-data point; a fake "supported" is caught) |
| ⏱ Anytime-Valid | an AI agent peeks after every experiment — and naive "stop when p<0.05" then false-alarms ~40% of the time. An e-value martingale (Ville's inequality) stays valid under unlimited peeking + optional stopping, plus a time-uniform confidence sequence — a running interval on the gain valid at all times at once, so the agent can read the estimate at any peek and trust it. (FP ≤ α measured 2.4% vs naive 42%; the CS covers the true gain uniformly 97.4% where a naive per-peek CI holds only 58%; it tightens as evidence accrues) |
| 📏 Conformal Prediction | "ŷ ± 1.96σ" assumes Gaussian noise — wrong on real skewed/heavy-tailed data. Split-conformal wraps any predictor with a distribution-free interval, coverage guaranteed ≥ 1−α, finite-sample exact (exchangeability, no assumption). The normalized (adaptive) mode scales by a per-input difficulty so coverage is balanced across input regions under heteroscedastic noise, not just on average. (coverage on target across normal/heavy/skewed — spread 0.2pp vs Gaussian's 2.8pp; 22% tighter than the over-covering Gaussian on skew; exact at n=20; under input-dependent noise plain under-covers the hard region 83% while adaptive balances 90%/90%) |
| 🎯 Calibration v2 | when a model/agent says "90% sure", is it right ~90% of the time? Two tests, not one: the global Spiegelhalter Z names over/under-confidence, and a per-bin Hosmer-Lemeshow test catches mid-range miscalibration near p=0.5 where the global Z is structurally blind — Bonferroni-split so the combined false-flag stays ≤ α. Reports ECE, recalibrates, and localizes the worst-calibrated bin. (global Z alone catches the mid-range blind spot only 2.7%; the v2 conditional test catches it 100% and localizes it 100%; calibrated falsely flagged ≤ α; over/under-confidence detected 100% with direction named 100%; recalibration cuts ECE 10.2%→6.4%) |
| 👥 Subgroup Validity | "B beats A +3% overall" can hide a segment B actively harms (Simpson's paradox). It tests the effect in every subgroup — declaring UNIFORM-IMPROVEMENT via an intersection-union test (level α, correctly not over-penalized) and flagging HARMED-SUBGROUP via Holm step-down (more powerful than Bonferroni at the same family-wise error), naming the hurt one. (detects + names a harmed segment 100%; pooled test says "improvement" while a segment is harmed 99% — the trap; uniform-claim power 27% vs 6% naive, size-controlled; Holm beats Bonferroni at FWER ≤ α) |
| 🔒 Privacy v2 (new) | sovereignty keeps your raw data home — but the moment you share an aggregate (a federated mean, a published stat, a pooled gradient) you can leak the individuals in it (membership inference). This certifies the release is (ε,δ)-differentially private via the tight analytic Gaussian (Balle-Wang) — the minimum noise — reveals only the noised value, and signs it. v2 adds a zCDP accountant: the cumulative budget accumulates ρ additively (composes like √k, not k like basic Σε), so far more releases fit under one budget while staying provably sound. The dishonest failure it catches: under-noising while claiming a small ε. (the optimal membership-inference attack on a certified release stays inside the (ε,δ) region; 1/5 the noise leaks far outside it and is rejected; tight calibration — achieved δ = target δ; for 50 releases zCDP certifies ε=5.3 vs basic 25; under one (ε=3,δ=1e-5) budget zCDP admits 17 releases vs basic's 6 — and the composed attack stays sound) |
| 🗑 Unlearning v2 (new) | the "right to be forgotten" with proof, in batches. When users ask to be deleted, a provider's cheapest move is to do nothing and claim it's done. This forgets a whole batch of k records from a ridge model EXACTLY via a Woodbury block rank-k downdate — O(k³+kd²), touching only those records' own contributions, never the other n−k rows — proves the served model equals one retrained from scratch without them and equals deleting them one-by-one (sequential), reports the batch's influence + the residual influence left behind (must be ~0), and signs it. An auditor re-derives it from the Gram matrix alone (never the raw rows). The dishonest failure it catches: fake / partial deletion. (the block downdate equals full retraining and sequential deletion to ~1e-15; a provider that secretly keeps the batch is flagged RESIDUAL-INFLUENCE orders of magnitude above tolerance and a forged "DELETED" cert is rejected; forgets a batch without retraining) |
| 🌐 Distribution-Shift (DRO) v2 (new) | every optimizer reports the value it measured on the data it saw — but deployment data drifts (the customer mix, the traffic, the population), and a setting that looks best on the nominal data can collapse under a modest shift. This certifies the worst-case mean over every distribution within a χ²-divergence ball of radius ρ — V = mean − √(ρ·Var), the exact Cauchy-Schwarz-tight bound — guaranteeing "under any shift up to χ² ≤ ρ, the value is provably ≥ V". AEGIS guards input wobble and Tolerance parameter wobble; this guards the data distribution itself, so a fragile high-variance setting is out-ranked by a robust one. v2 adds a confidence mode: setting ρ = z²/n makes V a calibrated (1−α) lower bound on the true mean — robust to sampling error too — the Duchi-Namkoong unification that DRO and a confidence interval are the same object. (over 24,000 reweightings none beat the certified worst case and the aligned adversary achieves it exactly; monotone in ρ; confidence mode covers the true mean 94.5% on light tails ≈95% and over-covers on skew; the confidence-mode bound equals the textbook one-sided CLT bound mean−z·SE exactly) |
| ⚖️ Fairness v2 (new) | regulators (the EU AI Act, fair-lending law) demand proof an automated decision doesn't discriminate — but the naive "the rates look equal" check is a trap twice: a real gap can hide in sampling noise, and a harmless wobble can be mistaken for bias. This measures the demographic-parity gap (and, with outcomes, the equalized-odds TPR/FPR gaps) each with simultaneous Bonferroni-corrected Wilson confidence intervals, then returns a calibrated verdict — FAIR / UNFAIR (naming the metric + groups) / INCONCLUSIVE — and signs it. v2 adds intersectional fairness: it tests every intersection of protected attributes too, catching the fairness-gerrymandering bias that hides at an intersection while each attribute alone looks fair. (a biased model is detected + named 100%; a truly fair model is falsely accused ≤ α; the gap CI covers the true gap ≥ 1−α; an XOR-gerrymander model that every marginal test passes is caught UNFAIR at the named intersection 100%, with no false alarm ≤ α) |
| 🧩 Attribution (new) | "why was I denied?" is a legal right (GDPR, EU AI Act) — but a single SHAP run or an LLM's rationalization gives numbers nobody can check, and a vendor can quietly tilt them. This computes the exact Shapley value for each feature from the model's own coalition value table and proves the fairness axioms — the credits sum exactly to the prediction (efficiency), identical features get equal credit (symmetry), a do-nothing feature gets zero (dummy), and attribution is additive across models (linearity) — then signs it. A tilted explanation whose credits don't add up to the prediction is rejected on re-derivation. (efficiency holds to ~1e-14; dummy = 0 exactly; symmetric features get exactly-equal credit; linearity to ~1e-14; a pairwise interaction is split fairly +0.2/+0.2; an inflated credit breaks efficiency and is caught) |
| 🤝 Verification Receipt (new — two-sided) | every certificate above is a one-way proof an issuer signs about themselves. This turns it two-sided: a verifier (regulator / auditor / customer / counterparty agent) re-derives the issuer's certificate offline and counter-signs a receipt bound to it with their own key. Who benefits: ① the issuer gets a portable, independently counter-signed attestation (worth more to a buyer/regulator than a self-signed claim); ② the verifier gets an offline-checkable record of what they verified and when, tamper-evident — protection if the decision is challenged. Neither has to trust the other. (a receipt over a genuine cert verifies; it's bound to that exact cert (a different cert is rejected); a tampered cert yields a REJECTED verdict and a forged "VERIFIED" receipt is caught; issuer≠verifier independence is enforced — no self-rubber-stamp; works across every certificate kind) |
| 📑 SLA Certificate v2 (new — two-sided) | AI is sold on uptime SLAs + "trust us" on quality. This puts the quality in an enforceable, both-party contract: the provider commits measurable terms (calibration ECE ≤ 5%, fairness gap ≤ 0.1, accuracy ≥ 90%, p95 latency ≤ 200 ms — each able to bind to its signed metric certificate) and each period is signed PASS, or BREACH naming exactly which term failed and by what margin. v2 adds a hash-chained compliance ledger over the billing cycle — a tamper-evident history with auto-accrued penalty (removing/altering a period breaks the chain). Who benefits: ① the provider turns "our model is good" into an enforceable promise + a signed track record that wins enterprise deals; ② the consumer gets a guarantee with teeth — provable breaches + the penalty owed, offline-checkable, so refunds aren't he-said-she-said. (compliant→PASS; a drifted term→BREACH named with margin; multi-breach all named; ≤/≥ handled; forged PASS rejected; the ledger computes breach-rate/streak/penalty exactly and catches a tampered or hidden period) |
| ✍️ Consent Certificate (new — two-sided) | GDPR consent is a checkbox in a database the company can rewrite. This makes it a two-party signed artifact: the data subject signs a scoped grant (which purposes, which fields, an expiry); the controller's every use is adjudicated against it (ALLOWED / DENIED, re-derived from the grant) and signed; the subject can sign a revocation. Who benefits: ① the subject holds a signed record of exactly what they agreed to and can prove any out-of-scope / expired / post-revocation use (real recourse); ② the controller holds signed use-certificates proving each use was within consent — an audit-ready, liability-bounding trail. (in-scope use ALLOWED; off-purpose / off-field / expired / post-revocation use each DENIED + named; a use before a later revocation stays ALLOWED; a controller forging ALLOWED is rejected on re-derivation; subject≠controller two-party chain) |
| 🎫 Trust Passport (new — two-sided) | a vendor shouldn't hand a regulator eight separate proof files. The passport composes many certificates (fairness + calibration + privacy + SLA + consent…) into one signed bundle — each member bound by an order-independent merkle root — that re-verifies every member in a single offline call and names any that fail. It's itself a signed cert, so the Verification Receipt counter-signs the whole bundle. Who benefits: ① the issuer ships one portable artifact (a swapped/tampered member is caught); ② the verifier checks the entire compliance posture at once + sees exactly which member failed, then counter-signs once. (composes 3+ kinds → ALL-VERIFIED; a tampered or swapped member is rejected by hash-binding; a forged "all-verified" with a failing member is caught; merkle root is order-independent; a two-party receipt over the passport verifies) |
| 🧬 Model Supply-Chain (AIBOM) (new — multi-party) | a deployed model is many parties' work — a base-model vendor, a fine-tuner, an optimizer, a deployer. This is a hash-chained AI Bill of Materials where each step is signed by the key of the party responsible for it (4+ distinct signers) and declares the prior artifacts it consumed. Any downstream consumer verifies the whole provenance offline. Who benefits (≥3): the base-model vendor (attribution + scoped liability), the fine-tuner/optimizer (prove their layer), the deployer (prove an unbroken lineage), the regulator/end-user (verify the whole chain + who is accountable). (a 4-party chain with 4 distinct signers verifies + names each; tamper / reorder / remove / impersonation / a broken-provenance link are each caught; rides inside a Trust Passport) |
| 🕵️ Private Audit Proof (new — multi-party · flagship) | the wall every AI audit hits: to verify a claim you must be handed the model and the whole (often private/regulated) dataset. This breaks the deadlock — the vendor Merkle-commits every per-record outcome, a Fiat–Shamir challenge derived from that commitment selects a tiny random sample the vendor cannot cherry-pick, and the auditor checks the claim on just those k records (e.g. 300 of 100,000). A claim inflated past tolerance is caught with probability rising toward 1 in k. Audit without handing over the data. Who benefits (≥3): the vendor (proves compliance without exposing model/data), the auditor/regulator (sound audit of a tiny sample), data subjects (minimal exposure), a relying party (re-checks offline). (honest claim accepted ~100%; a true-80%/claim-90% cheater caught 86%→96%→100% as k=30→100→300; only ~0.3% of records revealed; a tampered opening fails its Merkle path; the sample is a pure function of the committed root) · HONEST: not zero-knowledge (the k sampled records are seen) and not a SNARK — a data-minimizing, binding, sound spot-check; soundness is in the random-oracle model, a grinding prover faces work ~1/(1−ε)^k. |
| 🎟️ Proof-Carrying Answers (new — multi-party · runtime) | batch audits prove a model was good last quarter; this proves THIS answer, right now. Every output ships a tiny signed trust tag a consumer (or another agent) verifies offline in microseconds: is the input inside the model's certified evidence envelope (with a re-derivable witness if not), is the confidence from a calibrated model (bound by hash), what is the bound provenance (AIBOM lineage + SLA) — verdict TRUSTED / OUT-OF-SCOPE / NEEDS-REVIEW. The runtime trust layer for every AI answer; the missing primitive for multi-agent AI. Who benefits (≥3): the provider (answers carry their own trust), the consuming agent (verifies + safely rejects out-of-scope answers), the platform/regulator (audits the signed stream), the end user (protected from confident-but-unbacked answers). (in-scope→TRUSTED 100% with zero false flags; out-of-scope flagged 100% with a witness dimension; under-confident→NEEDS-REVIEW; a forged TRUSTED is rejected on re-derivation; ~800-byte proof, O(d) data-free verify) · HONEST: proves an answer is BACKED + IN-SCOPE + from a calibrated, provenance-bound model — not that it is factually correct (no per-answer oracle); it catches the out-of-scope / under-confident answers most likely to be wrong. |
| 🌍 AI Transparency Log (new — ecosystem · flagship) | Certificate Transparency (RFC 6962) made the whole web's TLS accountable — every certificate publicly logged, append-only, Merkle-auditable. This is that mechanism for AI claims. Every Melete certificate is appended to a public, tamper-evident Merkle log; anyone can prove a claim is included, and prove the log never rewrote history (a consistency proof between two signed tree heads). A vendor can no longer show a fair certificate to one auditor and bury the biased one. The global accountability substrate the whole stack sits on. Who benefits (a whole ecosystem): submitters (public, non-repudiable record), auditors/light-clients (verify inclusion + append-only offline), monitors (regulators/journalists/public detect a rewrite or fork), end users (trust only what is logged). (inclusion provable for every claim across sizes 1..100; a non-member rejected; append-only consistency proven for every m<n; rewriting a past claim is caught as inconsistent with the old signed tree head; a split view is caught; tree heads Ed25519-signed) · HONEST: proves WHAT was logged + that history was not rewritten; it does not force anyone to log (a policy/ecosystem incentive, exactly as with web CT), and a leaf is a claim hash, not a judgement of truth. |
| 🛰️ Witness Network (new — ecosystem) | a transparency log is only as honest as the assumption the operator shows everyone the SAME history — the split-view attack CT had to solve. The fix: independent witnesses (other vendors, NGOs, clouds) co-sign the log's Signed Tree Head; a relying party trusts it only if a quorum of distinct witnesses co-signed the same root. The operator can no longer present two histories, and any split view is exposed by the conflicting co-signatures. Trust without trusting the operator. Who benefits: the operator (earns checkable honesty), witnesses (a public good + mutual accountability), relying parties (quorum of independents), regulators (proof of one history). (a head co-signed by ≥quorum distinct witnesses is accepted; below quorum rejected; forged or duplicate co-signatures ignored; a witness refuses a non-append-only head; a split view — two roots at one size — is detected 100%) |
| 💾 Live Public Log (new — real, persistent) | not a demo — a real, file-backed public transparency log running on the server that survives restarts (identical Merkle root + signing key), with independent witnesses co-signing its current tree head and a live monitor (size · root · latest claims · witness quorum). Submit a claim and watch it appended. Who benefits: submitters (a permanent record that survives deploys), auditors (an old tree head still checks out), monitors (watch a live growing log), everyone (log + witnesses make AI claims accountable). (durableGauntlet: after a simulated restart the log rebuilds to the identical root + key; inclusion + append-only consistency proofs survive the restart; editing the persisted history is detected; a fresh log mints + persists its key once) |
| 🚫 Revocation Registry (new — PKI-grade) | every certificate is valid forever — until it should not be (a model certified fair is later found to discriminate; a key is compromised). This is CRL/OCSP for AI: an authority appends a signed, hash-chained revocation (cert hash + reason + effective time); a relying party checks status GOOD / REVOKED before acting. Time-aware — a decision made before the revocation stays valid; only reliance after is blocked. Post it to the transparency log and a revocation cannot be silently dropped. Who benefits: issuer (withdraw a faulty claim, bound liability), relying parties (stop acting on an invalid cert), regulators (require + verify revocation), end users (protected from already-revoked certs). (GOOD/REVOKED with reason+since; time-aware; authority-signed + chain-linked so tamper / forged-revocation / silent un-revoke are caught; only the pinned authority key is trusted) |
| ✅ Live Trust Report (new — one signed verdict) | the stack proves many things separately (fairness, calibration, attribution, lineage). A non-expert cannot read eight proofs — they ask one question: "is this AI trustworthy right now?". This composes the whole lifecycle into a single signed verdict: for every member certificate it checks three things at once — it VERIFIES, it is NOT REVOKED as of the reliance time (time-aware), and it is INCLUDED in the public transparency log — and returns TRUSTED-NOW only if every member passes all three, else NOT-TRUSTED-NOW naming the exact member + reason. Revoke one underlying claim and the same bundle flips live. Who benefits: a consumer / procurement (one trustworthy-or-not answer instead of eight proofs), the issuer (a live-good status that already accounts for revocation + logging), regulators (a current signed verdict, not a stale bundle), end users (protected the moment any claim is withdrawn or was never logged). (all-good → TRUSTED-NOW; one revoked/unlogged/tampered member → NOT-TRUSTED-NOW naming it; time-aware; forged TRUSTED-NOW caught on re-derivation; Ed25519-signed, offline-verifiable) |
| 🏛️ Chain of Trust (new — AI Certificate Authority) | every certificate assumes the issuer's key is one to trust — but who authorized that issuer? This is the PKI answer for AI claims: a pinned ROOT authority signs a scoped delegation to an intermediate (which cert kinds, which subject namespace, a validity window, a max path length); intermediates may sub-delegate — but only ever narrower — down to a leaf issuer. A relying party pins ONE root key and verifies any issuer was transitively authorized to make exactly this claim. Who benefits: a root regulator sets policy once (every downstream issuer inherits a bounded, checkable mandate); intermediates get provable scoped power; issuers prove they were authorized, not merely that they signed; relying parties trust a whole ecosystem from one pinned key. (in-scope chain → AUTHORIZED; wrong-root / out-of-kind / out-of-namespace / expired (time-aware) / over-delegation (path length) / broken-link / a child broadening its parent / forgery all rejected naming the link; Ed25519-signed, offline-verifiable) |
| 🤝 Swarm Evidence | many AI agents each with weak evidence pool into one verdict stronger than any single one — an agent that lies (claims more than its data shows) is re-derived and excluded, and a consensus check (Cochran's Q) flags whether the agents actually agree, so a pooled significance dragged by one disagreeing agent isn't trusted blindly (it names the culprit). Multi-agent trust, signed. (weak gain caught 61% pooled vs 29% single; null ≤ α; a 10⁶× liar excluded 100%; disagreement detected + attributed 100%, false-flagged only 2.8%) |
| 📊 False-Discovery Control | report K findings at once and some are pure luck. It controls the fraction of your reported discoveries that are false at a target q, ships a per-hypothesis q-value (usable at any threshold from one signed cert), and offers a Benjamini-Yekutieli mode that holds under arbitrary dependence (the real case — knobs/metrics are correlated). (BH realized FDP ≤ q, measured 7.6%; naive inflates to 13%; q-values match BH at every threshold; BY safe under ρ=0.5 dependence → 1.6% ≤ q) |
| ⬛ Null Engine | brave enough to say "there's nothing to find" on pure noise |
| 👑 Sovereign Verdict + ⏪ Replay | Ed25519-signed, deterministic, re-derivable on any machine, forever |
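
The Distribution-Shift (DRO) row states the bound V = mean − √(ρ·Var) and the confidence-mode choice ρ = z²/n. The sketch below only evaluates that formula on two made-up samples to show how a steadier setting can out-rank a higher-mean but spikier one. It is not the package's implementation. The function names are hypothetical, and the use of population variance (divide by n) is an assumption, since the README does not say which variance estimator is used.

```typescript
// Empirical mean of the measured values.
function mean(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// ASSUMPTION: population variance (divide by n), i.e. the variance of the
// empirical distribution that the chi-square ball is centred on.
function variance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) * (x - m), 0) / xs.length;
}

// Worst-case mean over the chi-square ball of radius rho:
// V = mean - sqrt(rho * Var), as stated in the table.
function worstCaseMean(xs: number[], rho: number): number {
  return mean(xs) - Math.sqrt(rho * variance(xs));
}

// Confidence mode from the table: rho = z^2 / n.
function confidenceRho(z: number, n: number): number {
  return (z * z) / n;
}

// Made-up illustration data, not measurements from the package.
const steady = [0.8, 0.8, 0.8, 0.8]; // lower mean, no spread
const spiky = [1.0, 0.7, 1.0, 0.7];  // higher mean, more spread
const rho = 0.25;

console.log(worstCaseMean(steady, rho)); // about 0.8
console.log(worstCaseMean(spiky, rho));  // about 0.775

// With rho = z^2 / n the bound reduces to mean - z * sqrt(Var / n).
const z = 1.645;
console.log(worstCaseMean(spiky, confidenceRho(z, spiky.length)));
```

On the nominal data the spiky setting wins (mean 0.85 vs 0.8), but at ρ = 0.25 the steady one has the higher guaranteed value, which is the ranking flip the row describes.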

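The False-Discovery Control row mentions per-hypothesis q-values under Benjamini-Hochberg (BH) and a Benjamini-Yekutieli (BY) mode. As a minimal sketch of the textbook procedures (not the package's code; `qValues` and its parameter names are hypothetical), q-values can be computed as step-up adjusted p-values:

```typescript
// Adjusted p-values ("q-values") for Benjamini-Hochberg, or for
// Benjamini-Yekutieli when arbitraryDependence is true.
// The output order matches the input order.
function qValues(pValues: number[], arbitraryDependence = false): number[] {
  const m = pValues.length;
  // BY inflates every value by the harmonic sum 1 + 1/2 + ... + 1/m.
  let factor = 1;
  if (arbitraryDependence) {
    factor = 0;
    for (let k = 1; k <= m; k++) factor += 1 / k;
  }
  const ascending = pValues
    .map((p, index) => ({ p, index }))
    .sort((a, b) => a.p - b.p);
  const result = pValues.map(() => 1);
  let running = 1; // also caps every q-value at 1
  // Walk from the largest p-value down, keeping a running minimum so the
  // q-values stay monotone in rank.
  ascending.reverse().forEach((entry, k) => {
    const rank = m - k; // 1-based rank in ascending order
    running = Math.min(running, (entry.p * m * factor) / rank);
    result[entry.index] = running;
  });
  return result;
}

// Made-up p-values for illustration.
const pVals = [0.01, 0.04, 0.03, 0.2];
const bh = qValues(pVals);
const by = qValues(pVals, true);

// A finding is reported at target q when its q-value is <= q.
const discoveriesAt10 = bh
  .map((q, i) => (q <= 0.1 ? i : -1))
  .filter((i) => i >= 0);
console.log(bh, by, discoveriesAt10); // indices 0, 1 and 2 pass BH at q = 0.10
```

Because the q-values are computed once, the same array answers "what is reported at q = 0.05?" and "at q = 0.10?" without re-running the procedure, which matches the "usable at any threshold" wording in the row.
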
curl -X POST https://melete.mneme-ai.space/trust-certificate -d '{"scenario":"good"}'
curl -X POST https://melete.mneme-ai.space/stability         -d '{"scenario":"easy"}'
curl -X POST https://melete.mneme-ai.space/honest-search     -d '{"seed":3}'   # genuine VERIFIES, a fake is REJECTED
curl -X POST https://melete.mneme-ai.space/tolerance         -d '{"scenario":"broad"}'   # certified ±tolerance
curl -X POST https://melete.mneme-ai.space/improvement       -d '{"seed":7}'            # certified gain A→B (independent vs CRN-paired)
curl -X POST https://melete.mneme-ai.space/prereg            -d '{"seed":3}'            # genuine CONFORMS, a cherry-picked run is REJECTED
npx melete-ai poopt proof-of-optimization.json   # verify any signed certificate offline
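
The Conformal Prediction row describes split-conformal intervals with coverage ≥ 1−α. A minimal sketch of the standard (non-adaptive) recipe follows: take absolute residuals on a held-out calibration set, pick the ⌈(n+1)(1−α)⌉-th smallest, and use it as the interval half-width. The function names and the data are illustrative assumptions, not the package's API, and the normalized (adaptive) mode is not shown.

```typescript
// Half-width of a split-conformal interval from calibration residuals (y - yhat).
function conformalRadius(residuals: number[], alpha: number): number {
  const n = residuals.length;
  const k = Math.ceil((n + 1) * (1 - alpha)); // 1-based rank of the conformal quantile
  if (k > n) return Infinity;                  // too few calibration points for this alpha
  const sorted = residuals.map((r) => Math.abs(r)).sort((a, b) => a - b);
  return sorted[k - 1];
}

// Symmetric interval around a new prediction.
function conformalInterval(prediction: number, halfWidth: number): [number, number] {
  return [prediction - halfWidth, prediction + halfWidth];
}

// Made-up calibration residuals for illustration (n = 9).
const calibrationResiduals = [0.3, -0.1, 0.9, 0.5, -0.7, 0.2, -0.4, 0.8, 0.6];
const halfWidth80 = conformalRadius(calibrationResiduals, 0.2);
console.log(halfWidth80);                        // 0.8 (the 8th smallest of 9)
console.log(conformalInterval(10, halfWidth80)); // interval around a prediction of 10
```

Returning Infinity when the calibration set is too small keeps the coverage guarantee honest: with only a couple of residuals, no finite interval can promise 90% coverage.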

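The Attribution row refers to exact Shapley values computed from a coalition value table, with the efficiency and dummy axioms checked. The sketch below implements the textbook Shapley formula over a table indexed by bitmask. The table layout, the function names, and the toy model (one main effect of 1.0 plus a 0.4 pairwise interaction) are assumptions for illustration, not the package's format.

```typescript
function factorial(k: number): number {
  let f = 1;
  for (let j = 2; j <= k; j++) f *= j;
  return f;
}

// Number of set bits, i.e. the coalition size.
function popcount(mask: number): number {
  let count = 0;
  while (mask > 0) {
    count += mask & 1;
    mask >>= 1;
  }
  return count;
}

// Exact Shapley values. v[mask] is the model value when exactly the
// features whose bits are set in mask are "present" (bit i = feature i).
function shapley(n: number, v: number[]): number[] {
  const phi: number[] = [];
  for (let i = 0; i < n; i++) {
    let total = 0;
    for (let s = 0; s < (1 << n); s++) {
      if ((s & (1 << i)) !== 0) continue; // only coalitions without feature i
      const size = popcount(s);
      const weight = (factorial(size) * factorial(n - size - 1)) / factorial(n);
      total += weight * (v[s | (1 << i)] - v[s]);
    }
    phi.push(total);
  }
  return phi;
}

// Toy model: value = 1.0 * x0 + 0.4 * x0 * x1; feature 2 does nothing.
const valueTable = [0, 1.0, 0, 1.4, 0, 1.0, 0, 1.4]; // masks 0..7
const credits = shapley(3, valueTable);
console.log(credits); // about [1.2, 0.2, 0]

// Efficiency: credits sum to v(all features) - v(no features).
const creditSum = credits.reduce((a, b) => a + b, 0);
console.log(Math.abs(creditSum - (valueTable[7] - valueTable[0])) < 1e-12);
```

In this toy table the 0.4 interaction is split evenly (0.2 to each interacting feature) and the do-nothing feature gets exactly zero. The enumeration is exponential in the number of features, so this form is only practical for small n.
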
🔬 Diagnose — plain-language why

| lens | tells you |
|---|---|
| Sensitivity · cliffs · shape | which knobs matter, where it breaks, the response shape |
| Ceiling · drift | the achievable best, and whether results drift over time |

🔌 Integrate — incl. MCP (trust middleware for AI agents)

npm i melete-ai · CLI npx melete-ai … · HTTP https://melete.mneme-ai.space/next /discover /trust-certificate /stability /honest-search /tolerance /improvement /prereg /breakdown /selection /support /fdr /anytime /swarm /conformal /subgroup /calibration /privacy /unlearning /dro /fairness /attribution /receipt /sla /consent /passport /aibom /spotcheck /pca /translog /witness /log/submit /log/monitor /revocation /design /design.md /mcp /verify

🔌 Model Context Protocol — be the verification layer any AI agent plugs into. Any agent (Claude · GPT · Gemini · an autonomous coding agent) calls Melete over MCP and gets back a signed, offline-verifiable answer instead of a number to take on faith — de-bias a winner, check support, control the false-discovery rate, propose the next experiment. Plug-and-play, every result Ed25519-signed.

// Claude Desktop / Cursor MCP config:
{ "mcpServers": { "melete": { "command": "melete-mcp" } } }

…or over HTTP: POST /mcp with a JSON-RPC body (initialize · tools/list · tools/call).
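
For the HTTP route, the JSON-RPC bodies might look like the sketch below. The envelope follows JSON-RPC 2.0, and the `tools/call` params (`name`, `arguments`) and `initialize` params follow the Model Context Protocol. The tool name `example_tool`, its arguments, the client name, and the protocol version string are placeholders: the README does not list Melete's tool names or the MCP version its server speaks, so a real client would first read them from `tools/list` and the `initialize` response.

```typescript
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

let nextId = 1;

// Build one JSON-RPC 2.0 request with a fresh numeric id.
function rpc(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) request.params = params;
  return request;
}

// 1) Handshake. ASSUMPTION: the version string and client name are placeholders.
const initialize = rpc("initialize", {
  protocolVersion: "2024-11-05",
  capabilities: {},
  clientInfo: { name: "example-client", version: "0.0.1" },
});

// 2) Discover which tools the server exposes.
const listTools = rpc("tools/list");

// 3) Call one tool. ASSUMPTION: "example_tool" and its arguments are hypothetical.
const callTool = rpc("tools/call", {
  name: "example_tool",
  arguments: { seed: 3 },
});

// Each object is what would go in the body of a POST /mcp request.
console.log(JSON.stringify(initialize));
console.log(JSON.stringify(listTools));
console.log(JSON.stringify(callTool));
```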

Every tool call is metered + audited into a signed trust ledger — a hash-chained, Ed25519-signed receipt per call (which agent, which tool, the hash of the signed result). POST /mcp/usage returns the tamper-evident usage tally (the number you bill on) + the chain-integrity check. One layer, two jobs: usage-based billing and a shared audit trail every agent and human re-verifies offline.

The moat

  • 🔒 Sovereign — runs air-gapped, on your machine; data never touches a cloud.
  • 👑 Verifiable — every verdict is Ed25519-signed; an auditor re-verifies it offline with the embedded public key, no trust in us required.
  • Replayable — the engine is fully deterministic, so a signed Replay Token re-derives the exact decision, step by step, on any machine, forever.

Honest by design (DIAKRISIS)

Melete is an optimizer + analyst, not a fortune-teller. "Verifiable" means provenance + reproducibility: proof of what was tested and the result reached, unaltered and re-derivable. It is not a proof that your code is bug-free or exploit-free (that is undecidable in general; we don't claim it). Efficiency, robustness, and Pareto results are exact and reproducible. Run melete gauntlet: every claim is a check you can re-run.