For the design documents and principles behind AGN, please refer to the DevMTRX subpage of my personal website.

AGN is under active development and intensive testing. For a detailed introduction to AGN, please refer to the page referenced above. I’ll also write a blog post about it in Undertide’s Developer section after AGN is finished. For now, I’ll focus on the process.

1. Overview of AGN

I define AGN as an event-driven engineering network, not a “chat tool.” The technical motivation is simple: if I ask multiple models to collaborate on real software work without an auditable control flow, replayable evidence, recoverable state, and enforceable role boundaries, the system inevitably becomes a black box. A black box cannot be maintained long-term, cannot be upgraded safely, and cannot be trusted for core engineering output.

AGN is designed to solve four concrete problems:

  1. From black-box to traceable system: Once I issue an instruction, every internal step must be inspectable: who did what, when, why, what changed, and what evidence supports the outcome.

  2. From one-shot responses to continuous progress: The Coordinator must not stop before delivery. It must behave like a persistent agent with a “heartbeat”: it moves between Executor and Reviewer, aggregates findings, refines conclusions, and pushes the loop forward. Silence must not stall the system.

  3. From “it runs” to “it runs for years”: I will not accept uncontrolled caches, garbage accumulation, or environmental drift that pollutes future work. AGN must be budgeted, cleanable, gated by compatibility checks, and recoverable after failures.

  4. From token dumping to reference-first communication: Information flow must remain precise and efficient. SSOT matters, but SSOT cannot mean shoving huge payloads into prompts.

AGN’s composition is strict role separation with enforceable constraints: the Coordinator orchestrates and advances state, the Executor is the only role that writes code, and the Reviewer supervises quality and produces verdicts. Most importantly, the Admin is an inseparable part of AGN. AGN is not a fully autonomous network. It requires me as the continuous controller and macro-level supervisor. I have absolute control and final authority at every stage, and I accept the corresponding responsibility: ensuring AGN’s growth, maintenance, upgrades, and stability.

Kirara is my personal assistant and belongs outside AGN. She can browse, handle email, manage schedules, organize files, write journals, etc., but must have zero contact with AGN: no AGN repos, no SSOT, no scratch, no logs, no providers, no scripts. AGN is a sealed system; variables must be minimized.


2. AGN State Before Evo1

I do not have enough complete context in today’s materials to reconstruct the full code-level state before Evo1 without guessing, so I will not invent details. What is clear is that early on I identified black-box behavior, boundary violations, and information bloat as major risks, and I chose a “black-box, experiment-first” methodology for investigating them: treat the system as something to be probed from the outside, then suspect, test, verify, and fix, rather than relying on static speculation.


3. Evo1 to Evo5: What Each Stage Does

Evo1: Enforce roles and boundaries as executable constraints

Evo1 turns role separation from a verbal agreement into system-enforced reality. The network no longer relies on model self-discipline: it uses gatekeeping at execution entry points, auditing, and approval-style flows to ensure actions remain within bounds. The macro-level result is that role permissions, Role Guard, an approval channel, and auditing become real infrastructure—necessary foundations for heartbeat, event sourcing, and eventual remote operation.
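As an illustration only, gatekeeping at an execution entry point might look like the Python sketch below. The permission table, the action names, and the RoleGuard and RoleViolation classes are hypothetical and are not taken from AGN’s codebase; the sketch only shows the idea that every attempt is checked and audited before it runs.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    COORDINATOR = "coordinator"
    EXECUTOR = "executor"
    REVIEWER = "reviewer"


# Hypothetical permission table: only the Executor may write code.
PERMISSIONS = {
    Role.COORDINATOR: {"dispatch", "read"},
    Role.EXECUTOR: {"read", "write_code"},
    Role.REVIEWER: {"read", "verdict"},
}


class RoleViolation(Exception):
    """Raised when a role attempts an action outside its permissions."""


@dataclass
class RoleGuard:
    audit_log: list = field(default_factory=list)

    def check(self, role: Role, action: str) -> None:
        allowed = action in PERMISSIONS[role]
        # Every attempt is audited, whether or not it is allowed.
        self.audit_log.append(
            {"role": role.value, "action": action, "allowed": allowed}
        )
        if not allowed:
            raise RoleViolation(f"{role.value} may not perform {action}")
```

In a design like this, a rejected attempt still leaves an audit entry, so boundary violations remain visible instead of depending on model self-discipline.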

Evo2: Event-driven heartbeat loop, closed and merged

Evo2 is the major merged milestone. It shifts AGN from “run once and stop” into “event-driven, heartbeat-based continuation”: the Coordinator wakes repeatedly, advances the state machine, and converts silence into events that trigger recovery instead of stalling.
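A minimal sketch of “silence becomes an event,” assuming a hypothetical SILENCE_TIMEOUT event kind, made-up state names, and a simple transition table. None of these names come from AGN itself; they only illustrate how a heartbeat tick could turn inactivity into something the state machine can react to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    kind: str
    at: float  # timestamp in seconds


# Hypothetical transition table: (current state, event kind) -> next state.
TRANSITIONS = {
    ("WAITING", "RESULT"): "REVIEWING",
    ("WAITING", "SILENCE_TIMEOUT"): "RECOVERING",
    ("RECOVERING", "RESULT"): "REVIEWING",
}


def heartbeat_tick(last_event_at: float, now: float,
                   silence_limit: float) -> Optional[Event]:
    """Return a synthetic event if nothing has happened for too long."""
    if now - last_event_at >= silence_limit:
        return Event(kind="SILENCE_TIMEOUT", at=now)
    return None


def advance(state: str, event: Event) -> str:
    """Advance the state machine; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event.kind), state)
```

Each time the Coordinator wakes, it would call something like heartbeat_tick and feed any resulting event into advance, so a silent Executor leads to a recovery state rather than an indefinite wait.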

Evo2 also resolves a concrete failure mode I explicitly identified: large payload inlining causing dispatch/prompt/log bloat. The system adopts reference-first messaging plus explicit size budgets. It formalizes domain separation (Repo/SSOT/Scratch) and performance budgets so long-running operation does not pollute the environment. Most importantly, Evo2 establishes reproducible, one-command regression validation with artifacts and documents on disk: whether the system is correct is determined by experiments and evidence, not narrative.
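Reference-first messaging with a size budget could be sketched as follows. The 256-byte budget, the sha256-based ref format, and the in-memory store are assumptions made for illustration; the actual budgets and ref scheme in AGN are not described here.

```python
import hashlib

INLINE_BUDGET_BYTES = 256  # hypothetical budget, not AGN's real value


def make_message(kind: str, payload: str, store: dict) -> dict:
    """Build a message that inlines small payloads and references large ones."""
    data = payload.encode("utf-8")
    if len(data) <= INLINE_BUDGET_BYTES:
        return {"kind": kind, "inline": payload}
    # Over budget: keep the payload in the store and send only a ref.
    ref = "sha256:" + hashlib.sha256(data).hexdigest()
    store[ref] = data
    return {"kind": kind, "ref": ref, "size": len(data)}
```

With this shape, a 10 KB log produces a message of roughly a hundred bytes, and the receiver fetches the full content only when it actually needs it.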

Evo3: Coordinator separation and remote-ready orchestration skeleton

Evo3 is about moving the Coordinator from “local convenience” toward an “environment-agnostic orchestrator.” This enables a future where the true Coordinator runs remotely in OpenClaw as an API model without breaking execution.

Macro-level Evo3 events:

  • A backend abstraction that decouples orchestration logic from where it runs.
  • A state snapshot mechanism that compresses remote inputs into a bounded digest plus refs, rejecting raw payload inlining.
  • Read-by-action: a remote Coordinator can only obtain context via READ actions executed by Runner, producing READ_RESULT events—no implicit local reads.
  • A stronger heartbeat flow with the beginnings of control-plane preemption, enabling me to pause/stop/modify during execution rather than only at the end.
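The read-by-action rule from the list above can be sketched like this. The action and event field names, the 200-character excerpt bound, and the in-memory files dictionary standing in for the workspace are all hypothetical; only the READ and READ_RESULT names come from the text.

```python
import hashlib

MAX_EXCERPT_CHARS = 200  # hypothetical bound on what is returned inline


def run_read_action(action: dict, files: dict, event_log: list) -> dict:
    """Runner side: execute a READ action and record a READ_RESULT event.

    `files` is an in-memory stand-in for the workspace.
    """
    if action["type"] != "READ":
        raise ValueError("only READ actions are handled here")
    content = files[action["target"]]
    event = {
        "type": "READ_RESULT",
        "target": action["target"],
        "ref": "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "excerpt": content[:MAX_EXCERPT_CHARS],
        "truncated": len(content) > MAX_EXCERPT_CHARS,
    }
    # The read is visible in the event log; there is no side channel.
    event_log.append(event)
    return event
```

Because the Coordinator only ever sees the READ_RESULT event, every piece of context it obtains is bounded and leaves a trace in the event log.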

Evo3’s conclusion is that Kirara is not part of the orchestration chain. Legacy Kirara-related scripts may still exist in the repository, but the Evo3 Coordinator contract/heartbeat/backend/regression chain no longer depends on Kirara.

Evo4: Collapse remaining variables—remote purity, ref semantics, portability

Evo4 has not been executed yet, but its purpose is already defined: it is a convergence stage, not a feature stage. It resolves the remaining system variables from Evo3: remote semantic purity, removal of path-based ref semantics, full portability of evidence and reports, and elimination of implicit IO paths in read-by-action. When Evo4 completes, the Coordinator can truly migrate to OpenClaw and behave consistently across machines.

Evo5: Delivery gate, recovery policy, and lifecycle governance

Evo5 is the long-term operations stage. Evo2–Evo4 make the platform run, migrate, and replay; Evo5 makes it sustainable and lower-maintenance.

Evo5 introduces three macro-level pillars:

  1. Evidence-driven Delivery Gate: no evidence refs, no DELIVERED. The system loops back automatically until acceptance is satisfied.
  2. RecoveryPolicy + EscalationPolicy: retries and degradations become configurable, with explicit stopping conditions leading to NEED_ADMIN instead of infinite loops.
  3. Lifecycle Governance: SSOT/artifacts get tiered retention, indexing, and periodic integrity sweeps so long-term growth remains controllable.
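The first two pillars can be combined into one small decision function, sketched below. The RecoveryPolicy fields, the RETRY status, and the default of three retries are assumptions for illustration; only DELIVERED and NEED_ADMIN appear in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPolicy:
    max_retries: int = 3  # hypothetical default


def next_status(evidence_refs: list, accepted: bool, attempts: int,
                policy: RecoveryPolicy) -> str:
    """Decide the next status of a task after an attempt."""
    # No evidence refs, no DELIVERED: acceptance alone is not enough.
    if evidence_refs and accepted:
        return "DELIVERED"
    # Explicit stopping condition: escalate instead of looping forever.
    if attempts >= policy.max_retries:
        return "NEED_ADMIN"
    return "RETRY"
```

The design choice worth noting is that the stopping condition lives in the policy object rather than in the loop, so retry limits can be tuned without touching the orchestration logic.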

4. Development Summary and Future Maintenance Plan

The current state is straightforward: I have moved AGN from concept to a real engineering system. Evo2 being merged means heartbeat, budgets, domain isolation, evidence, and regression validation are real. Evo3 being complete means remote-ready orchestration structure exists and Kirara is no longer part of the AGN dependency chain. Next, I will execute pre-Evo4 and Evo4 to eliminate remaining semantic and portability variables, and then proceed to Evo5 to strengthen delivery gating and lifecycle governance.

The most important principle is that AGN is not meant to be “fully automatic.” As Admin, I am an inseparable part of AGN. I have absolute authority, and I accept absolute responsibility. I must continuously supervise at the macro level, set constraints, define acceptance, decide merges, decide stops, decide direction changes, and decide when to enter the next stage. AGN’s growth and stability are my responsibility.

My future maintenance plan can be reduced to three invariants:

  1. Every change must be experimentally verifiable: Each evolution stage must have a one-command regression entrypoint and artifact-backed evidence. Correctness is decided by experiments and refs.

  2. Role boundaries must remain strict and self-checked: Executor is the only role that writes code; Reviewer is read-only; Coordinator only orchestrates. Any “small exception” becomes long-term maintenance debt.

  3. Long-term cost must be budgeted, recoverable, and indexable: Performance budgets and domain isolation are baseline. Events and artifacts must support archiving, indexing, and integrity sweeps.

Evo4 will lock remote purity and ref semantics. Evo5 will make delivery evidence-gated, formalize recovery/stop conditions, and introduce lifecycle governance. UI clients can evolve independently outside the core network by consuming the event stream and refs without contaminating AGN’s core. If I keep that discipline, AGN can become a long-lived system aligned with my own lifecycle rather than a one-off project.