Exploring agent infrastructure for long-horizon work
An autonomous agent system that decomposes complex goals into parallelized subtasks, self-monitors via OODA loops, and recovers from failure without human intervention — built because single-shot LLM calls collapse under multi-day work and existing agent frameworks stop at demos.
ODIN runs OODA — Observe, Orient, Decide, Act — instead of the now-standard ReAct loop. The split between observation and orientation is what lets the agent abandon failing strategies, not just retry them. ReAct keeps trying the same approach until it works or runs out of tokens; OODA forces the agent to re-orient on every cycle, which means a stuck plan gets re-decomposed instead of brute-forced.
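The loop described above can be sketched in a few lines. This is a minimal illustration, not ODIN's actual API — the `decompose` and `execute_step` callables and the dict-shaped observations are assumptions made for the example. The key move is that a failed step triggers re-decomposition with everything learned so far, rather than a retry of the same step.

```python
# Minimal OODA sketch: orientation happens on every cycle, so a failing
# strategy is abandoned and re-planned instead of blindly retried.
# All names here are illustrative, not ODIN's actual API.

def run_ooda(goal, decompose, execute_step, max_cycles=10):
    facts = []
    plan = decompose(goal, facts)          # initial decomposition
    for _ in range(max_cycles):
        observation = execute_step(plan)   # Act on the current plan
        facts.append(observation)          # Observe: record what happened
        # Orient + Decide: a failed step abandons the strategy and
        # re-decomposes the goal with the accumulated facts.
        if observation["status"] == "failed":
            plan = decompose(goal, facts)  # new plan, not a retry
        elif observation["status"] == "done":
            break
    return plan, facts
```

A ReAct-style loop would sit inside `execute_step` retrying; here the outer cycle owns the decision to throw the plan away.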
Single-model LLM calls are great at well-scoped one-shot tasks and terrible at long-horizon work. The moment a problem requires sustained reasoning across hours, days, or distinct disciplines — research, planning, coding, verification, reporting — a single chat session collapses under its own context window. Information gets dropped. Decisions get re-litigated. The model forgets what it already tried.
Existing agent frameworks largely treat this as a prompt-engineering problem and stop at impressive demos. They lack the substrate that production work actually needs: durable scheduling, structured handoffs between cooperating agents, a context graph that survives a process restart, and a human supervisor who can intervene without burning the whole network down. The interesting work in this space isn't a smarter prompt — it's the operating system underneath.
ODIN treats agents as cooperating processes, not chat sessions. The orchestrator decomposes a goal into a typed plan, dispatches subtasks to specialist agents (Research, Plan, Code, Verify, Report), and supervises the network as it works. Each agent has its own scoped capability set and its own tool surface — the planner doesn't write code; the verifier doesn't talk to the outside world.
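One way to picture a per-agent tool surface is an explicit allowlist checked at call time. The role and tool names below are hypothetical stand-ins, not ODIN's actual registry:

```python
# Illustrative per-agent tool surfaces: each specialist gets an explicit
# allowlist, and any call outside it is rejected at dispatch time.
# Role and tool names are hypothetical, not ODIN's actual set.

TOOL_SURFACES = {
    "research": {"read_graph", "fetch_url"},
    "plan":     {"read_graph"},
    "code":     {"read_graph", "write_sandbox", "run_shell"},
    "verify":   {"read_graph", "run_tests"},
    "report":   {"read_graph"},
}

def call_tool(agent_role: str, tool: str, *args):
    allowed = TOOL_SURFACES.get(agent_role, set())
    if tool not in allowed:
        raise PermissionError(f"{agent_role} may not call {tool}")
    return ("ok", tool, args)  # stand-in for real tool dispatch
```

The point of the shape, rather than the specific names: the planner's surface simply doesn't contain a write tool, so "the planner doesn't write code" is enforced structurally, not by prompt.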
Underneath, everything runs on the Amplify runtime — the engine I built to handle the unglamorous parts: a durable scheduler, a cross-session context graph, tool routing with capability negotiation, structured handoffs between agents, and full observability. Amplify is the substrate; ODIN is the product surface. Together they make it possible for a single human operator to supervise dozens of concurrent agent networks without losing the thread.
The operator is a first-class citizen, not an afterthought. Every decision the network makes flows through a console where a human can inspect rationale, approve risky moves, or roll back a branch of the agent graph. ODIN is autonomous, but it's not opaque.
Decomposes the goal, dispatches subtasks to specialists, supervises the network, and arbitrates handoffs. Holds the plan; never executes inside it.
Read-only specialist. Walks the context graph, scrapes external sources, gathers facts before any irreversible action runs. The first agent in almost every plan.
Takes a goal and a set of facts and produces a typed plan — subtasks, dependencies, success criteria. Outputs are reviewable artifacts, not free-form prose.
The only agent allowed to write into its sandbox. Capabilities scoped per task: a documentation task can't touch production secrets, a refactor can't deploy.
Runs tests, checks invariants, validates outputs against the plan's success criteria. If verification fails, the plan re-orients — that's where the OODA loop earns its keep.
ODIN turns a single high-level goal into an explicit plan — a typed graph of subtasks, dependencies, and success criteria — before a single specialist agent runs. Plans are first-class objects: they can be inspected, edited, replayed, and diffed. This is what makes long-horizon work auditable instead of magical.
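A typed plan of this shape can be sketched as plain data plus one scheduling query. The field names and the `ready` helper are illustrative assumptions, not ODIN's schema:

```python
# A plan as a first-class typed object: subtasks carry dependencies and
# success criteria, and the orchestrator dispatches whatever is ready.
# Field names are illustrative, not ODIN's actual schema.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Subtask:
    id: str
    agent: str                 # which specialist runs it
    success_criteria: str      # what the Verify agent checks
    depends_on: tuple = ()

@dataclass
class Plan:
    goal: str
    subtasks: list = field(default_factory=list)

    def ready(self, done: set) -> list:
        """Subtasks whose dependencies are all complete."""
        return [t for t in self.subtasks
                if t.id not in done and set(t.depends_on) <= done]
```

Because the plan is inert data, it can be serialized, diffed against a revision, or replayed — which is what makes it inspectable rather than magical.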
Every fact an agent learns is written to a typed context graph that outlives any individual session. When an agent picks up a task three hours later — or after a process restart — it walks the graph instead of starting cold. This is the single biggest reason long-horizon work actually completes: the network never forgets what it already knows.
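A context graph that survives restarts can be approximated with an append-only log of typed facts. The record schema and class below are assumptions for illustration, not Amplify's implementation:

```python
# Sketch of a persistent typed fact store: facts are appended as typed
# records, so a restarted process reloads them instead of starting cold.
# The record schema is illustrative, not Amplify's actual graph format.
import json
import os
import tempfile

class ContextGraph:
    def __init__(self, path):
        self.path = path
        self.facts = []
        if os.path.exists(path):                 # survive a restart
            with open(path) as f:
                self.facts = [json.loads(line) for line in f]

    def record(self, kind: str, subject: str, value):
        fact = {"kind": kind, "subject": subject, "value": value}
        self.facts.append(fact)
        with open(self.path, "a") as f:          # append-only log
            f.write(json.dumps(fact) + "\n")

    def query(self, kind: str):
        return [f for f in self.facts if f["kind"] == kind]
```

An agent resuming a task three hours later constructs the store from the same path and queries it, rather than re-deriving what the network already learned.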
Agents declare what tools they need; the runtime decides what they're allowed to call. A research agent can read; a verifier can run tests; a code agent can write inside its sandbox. Capabilities are negotiated at handoff, not granted globally — so a single compromised step can't escalate across the whole network.
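Negotiation-at-handoff can be pictured as a set intersection between what an agent requests and what the current task allows. The function names and capability strings are hypothetical:

```python
# Capability negotiation at handoff, sketched as set intersection: the
# receiving agent requests capabilities, the runtime grants only what
# the current task permits, and nothing is held globally.
# Names are illustrative, not Amplify's actual protocol.

def negotiate(requested: set, task_allows: set) -> set:
    """The grant is the intersection: asking for more than the task
    permits simply doesn't yield it."""
    return requested & task_allows

def handoff(agent_requests: set, task_allows: set):
    granted = negotiate(agent_requests, task_allows)
    denied = agent_requests - granted
    return granted, denied
```

Because grants are computed per handoff and discarded afterward, a compromised step's extra requests are denied locally instead of escalating across the network.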
The operator console is a real interface, not a log viewer. Every decision the network makes flows through a feed where a human can pause, approve, or roll back any branch of the plan. ODIN is built to run unattended for hours — but built so that intervening costs seconds, not minutes.
Research, Plan, Code, Verify, Report — each with scoped capabilities negotiated at handoff. No agent has more authority than its current task requires.
Failed plans don't retry — they re-orient. The OODA split lets the orchestrator abandon strategies that aren't working and re-decompose the goal from scratch.
Every plan, handoff, and verification result is a typed object in the context graph. The operator can replay any decision branch and inspect why a path was taken.
© 2026