AI Agent Architecture

From chatbot to agent fleet — how AI actually works inside a business

Most businesses are using AI as a fancy search engine. The ones pulling ahead are deploying agents — systems that observe, decide, act, and keep going until a job is done.

What is an agent, exactly?

An agent is not a chatbot. A chatbot completes one turn and waits. An agent runs a loop — continuously — until a task is done or a human steps in.

Observe
Reads the current state — user input, tool outputs, memory, context
Decide
Chooses the next action — call a tool, ask a question, or finish
Act
Executes — calls an API, reads a database, drafts an email, writes a record
Repeat
Feeds the result back in and decides the next step. Goes again.
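
The loop above can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: `decide` stands in for the model call, and the `TOOLS` table, the `run_agent` name and the `max_steps` cap are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical tool table: plain functions the agent is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda query: f"record for {query}",
}

def decide(state: dict) -> dict:
    """Stand-in for the model: choose the next action from the current state."""
    if not state["observations"]:
        return {"action": "lookup", "input": state["task"]}
    return {"action": "finish", "output": state["observations"][-1]}

def run_agent(task: str, max_steps: int = 5) -> str:
    state = {"task": task, "observations": []}       # Observe: everything known so far
    for _ in range(max_steps):                       # bounded, so it cannot run forever
        step = decide(state)                         # Decide
        if step["action"] == "finish":
            return step["output"]
        result = TOOLS[step["action"]](step["input"])  # Act
        state["observations"].append(result)           # Repeat: feed the result back in
    raise RuntimeError("step limit reached; hand over to a human")
```

Calling `run_agent("ACME")` performs one lookup, feeds the result back in, and finishes on the second pass. The step cap is one simple way to make sure "a human steps in" when the loop does not converge.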

Before building anything, three questions must be answered: What does the agent need to know? What tools does it have access to? And where does the human stay in the loop? Get those three right and you can automate almost anything responsibly.

Four levels of agent capability

Most businesses are at Level 1 without realising it. The leverage — and the competitive moat — is at Levels 3 and 4.

Level 1

Structured Prompt

A single call to an AI model with a carefully engineered system prompt. One input, one output. No loop, no tools. This is what most "AI tools" actually are.

e.g. Generate a meeting summary. Draft a proposal. Classify a support ticket.
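
A Level 1 call is small enough to sketch end to end. The example below assumes a hypothetical `call_model(system, user)` function standing in for whichever model API is used, plus an illustrative category list. What it shows is the shape of the pattern: a fixed system prompt, one input, one validated output.

```python
import json
from typing import Callable

CATEGORIES = ("billing", "technical", "account")  # illustrative labels only

SYSTEM_PROMPT = (
    "You classify support tickets. Reply with JSON only, in the form "
    '{"category": "<' + "|".join(CATEGORIES) + '>", "priority": <1-3>}.'
)

def classify_ticket(ticket_text: str, call_model: Callable[[str, str], str]) -> dict:
    """One input, one output: no loop, no tools."""
    raw = call_model(SYSTEM_PROMPT, ticket_text)
    result = json.loads(raw)
    # Deterministic checks on the model's answer before anything downstream uses it.
    if result.get("category") not in CATEGORIES:
        raise ValueError(f"unexpected category: {result.get('category')!r}")
    if result.get("priority") not in (1, 2, 3):
        raise ValueError(f"unexpected priority: {result.get('priority')!r}")
    return result

def fake_model(system: str, user: str) -> str:
    """Canned response so the sketch runs without a real model."""
    return '{"category": "billing", "priority": 2}'
```

Swapping `fake_model` for a real client is the only change needed to go live; the validation step stays the same.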
Level 2 — Real Agent

Tool-Use Loop

The model is given tools it can call — read a database, send a message, query an API. It decides which tool to use, gets the result, and decides the next step. This is the agent pattern.

e.g. Find all uncontacted leads this week, look up each one in the CRM, and draft personalised follow-up emails.
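
One way to picture the tool side of this pattern: the model emits a tool call as JSON, and plain code checks it against a registry before running anything. The tool names, the sample leads and the `{"tool": ..., "args": ...}` call format below are assumptions for illustration, not a specific vendor's API.

```python
import json

def find_uncontacted_leads(days: int) -> list[str]:
    """Hypothetical CRM query; returns names from a fixed sample."""
    return ["Avery", "Blake"] if days >= 7 else ["Avery"]

def draft_follow_up(lead: str) -> str:
    """Hypothetical drafting tool; a real one would call the model."""
    return f"Hi {lead}, following up on your enquiry."

# Only tools registered here can ever be executed.
TOOLS = {
    "find_uncontacted_leads": find_uncontacted_leads,
    "draft_follow_up": draft_follow_up,
}

def execute_tool_call(raw: str):
    """Parse a JSON tool call and run it if the tool is registered."""
    call = json.loads(raw)
    name, args = call["tool"], call.get("args", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**args)

# Two steps of the loop, with the model's choices written out by hand.
leads = execute_tool_call('{"tool": "find_uncontacted_leads", "args": {"days": 7}}')
drafts = [
    execute_tool_call(json.dumps({"tool": "draft_follow_up", "args": {"lead": lead}}))
    for lead in leads
]
```

In a real loop the JSON strings would come from the model, and each result would be fed back to it before the next call.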
Level 3 — Orchestration

Orchestrator + Specialists

One orchestrator agent breaks a complex task into sub-tasks and delegates to specialist agents — each with a narrow scope, purpose-built prompt, and specific tool access. Results are synthesised back into a final output.

e.g. Score every inbound sales call: transcription agent → scoring agent → CRM-writer agent.
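
The call-scoring example can be sketched as an orchestrator that passes a job through three specialists in turn. Everything inside the specialists is a stand-in: the "transcription" just cleans a text string and the "scoring" is a keyword check, where a real build would call a speech-to-text service and a model. Function names and the job dictionary are assumptions.

```python
def transcription_agent(job: dict) -> dict:
    # Stand-in: treats the supplied text as the finished transcript.
    return {**job, "transcript": job["audio"].strip()}

def scoring_agent(job: dict) -> dict:
    # Stand-in: counts which qualification criteria are mentioned at all.
    criteria = ("budget", "timeline", "authority", "need")
    text = job["transcript"].lower()
    matched = [c for c in criteria if c in text]
    return {**job, "score": len(matched), "matched": matched}

def crm_writer_agent(job: dict) -> dict:
    # Stand-in: builds the note a real agent would write to the CRM.
    return {**job, "crm_note": f"Call {job['call_id']}: {job['score']}/4"}

SPECIALISTS = [transcription_agent, scoring_agent, crm_writer_agent]

def orchestrate(job: dict) -> dict:
    """Delegate to each specialist in order and return the combined result."""
    for specialist in SPECIALISTS:
        job = specialist(job)
    return job
```

Each specialist only sees and returns the job dictionary, which keeps its scope narrow and makes it easy to test or replace on its own.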
Level 4 — Production Fleet

Memory · Gates · Audit

Adds persistent memory (the agent knows your business across sessions), human-in-the-loop gates for irreversible actions, an append-only audit trail, and evaluation pipelines to catch drift.

e.g. The full CAO stack: every function automated, humans reviewing exceptions, board-level reporting generated automatically.

Every live demo, mapped to the architecture

Each product below is the same pattern at a different level — deterministic code owns the facts, the model owns the judgement and language, a human owns every irreversible action. Here's what each one actually runs.

SnapCheck

Vision agent · Level 2

Property condition reporting. Computer vision grades each photo, writes the description, assigns severity and a trade.

Photo → Vision agent grades → Structured defect → Agent RFQ
Auto: grade + describe · In the showcase

ISO 11226 · clinical

MoveLens · HomeTask Ergo

Pose + deterministic · Level 2

Clinical movement analysis. Pose estimation extracts 33 landmarks; Python computes every clinical number; the AI only validates keyframes — never invents a figure.

Video → 33 landmarks → Python metrics → Clinician signs off
Assist: clinician reviews · Live demo
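
To make "Python computes every clinical number" concrete, here is a minimal sketch of one such number: the angle at a joint, computed from three landmark positions. This is generic vector geometry, not MoveLens's actual code, and the 2D `(x, y)` landmark format is an assumption.

```python
import math

Point = tuple[float, float]

def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle at b, in degrees, between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("landmarks coincide; angle is undefined")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))
```

For example, hip, knee and ankle positions passed as `a`, `b`, `c` give the knee angle. Because the figure comes from arithmetic, the same video always yields the same number.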

Kesher OS — Legal

Orchestrator + 5 specialists · Level 4

A family-law & mediation back office. A coordinator safety-screens every message before the model, then routes to intake, conflict & safety, drafting, scheduling or billing.

Enquiry → Safety gate (code) → Specialist → Practitioner decides
Critical AI bypassed · 5-diagram case study
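
A safety gate that runs before the model can be very plain code. The sketch below assumes a keyword screen and a `route_with_model` callable; the term list and function names are hypothetical, and a real screen would be defined with practitioners.

```python
from typing import Callable

# Hypothetical terms; in practice practitioners would own this list.
CRITICAL_TERMS = ("violence", "threat", "unsafe")

def is_critical(message: str) -> bool:
    """Deterministic screen: no model involved."""
    text = message.lower()
    return any(term in text for term in CRITICAL_TERMS)

def handle_enquiry(message: str, route_with_model: Callable[[str], str]) -> dict:
    if is_critical(message):
        # Critical messages go straight to a person; the model is never called.
        return {"route": "practitioner", "model_called": False}
    return {"route": route_with_model(message), "model_called": True}
```

The important property is ordering: the check happens in code first, so a model failure or a cleverly worded message cannot skip it.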

Property Agent

3-agent fleet · Level 3

A property-management desk for Gold Coast & Bondi portfolios. Arrears, maintenance and renewals agents each draft work and queue it for one-click approval.

Portfolio data → 3 specialists draft → PM approves send
Nothing sends unapproved · Run the agents

BiffCoin Desk

Trading orchestrator · Level 3

Five specialist analysts (trend, range, breakout, accumulation, risk) vote on the market regime. The orchestrator decides in deterministic Python; advisory only.

Market snapshot → 5 specialists vote → Any live trade gated
Fails closed on bad data · In the showcase
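
A deterministic vote that fails closed might look like the sketch below. The five analyst names come from the description above; the regime labels, the quorum of three and the `NO_ACTION` result are assumptions for illustration.

```python
import math
from collections import Counter

ANALYSTS = ("trend", "range", "breakout", "accumulation", "risk")
NO_ACTION = "NO_ACTION"

def snapshot_is_valid(snapshot: dict) -> bool:
    price = snapshot.get("price")
    return isinstance(price, (int, float)) and math.isfinite(price) and price > 0

def decide_regime(snapshot: dict, votes: dict[str, str], quorum: int = 3) -> str:
    """Return the majority regime, or NO_ACTION whenever anything looks wrong."""
    if not snapshot_is_valid(snapshot):
        return NO_ACTION                    # bad data: fail closed
    if set(votes) != set(ANALYSTS):
        return NO_ACTION                    # a missing or unexpected voter: fail closed
    regime, count = Counter(votes.values()).most_common(1)[0]
    return regime if count >= quorum else NO_ACTION
```

Every failure path returns the same safe default, so the only way to get a regime call is for the data and the votes to both check out.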

LiftAI'd

Pose coaching · Level 2

The gym-facing sibling of MoveLens. Counts reps, measures joint angles, scores form against reference patterns — the numbers are computed, the AI only coaches.

Lift video → Pose + angles → Form score → Coaching note
Auto: rep + angle metrics · See LiftAI'd
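
Rep counting is a good example of a number that should be computed rather than generated. A minimal sketch, assuming a per-frame joint angle series and two hypothetical thresholds (90° for the bottom of the movement, 160° for the top):

```python
def count_reps(angles: list[float], low: float = 90.0, high: float = 160.0) -> int:
    """Count a rep each time the angle drops to `low` and then returns to `high`."""
    reps = 0
    reached_bottom = False
    for angle in angles:
        if angle <= low:
            reached_bottom = True
        elif angle >= high and reached_bottom:
            reps += 1
            reached_bottom = False   # require a new descent before the next rep
    return reps
```

Using two thresholds rather than one means a shallow half-rep, or jitter around a single cut-off, is not counted.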

Diagnostic Tool

Single structured agent · Level 1

The simplest level: one well-structured call. Paste a business description; get the top 3 automation opportunities and a 90-day roadmap. The foundation everything else builds on.

Business descriptionStructured promptDiagnostic report
Operator reviews outputRun a diagnostic

Inside a $50M business — four processes, rebuilt AI-first

In a $50M business, SG&A is typically $10–15M, much of it labour in repeatable functions. These four workflows are where the hours bleed.

01

Score every inbound sales call

Recording lands in storage. Transcription agent converts speech to text. Scoring agent rates the call against the firm's qualification criteria (budget, timeline, authority, need). CRM-writer agent logs the score and surfaces follow-up actions. Escalation email drafted — human approves before send.

Auto: transcribe + score + log · Gate: send escalation email
8–12 hrs/wk saved per sales team
02

Reconcile vendor invoices

Invoice arrives by email. PDF-parser agent extracts line items. Matcher agent cross-references against purchase orders in the accounting system. Anomaly-detector agent flags mismatches and exceptions. Matched invoices auto-approved for payment; unmatched invoices routed to finance with a summary of the discrepancy.

Auto: extract + match + approve clean invoices · Gate: flag anomalies for human review
15–20 hrs/wk saved in finance
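
The matching step in this workflow is deterministic and can be sketched without any model at all. The field names (`po_number`, `total`) and the one-cent tolerance are assumptions; a real matcher would also compare line items.

```python
from decimal import Decimal

def match_invoice(invoice: dict, purchase_orders: dict, tolerance: Decimal = Decimal("0.01")) -> dict:
    """Approve an invoice only if it matches a known purchase order total."""
    po = purchase_orders.get(invoice["po_number"])
    if po is None:
        return {"status": "review", "reason": "no matching purchase order"}
    difference = abs(Decimal(invoice["total"]) - Decimal(po["total"]))
    if difference > tolerance:
        return {"status": "review", "reason": f"total differs by {difference}"}
    return {"status": "approved", "reason": "matches purchase order"}
```

Totals are passed as strings and handled as `Decimal` so that currency comparisons are exact. Anything that does not match cleanly comes back with a reason, which becomes the discrepancy summary for finance.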
03

Triage and respond to support tickets

New ticket arrives. Classifier agent assigns priority and category. For tier-1 queries (FAQ-level), a responder agent drafts and sends the reply immediately. For tier-2, a draft is prepared and queued for human review before sending. Routing agent assigns unresolved tickets to the right team member with context pre-populated.

Auto: classify + tier-1 responses · Gate: tier-2 drafts need approval
12–16 hrs/wk saved in support
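
Once the classifier has labelled a ticket, what happens next is a plain routing rule. The sketch below assumes the classifier writes `tier` and `category` fields; the team names are hypothetical.

```python
# Hypothetical mapping from ticket category to owning team.
TEAM_BY_CATEGORY = {"billing": "finance-support", "technical": "engineering-support"}

def triage(ticket: dict) -> dict:
    """Decide the next step for a ticket the classifier has already labelled."""
    if ticket["tier"] == 1:
        return {"action": "send_reply", "needs_approval": False}
    if ticket["tier"] == 2:
        return {"action": "queue_draft", "needs_approval": True}
    team = TEAM_BY_CATEGORY.get(ticket["category"], "general-support")
    return {"action": "assign", "team": team, "needs_approval": True}
```

Keeping the gate in code means the model can draft a tier-2 reply but has no way to send it.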
04

Draft the monthly board update

Scheduled cron fires on the 1st of each month. Data-collector agent pulls KPIs from finance, sales, and ops systems. Analyst agent compares against targets and prior period, flags variances above threshold. Writer agent structures the board pack with commentary for each section. CEO receives a draft for review — edits and approves before distribution.

Assist: draft + variance commentary · Gate: CEO approves before send
2 days/mo saved for leadership
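
The "flags variances above threshold" step is arithmetic the analyst agent should not be left to estimate. A minimal sketch, assuming KPIs arrive as simple name-to-number dictionaries and using a hypothetical 10% threshold:

```python
def flag_variances(actuals: dict, targets: dict, threshold: float = 0.10) -> list[tuple]:
    """Return (kpi, relative variance) for every KPI that misses its target by more than threshold."""
    flags = []
    for kpi, target in targets.items():
        actual = actuals.get(kpi)
        if actual is None or target == 0:
            flags.append((kpi, None))   # missing data is flagged, never skipped silently
            continue
        variance = (actual - target) / target
        if abs(variance) > threshold:
            flags.append((kpi, round(variance, 3)))
    return flags
```

The writer agent can then be handed only the flagged items and asked for commentary, with the figures already fixed.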

All savings are indicative estimates based on typical SME staffing ratios. Actual impact depends on process complexity, data quality, and implementation approach.

How the pieces connect

Every production agent deployment follows this structure — from the trigger that starts the job to the audit trail that proves it was done right.

TRIGGER

Webhook · Scheduled cron · User action · Email arrival
Returns a job ID immediately. Agent processes asynchronously.
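
"Returns a job ID immediately" can be sketched with Python's standard `concurrent.futures` module. The in-memory `JOBS` table and the `run_agent_job` placeholder are assumptions; a production system would more likely use a durable queue.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

EXECUTOR = ThreadPoolExecutor(max_workers=2)
JOBS: dict[str, Future] = {}

def run_agent_job(payload: str) -> str:
    """Placeholder for the real agent run."""
    return f"processed: {payload}"

def trigger(payload: str) -> str:
    """Start the job in the background and hand back its ID straight away."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = EXECUTOR.submit(run_agent_job, payload)
    return job_id

def job_status(job_id: str) -> dict:
    future = JOBS[job_id]
    if not future.done():
        return {"status": "running"}
    error = future.exception()
    if error is not None:
        return {"status": "failed", "error": str(error)}
    return {"status": "done", "result": future.result()}
```

The caller (a webhook handler, for instance) stores the ID and polls `job_status`, so a slow agent run never blocks the request that started it.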

ORCHESTRATOR AGENT

Reads memory → breaks task into sub-tasks → delegates to specialists → synthesises final output

Specialist A

Narrow scope. Focused system prompt. Specific tools only.

Specialist B

Narrow scope. Focused system prompt. Specific tools only.

Specialist C

Narrow scope. Focused system prompt. Specific tools only.

Human Gate

HIGH-risk actions pause here. Human approves or rejects.
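
A human gate can be as simple as a queue that holds HIGH-risk actions until someone decides. The class below is a minimal in-memory sketch; the `HumanGate` name, the risk labels other than HIGH, and the integer IDs are assumptions.

```python
from typing import Callable, Optional

class HumanGate:
    def __init__(self) -> None:
        self.pending: dict[int, tuple[str, Callable[[], str]]] = {}
        self._next_id = 1

    def submit(self, description: str, risk: str, action: Callable[[], str]) -> Optional[str]:
        """Run low-risk actions at once; park HIGH-risk ones for review."""
        if risk != "HIGH":
            return action()
        self.pending[self._next_id] = (description, action)
        self._next_id += 1
        return None

    def decide(self, action_id: int, approved: bool) -> Optional[str]:
        """Called by the human reviewer: run the action on approval, drop it on rejection."""
        _description, action = self.pending.pop(action_id)
        return action() if approved else None
```

The agent never holds the ability to run a HIGH-risk action itself; it can only hand it to the gate.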

Memory Store

Business context persists across sessions. Agent reads before acting, writes after.
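
"Reads before acting, writes after" can be sketched with a small JSON file. This is an assumption-heavy illustration: the `MemoryStore` class, the file format and the `runs` counter are invented for the example, and a real deployment would likely use a database.

```python
import json
from pathlib import Path

class MemoryStore:
    def __init__(self, path: str) -> None:
        self.path = Path(path)

    def read(self) -> dict:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def write(self, updates: dict) -> None:
        memory = self.read()
        memory.update(updates)
        self.path.write_text(json.dumps(memory, indent=2), encoding="utf-8")

def run_session(store: MemoryStore, task: str) -> int:
    context = store.read()                          # read before acting
    runs = context.get("runs", 0) + 1               # (the agent's work would happen here)
    store.write({"runs": runs, "last_task": task})  # write after
    return runs
```

Because the store lives outside the agent process, the second session starts with what the first one learned.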

Audit Log

Append-only. Every action logged: agent, tool, input, output, timestamp.
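
The five fields listed above map directly onto one JSON object per line. A minimal sketch using a local file, where the `log_action` name and the JSON Lines format are assumptions:

```python
import json
from datetime import datetime, timezone

def log_action(path: str, agent: str, tool: str, input_data, output_data) -> dict:
    """Append one audit entry; existing lines are never rewritten."""
    entry = {
        "agent": agent,
        "tool": tool,
        "input": input_data,
        "output": output_data,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    with open(path, "a", encoding="utf-8") as handle:  # "a" mode only ever appends
        handle.write(json.dumps(entry) + "\n")
    return entry
```

Opening the file in append mode keeps this code from overwriting history, but it does not stop someone else editing the file. A production audit trail would add storage-level protection such as write-once buckets or restricted database permissions.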

The 70% Principle — what AI must never own

AI should handle the 70% that is repetitive, rules-heavy, and judgement-light. The remaining 30% requires human accountability — not because the AI can't produce an output, but because the consequences of being wrong require a human to own them.

AI owns this (the 70%)

Data extraction and entry — parsing invoices, pulling CRM records, populating fields

Classification and scoring — leads, tickets, calls, documents

First-draft communication — emails, reports, proposals (human reviews before send)

Monitoring and alerting — watching for anomalies, flagging exceptions

Routine approvals — matched invoices, standard leave requests, tier-1 support

Humans must own this (the 30%)

Strategic direction — where the business is going and why

Key relationships — trust, empathy, negotiation, conflict resolution

Legal and compliance sign-off — final approval on contracts, regulatory filings

Performance management — hiring, firing, performance conversations

Crisis judgment — decisions under pressure with incomplete information

The 90-day playbook

Land one workflow. Prove the number. Build trust. Then scale. The technology is 30% of the job — rebuilding the workflow and getting people to use it is 70%.

Weeks 1–4

Diagnose & Quick Win

  • Map every manual, repetitive workflow
  • Baseline each: volume, time, error rate, cost
  • Pick one high-leverage, low-risk process
  • Deploy Level 1: structured prompt, first measurable result
Months 2–3

Foundation

  • Add tool connections: CRM, accounting, comms
  • Wire the tool-use loop (Level 2)
  • Implement audit logging from day one
  • Prove the ROI number — present with evidence
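
Proving the number starts from the baseline captured in weeks 1–4. A minimal sketch of the arithmetic, where the function name and every figure in the usage note are hypothetical:

```python
def weekly_saving(volume_per_week: int, minutes_before: float,
                  minutes_after: float, hourly_cost: float) -> dict:
    """Hours and cost saved per week for one workflow, from measured before/after times."""
    hours_saved = volume_per_week * (minutes_before - minutes_after) / 60
    return {
        "hours_saved": round(hours_saved, 1),
        "cost_saved": round(hours_saved * hourly_cost, 2),
    }
```

For instance, 120 items a week that drop from 10 minutes to 4 minutes each, at a loaded cost of 50 per hour, gives `weekly_saving(120, 10, 4, 50)`, which works out to 12 hours and 600 per week. The inputs should be measured values from the audit log and the original baseline, not estimates.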
Months 3–6

Scale

  • Split into orchestrator + specialists (Level 3)
  • Add persistent memory across sessions
  • Human-in-the-loop gates for all HIGH-risk actions
  • Expand to second and third function

See where your business should start

Paste a description of your workflows. Get a structured diagnostic with your top 3 automation opportunities and a prioritised 90-day roadmap.