
My COO Is an AI

I structured my AI assistant like an organization: a chief of staff, two advisors, and domain experts. It sounds absurd. It works.

I experiment with technology on weekends. Multiple explorations running in parallel, each at a different stage. Keeping track of what I decided, why, and what's stalled across all of them was becoming impossible.

So I did what felt natural after years of managing a large account with multiple stakeholders: I gave my AI assistant an organizational structure.

One point of contact

Most people use AI as a question-and-answer tool. You ask, it responds, you move on. I wanted something different: a single AI that I talk to, that knows all my explorations, and that coordinates with specialized agents and advisors when the question is bigger than what it can handle alone.

I talk to one agent. It decides who else needs to be involved. Sometimes the answer is nobody. Sometimes it convenes two advisors with different perspectives. Sometimes it pulls in a domain specialist. But I never have to think about routing. One conversation, one point of contact.

The structure

Me: Direction, priorities, judgment calls. I decide what to explore and why.

The AI (chief of staff role): Cross-exploration awareness, proactive alerts, institutional memory, operational coordination. It tracks whether things are progressing, what's falling through the cracks, and what the data says about decisions I've already made.

Most questions, the chief of staff handles alone. It has enough context from session history and decision logs to give a solid answer. But for bigger decisions, it knows when to escalate.

It coordinates with two specialized advisors, each with a different thinking style:

  • The Theorist grounds discussions in research and frameworks. Cites cognitive psychology, systems theory, established methodology.
  • The Validator builds evaluation frameworks. Where the Theorist says "the research suggests X," the Validator says "here's how we'd test whether X actually works, with measurable criteria."

The advisors respond independently to avoid anchoring bias, then the chief of staff synthesizes, notes disagreements, and gives its own recommendation. One agent coordinating specialists, not a committee.
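The independence constraint is the important part: each advisor answers before seeing the other's reply. A sketch of that convening step, with made-up function and field names:

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    advisor: str
    recommendation: str

def convene(question: str, advisors: dict) -> dict:
    # Each advisor answers independently -- none sees the others' replies,
    # which is what avoids anchoring bias.
    opinions = [Opinion(name, ask(question)) for name, ask in advisors.items()]
    # The chief of staff synthesizes afterwards: here we only flag
    # whether the recommendations diverge.
    positions = {o.recommendation for o in opinions}
    return {"opinions": opinions, "disagreement": len(positions) > 1}
```

Raw opinions come back unmerged, so disagreements survive into the synthesis instead of being averaged away.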

How a decision actually flows

Here's an example. I'd been noticing that my AI kept forgetting decisions between sessions, repeating mistakes I'd already corrected, losing the reasoning behind choices. I asked the chief of staff: "Should we build a proper memory system, or just accept the amnesia and re-explain things each time?"

Step 1: The chief of staff reviews context. Runs an audit and finds significant gaps: most explorations have no persistent knowledge, correction files aren't being loaded, and decision reasoning is routinely lost.

Step 2: Convenes the advisors with the audit data.

Step 3: The Theorist maps the problem onto the Atkinson-Shiffrin memory model from cognitive psychology. Session logs are raw sensory input that decays fast. What's missing is the encoding step: filtering raw input into something retrievable. Recommends a centralized semantic index with federated context stores.

Step 4: The Validator builds a measurement framework. Six metrics including recall accuracy, citation rate, and false memory rate. Defines concrete "done" criteria for each phase: "100% feedback files indexed, decisions loaded at startup, baseline dashboard live." Specifies failure modes and how to detect them.
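The Validator's framework amounts to named metrics with pass/fail thresholds. A toy version, using three of the six metrics named above; the target values here are illustrative assumptions, not the real criteria:

```python
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    target: float              # threshold that counts as "done"
    higher_is_better: bool = True

    def passed(self, observed: float) -> bool:
        # A rate like false_memory_rate must stay *below* its target.
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target

# Illustrative targets only.
METRICS = [
    Metric("recall_accuracy", target=0.90),
    Metric("citation_rate", target=0.80),
    Metric("false_memory_rate", target=0.05, higher_is_better=False),
]
```

Making "done" a number rather than a feeling is exactly the Validator's job.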

Step 5: The chief of staff synthesizes. Both advisors agree the problem is retrieval, not storage. Recommends a phased build starting with the highest-value files: the 22 correction files the audit flagged. Proposes a 60-question evaluation suite to measure improvement.

Step 6: I decide. Agree with the phased approach. Build starts that weekend. The result was dramatic.

Total time for me: about 15 minutes of reading and deciding. The system did the research, the framework building, and the synthesis. I did the judgment.

Every session like this gets logged as dated minutes. Raw advisor opinions preserved. Synthesis kept separate. Decisions and action items explicit.
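The minutes format itself is simple: one dated file per session, raw opinions kept apart from the synthesis. A minimal sketch, assuming JSON files and hypothetical field names:

```python
import json
from datetime import date
from pathlib import Path

def log_minutes(outdir: Path, raw_opinions: dict, synthesis: str, decisions: list) -> Path:
    # One dated file per session; raw advisor opinions are preserved
    # verbatim, separate from the chief of staff's synthesis.
    record = {
        "date": date.today().isoformat(),
        "raw_opinions": raw_opinions,   # unmerged, per advisor
        "synthesis": synthesis,         # the merged recommendation
        "decisions": decisions,         # explicit decisions and action items
    }
    path = outdir / f"minutes-{record['date']}.json"
    path.write_text(json.dumps(record, indent=2))
    return path
```

Because the files are dated and append-only, any future session can cite the exact minutes behind a past decision.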

What this is not

This is not a chatbot with a fancy title.

A chatbot answers questions when asked. My system reads context proactively, notices when things stall, flags forgotten follow-ups, and connects patterns across explorations. A chatbot forgets when you close the tab. This system's memory spans sessions through versioned knowledge files.

This is also not "AI replacing thinking." The system produces zero value without me making decisions and exercising judgment. What it replaces is the cognitive overhead of keeping track of everything.

The compound effect

That memory architecture session didn't just produce a decision. It produced minutes that became institutional memory, searchable context for every future decision. This is compound engineering in practice: each session makes the next one smarter.

Such an elaborate organization for weekend tech musings sounds absurd. But it works.


I didn't set out to build an org chart for my AI. I set out to stop forgetting things. The structure emerged because it was useful.