
How I Gave My AI a Memory

Every AI session starts from scratch. I built a system that fixes that: a semantically indexed memory database, multi-machine sync, and a core memory injected before the session even starts.

Every AI conversation starts from scratch. You explain your context. You correct the same mistakes you corrected last week. Then you close the tab and it all vanishes.

I'd been living with this for months. The AI I work with is capable: it writes good code, catches mistakes, proposes architecture. But every session, it wakes up with amnesia. It doesn't know what we decided last weekend, what I corrected last week, or why we chose one approach over another.

Then I read an article by Zak El Fassi titled "How Do You Want to Remember?" He showed that reorganizing how an AI's knowledge is structured boosted recall from 60% to 93%. No model upgrade. Just better architecture.

I ran an audit of my own setup. (The decision to build this came from a structured process I use for bigger questions.) The results were brutal.

The audit

Nine of ten explorations had no persistent knowledge files. The session log was 6,800 lines long and nobody read it. I had 22 files documenting things I'd corrected the AI on. Eleven of them had never been loaded into a session.

The worst finding: 60% of the time, when we made a decision, the reasoning behind it was gone. Not the decision itself. The why. We'd know "we chose two-stage retrieval" but not why we rejected pure vector search.

We weren't just forgetting details. We were losing the ability to learn from our own decisions.

The question that changed everything

I asked the AI something I'd never asked before: "What would you change about how you remember our conversations if you could?"

The answer was specific:

  1. Decisions linked to evidence. Not just "we chose X" but why, with a source.
  2. Compression by topic, not by date. Memory organized by what matters, not when it happened.
  3. Confidence scores. Know what it doesn't know, instead of confidently guessing.
  4. Feedback as first-class data. The 22 corrections I'd given should load before anything else.
  5. A briefing, not a log. A synthesized narrative at session start, not 6,800 raw lines.
  6. A mistakes index. An aviation black-box approach to errors, because mistakes are the fastest path to improvement.

I decided to take the answer seriously and build all six.

The architecture

I named the project Anamnesis. Greek for "recollection." The knowledge already existed across hundreds of documents. The problem was retrieval.

The search layer. SQLite with FTS5 (full-text search). Every document gets chunked and indexed. A query like "why did we choose this database?" returns relevant chunks in under 100 milliseconds.
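A minimal sketch of that layer, assuming a single FTS5 table; the schema and sample rows here are illustrative, not the real indexer's:

```python
import sqlite3

# In-memory demo of the FTS5 search layer. The real database lives on
# disk and is populated by the indexer; this schema is an assumption.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE chunks USING fts5(source, content)")
db.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("decisions/db.md", "We chose SQLite as the database for portability."),
        ("sessions/042.md", "Discussed retrieval latency targets."),
    ],
)

def search(query: str, limit: int = 5):
    # bm25() ranks matches; lower scores mean better relevance in FTS5.
    return db.execute(
        "SELECT source, content FROM chunks "
        "WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    ).fetchall()

print(search("database"))
```

Because the index lives in one SQLite file, there's no separate search service to run or keep in sync.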

The embedding layer. Full-text search finds exact word matches. But "deployment configuration" and "server setup" and "infrastructure provisioning" all mean similar things without sharing words. So I added embeddings: each chunk converted into a numerical vector using Google's embedding model. Similar concepts cluster together in vector space. The search does FTS first (fast, exact), then re-ranks by semantic similarity (slower, smarter).
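The re-ranking step reduces to cosine similarity over vectors. Here's a toy version where tiny hand-made vectors stand in for Google's real embedding model; the candidate list is what the FTS pass would have returned:

```python
import math

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rerank(query_vec, candidates):
    # candidates: list of (chunk_text, chunk_vec) pairs from the FTS pass.
    return sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)

# Illustrative vectors: "server setup" sits near the query in vector
# space even though the words differ; "meeting schedule" does not.
query_vec = [0.9, 0.1, 0.0]
candidates = [
    ("server setup notes", [0.8, 0.2, 0.1]),
    ("meeting schedule", [0.0, 0.1, 0.9]),
]
print(rerank(query_vec, candidates)[0][0])
```

The division of labor is the point: FTS keeps the candidate set small, so the expensive similarity pass only touches a handful of chunks.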

The indexer. A Python script walks all directories, reads every markdown file, decision log, session summary, feedback file, and solution document. It chunks by heading, computes embeddings, and stores everything in SQLite. First build: 639 sources, 14,613 chunks. Rebuilds in about 90 seconds.
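Heading-based chunking is the step that keeps each chunk about one topic. A minimal sketch of how it might work (the real indexer also computes embeddings and writes to SQLite):

```python
import re

def chunk_by_heading(markdown: str):
    # Split a markdown document into chunks, starting a new chunk at
    # each heading so every chunk covers a single topic.
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = (
    "# Decision\nWe chose two-stage retrieval.\n"
    "## Why\nPure vector search missed exact terms."
)
print(len(chunk_by_heading(doc)))
```

Chunking by heading rather than by fixed character count means a chunk's heading doubles as a label for what the chunk is about.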

The core memory injection. At session start, before I say anything, a hook injects a dynamic identity and context payload: the AI's identity and purpose, what explorations are active, what decisions were made recently, a failure atlas of mistakes to avoid, and every correction I've given. It's the closest thing to a conscience: a sense of self that persists across sessions. This is the difference between passive memory (it can search if it knows to look) and active memory (it already knows what matters before the conversation begins).
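A hypothetical sketch of what that hook assembles; the file layout and section names are my guesses for illustration, not the real payload:

```python
from pathlib import Path

def build_core_memory(root: Path) -> str:
    # Assemble the session-start briefing from the memory store.
    # Section names and paths are illustrative assumptions.
    sections = {
        "IDENTITY": root / "identity.md",
        "ACTIVE EXPLORATIONS": root / "active.md",
        "RECENT DECISIONS": root / "decisions/recent.md",
        "FAILURE ATLAS": root / "mistakes/index.md",
        "CORRECTIONS": root / "feedback/all.md",
    }
    parts = []
    for title, path in sections.items():
        body = path.read_text() if path.exists() else "(none yet)"
        parts.append(f"## {title}\n{body.strip()}")
    # The hook injects this payload before the user's first message.
    return "\n\n".join(parts)
```

The important property is that the payload is rebuilt fresh each session, so yesterday's corrections are already in place before the first prompt.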

Multi-machine sync

I work across three machines: a primary desktop, a secondary Mac Mini that runs overnight automation, and a VPS for monitoring. The memory database needs to stay consistent across all three.

A sync script runs every 15 minutes. It copies the SQLite database from the primary to both replicas. If the primary is unreachable, replicas serve stale data (better than nothing). A hash check on the server code triggers automatic restarts when the codebase changes, so all three machines stay current without manual deployment.
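A rough sketch of what one sync pass might look like; the paths, state dict, and restart callback are illustrative assumptions, not the actual script:

```python
import hashlib
import shutil
from pathlib import Path

def code_hash(directory: Path) -> str:
    # Hash all server code so any change to the codebase is detected.
    h = hashlib.sha256()
    for f in sorted(directory.rglob("*.py")):
        h.update(f.read_bytes())
    return h.hexdigest()

def sync_once(primary_db: Path, replica_db: Path, server_dir: Path,
              state: dict, restart) -> None:
    # Copy the primary's database to the replica. If the primary is
    # unreachable, the replica simply keeps serving its stale copy.
    if primary_db.exists():
        shutil.copy2(primary_db, replica_db)
    # Restart the server when the code hash changes since the last pass,
    # e.g. via `systemctl restart anamnesis` on the real machines.
    new_hash = code_hash(server_dir)
    if state.get("hash") not in (None, new_hash):
        restart()
    state["hash"] = new_hash
```

Run from a cron entry every 15 minutes, this keeps replicas at most one interval behind the primary, a tolerable staleness for a memory store.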

The failover chain: desktop, Mac, VPS. Three origins, one consistent memory.

I also built a secured MCP (Model Context Protocol) server on top of the database, tunneled through Cloudflare. This lets me access the full memory system from cloud-based AI sessions, not just local ones. The memory travels with me.

The result

Before Anamnesis, every session started with 5-10 minutes of context rebuilding. Now the AI arrives knowing what we worked on last weekend, what decisions are pending, what mistakes to avoid. The 22 correction files load before anything else. Session history is compressed into thematic summaries instead of 6,800 raw lines. Decisions carry their reasoning with them.

The AI models keep getting better. Larger context windows, better reasoning, more capable tool use. But the architecture around the model matters just as much as the model itself.

Zak's original insight was right: structuring knowledge into a searchable, semantically indexed database, not upgrading models, produced the biggest improvement. The same model, with structured memory behind it, went from forgetting 60% of decisions to recalling 92% of them.

If you work with AI regularly and find yourself re-explaining context every session, the problem probably isn't the model. It's the memory architecture. The knowledge exists. The retrieval is the missing piece.


Anamnesis isn't perfect. But 92% recall is a different world from 40%. And the compound effect means it gets better every week.