The wiki your LLM maintains.
Most knowledge bases die because nobody wants to do the bookkeeping. Wikis go stale. Notion graveyards have a wing for every team. RAG resets every conversation. This pattern flips the maintenance economics: the LLM does the bookkeeping, you do the curation, and the knowledge actually compounds. Build it yourself in an evening.
Andrej Karpathy published a gist describing this pattern. We've been running it for a couple of months across two of our own operations and one client. It works. This article is the operator's version: what it is, what it gets you, and the markdown you need to drop in to start using it tonight.
The problem, in one paragraph
You read three articles, sit through a meeting, and review last quarter's report. Each one teaches you something. By next week you remember roughly two of those things. By next quarter, none. The "knowledge base" your team has, if you have one, is a graveyard of pages from 2023 nobody trusts. RAG over your documents helps a little: ask the AI a question, it pulls relevant chunks, gives you an answer, and forgets everything before the next question. Nothing accumulates. Nothing compounds. Every conversation starts from zero.
The reason isn't laziness. It's that the tedious part of a knowledge base isn't the reading or thinking. It's the bookkeeping: updating cross-references, keeping summaries current, noting when a new document contradicts an old one, maintaining consistency across dozens of pages. Humans don't have the patience for it. The wiki dies.
The pattern
Three layers, three operations, two index files. That's it.
Three layers:
- Raw sources. The documents you've curated. Articles, PDFs, transcripts, reports, exports. The LLM reads them. It never modifies them. They're your source of truth.
- The wiki. A folder of LLM-written markdown files. Entity pages (clients, suppliers, products, people). Concept pages (methodologies, recurring patterns, gotchas). Source summaries (one per ingested document). The LLM owns this layer entirely.
- The schema. One file telling the LLM how the wiki is structured, what the conventions are, what to do when ingesting new sources. For Claude Code this is CLAUDE.md; for Codex it's AGENTS.md. This is the configuration that makes the LLM a disciplined wiki maintainer rather than a generic chatbot.
Three operations:
- Ingest. You drop a new source in. The LLM reads it, writes a summary page, updates relevant entity and concept pages, refreshes the index, logs the ingest. One source might touch ten or fifteen wiki pages. The synthesis is done once, then kept current.
- Query. You ask a question. The LLM reads the index first to find relevant pages, drills into those pages, synthesises an answer with citations. The answer draws on pages that were synthesised once at ingest, rather than being re-derived from the raw sources on every question.
- Lint. Periodically (we run it monthly) the LLM walks the wiki looking for orphan pages, missing cross-references, stale claims, contradictions, and gaps where new ingests would help. It surfaces problems without fixing them, so you stay in the loop on judgment calls.
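One of the lint checks, orphan detection, is mechanical enough to sketch in code. The snippet below is a minimal illustration, not the way the LLM itself performs the lint: it assumes cross-references are written as `[[slug]]` links, that slugs match file names without the `.md` extension, and that slugs are unique across folders. The function name `find_orphans` is hypothetical.

```python
import re
from pathlib import Path

# Matches the target of a [[slug]], [[slug|label]] or [[slug#section]] link.
LINK_RE = re.compile(r"\[\[([^\]|#]+)")

def find_orphans(wiki_dir):
    """Return wiki pages that no other page links to, as sorted relative paths."""
    wiki = Path(wiki_dir)
    # The index lists every page by design, so its links are not counted as inbound.
    skip = {"index.md", "README.md"}
    pages = {p.stem: p for p in wiki.rglob("*.md") if p.name not in skip}
    linked = set()
    for path in pages.values():
        text = path.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            target = target.strip()
            if target != path.stem:  # a page linking to itself is still an orphan
                linked.add(target)
    return sorted(
        p.relative_to(wiki).as_posix()
        for stem, p in pages.items()
        if stem not in linked
    )
```

Calling `find_orphans("wiki")` returns a list of paths to review. Like the lint itself, it only reports; nothing is changed on disk.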
Two index files at the root of the wiki:
- index.md: a catalog of every page with a one-line description. Read first by every query. Updated on every ingest.
- A log file (we use the project's existing NOTES.md, but you can call it anything). Append-only record of what happened: ingests, queries, lint passes. Useful for tracing why a page changed.
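Because the log is plain text with one header per entry, tracing why a page changed can be done with a few lines of code. The sketch below assumes each entry starts with a `## YYYY-MM-DD <type>` header followed by an optional separator and title, which is the format used in the NOTES.md template in this article. The function names `parse_log` and `recent` are hypothetical.

```python
import re

# "## 2025-01-05 ingest <separator> some title" -> date, type, remainder of the line
HEADER_RE = re.compile(r"^## (\d{4}-\d{2}-\d{2}) (\S+)\s*(.*)$")

def parse_log(text):
    """Return one dict per log entry header, in file order."""
    entries = []
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            date, kind, rest = match.groups()
            # Drop a leading separator (dash, pipe, colon) before the title, if any.
            title = rest.lstrip(" -|:\u2014\u2013").strip()
            entries.append({"date": date, "type": kind, "title": title})
    return entries

def recent(text, kind=None, limit=10):
    """Return the newest entries first, optionally filtered by entry type."""
    entries = parse_log(text)
    if kind is not None:
        entries = [e for e in entries if e["type"] == kind]
    # ISO dates sort correctly as plain strings.
    entries.sort(key=lambda e: e["date"], reverse=True)
    return entries[:limit]
```

For example, `recent(Path("NOTES.md").read_text(), kind="ingest", limit=5)` (with `Path` imported from `pathlib`) would list the five latest ingests, regardless of whether the file is kept newest-first or oldest-first.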
Why this works where wikis usually don't
The bookkeeping cost goes from "a person, sometimes, when they remember" to "the LLM, every time, automatically." The LLM doesn't get bored. It doesn't forget to update a cross-reference because it was tired. It can touch fifteen files in one pass without losing focus. The wiki stays maintained because the cost of maintenance is near zero.
You're left with the work that actually requires judgment: deciding what to ingest, asking the right questions, deciding when a contradiction needs resolving, and trusting the synthesis (or not). That's the part you wanted to do anyway.
The other thing that makes this work: the wiki is plain markdown in plain folders. Not a hosted service. Not a graph database. Not a vector store. Files. You can edit them by hand. You can grep them. You can git diff them. If your AI vendor disappears tomorrow, your wiki is still there, fully readable, and another LLM can pick up exactly where the last one stopped.
What you can do with it (individual)
The personal use cases that have stuck for us:
- Going deep on a topic. You're researching something for weeks or months. Books, podcasts, papers, articles. Each one gets ingested. By month two, asking the wiki "what do we know about X" returns a synthesis across forty sources, with citations, instantly. The thesis evolves as you go.
- Reading a long book. File each chapter. Build out pages for characters, themes, plot threads. By the end you have a personal companion wiki, the kind communities spend years building on Tolkien Gateway or fan wikis, but for one reader, built up one evening session at a time.
- Tracking your own goals or health. Journal entries, articles, podcast notes about a habit you're changing. The wiki builds a structured picture of yourself over time without you re-reading old journals to find what you said last March.
- Course notes, hobby deep-dives, trip planning, due diligence. Anything where you're accumulating knowledge over time and want it organized rather than scattered.
What you can do with it (organisation)
For a 2 to 50 person operation, the org-shaped uses we've watched work:
- The team wiki nobody had to assign. Slack threads, meeting transcripts, customer call recordings, project documents. They get ingested as they happen. Customer entity pages accumulate everything we know about each account: what they bought, what they complained about, who their main contact is, what the last call covered. A new joiner reads three pages and is up to speed.
- Customer support knowledge that compounds. Each support analysis (contact reasons, refund patterns, recurring complaints) gets filed. The wiki notices when a complaint pattern is the third one this quarter on the same product. Patterns surface; humans decide.
- Compliance and incident records. Every quality complaint, every supplier incident, every recall. Filed once, cross-referenced forever. When a similar incident happens, the wiki already knows the supplier history.
- Sales conversations that don't get lost. Read.ai or Fireflies transcripts get ingested into a prospects-and-clients wiki. Across calls, the LLM builds out who said what, when, and how the deal evolved. No more "what did we agree last week" questions.
- An institutional memory that survives turnover. The reason a process is set up the way it is. The reason a supplier was switched. The reason a customer is on a custom price. All filed, all cross-referenced, all queryable months or years later.
Crucially, no one is assigned to maintain it. The LLM does that. The team just feeds it (sources in) and asks it questions (queries out). Humans review the LLM's edits when they want. The wiki keeps current because there's no human bottleneck.
Build your own
The whole pattern is markdown in a folder. Here's the minimum viable shape:
your-project/
  CLAUDE.md        # the schema (or AGENTS.md for Codex)
  raw/             # source documents (immutable)
  wiki/
    index.md       # catalog of every wiki page
    README.md      # conventions for this wiki
    entities/      # one page per real-world thing
    concepts/      # one page per methodology/pattern
    sources/       # one summary page per ingested source
  NOTES.md         # chronological log
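If you would rather not create the folders by hand, the layout can be scaffolded with a short script. This is a minimal sketch under one assumption worth flagging: it places NOTES.md at the project root next to CLAUDE.md, which is one reading of the listing above. The function name `scaffold` is hypothetical, and the files it creates are empty placeholders for you to fill in.

```python
from pathlib import Path

# Assumed layout: NOTES.md sits at the project root, next to the schema file.
FOLDERS = ["raw", "wiki/entities", "wiki/concepts", "wiki/sources"]
FILES = ["CLAUDE.md", "NOTES.md", "wiki/index.md", "wiki/README.md"]

def scaffold(root):
    """Create the folder layout; return the relative paths that were newly created."""
    root = Path(root)
    created = []
    for folder in FOLDERS:
        path = root / folder
        if not path.exists():
            path.mkdir(parents=True)
            created.append(folder + "/")
    for name in FILES:
        path = root / name
        if not path.exists():  # never overwrite a file that is already there
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
            created.append(name)
    return created
```

Running `scaffold("your-project")` a second time returns an empty list and leaves existing files untouched, so it is safe to run against a project that already has a CLAUDE.md.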
The four files below are the minimum you need. Drop them into your project, point an LLM agent at the project folder, and the pattern works from the next conversation onward.
// 1 OF 4 — CLAUDE.md (the schema)
# <your project name>
This project uses the LLM-wiki pattern. The wiki under `wiki/` is a
compounding knowledge base maintained by you (the LLM) on my behalf.
You write the wiki. I read it.
## Layers
- `raw/` is immutable source documents. Read from it; never modify it.
- `wiki/` is your output. Markdown files. You own this folder entirely.
- `NOTES.md` is the chronological log. Append entries; do not overwrite.
## Wiki structure
- `wiki/index.md` is the catalog. Every page is listed with a one-line
description. Read this first on every query.
- `wiki/entities/` holds one page per proper-noun thing
(clients, suppliers, products, people, places, organisations).
- `wiki/concepts/` holds one page per methodology, recurring pattern,
or framework that comes up repeatedly.
- `wiki/sources/` holds one dated summary page per ingested source.
Naming: `YYYY-MM-DD-short-slug.md`.
## Operations
When I say "ingest this", read the source, write a summary page in
`wiki/sources/`, update relevant entity and concept pages, refresh
`wiki/index.md`, and append a `## YYYY-MM-DD ingest` entry to NOTES.md.
When I ask a domain question, read `wiki/index.md` first, drill into
the relevant pages, synthesise an answer with citations to wiki page
paths. If a synthesis is substantial, offer to file it back as a
new concept page.
When I say "lint the wiki", walk every page and surface: orphans
(no inbound links), missing cross-references (entity slugs mentioned
but not linked), stale claims (pages whose updates section is months
behind newer sources), contradictions, gaps. Read-only. I decide what
to fix.
## What does not go in the wiki
- Live data that changes every week. The wiki captures patterns, not
point-in-time numbers.
- Credentials, tokens, customer PII, or anything I have flagged
confidential.
- Things the source documents already cover. The wiki summarises and
cross-references; it does not duplicate.
## Style
Plain markdown. Use `[[wiki-style links]]` for cross-references.
Add YAML frontmatter to pages with type, dates, and source counts.
// 2 OF 4 — wiki/README.md (local conventions)
# <project> wiki
Compounding knowledge base. The LLM writes; I read.
## What goes where
**Entities.** Proper nouns. Things that exist in the real world.
Examples for this project: <list 5 to 10 entity types you expect>.
**Concepts.** Methodologies, recurring patterns, gotchas.
Examples for this project: <list 5 to 10 concept candidates>.
**Sources.** One dated summary per ingested document. Slug format:
`YYYY-MM-DD-short-slug.md`.
## What does NOT go here
- Live data (use the relevant tool).
- Personal/business facts not specific to this domain.
- Credentials.
- Procedural how-tos (those belong in a skill or runbook).
## Operations
- `/wiki-ingest` files a source. Updates entities, concepts, sources,
index, NOTES log.
- `/wiki-query` answers from the wiki, cited.
- `/wiki-lint` monthly health check.
// 3 OF 4 — wiki/index.md (starts empty)
# <project> wiki — index
Last updated: <today>. Maintained by ingest and lint passes.
This is the catalog. Every query reads this first. Keep it tight.
## Entities
_(Empty. Populated by ingests.)_
## Concepts
_(Empty.)_
## Sources
_(Empty.)_
## Open questions
_(Empty.)_
## Open contradictions
_(Empty.)_
// 4 OF 4 — NOTES.md (chronological log header)
# <project> — log
Most recent first. Each entry has a `## YYYY-MM-DD <type> — <title>`
header. Types: `ingest`, `lint`, `session`, `query-synthesis`.
Parseable with: `grep "^## [0-9]" NOTES.md | head -10`
(Entries appended automatically by the operations. Do not hand-edit
unless correcting a typo.)
The agent reads CLAUDE.md at the start of every conversation and knows what to do.
Manage it (what to do, when)
The pattern is self-maintaining if you give it a few hooks:
- Day one to week two. Ingest one source per session. Read the resulting wiki pages. Refine the schema (the conventions in your CLAUDE.md) based on what works for your domain. The first ten ingests teach you what entity and concept categories make sense for you.
- Week two onward. Ingest happens almost reflexively. Most sessions either ingest or query. The wiki accumulates without effort.
- Every month. Run the lint. We schedule it as a monthly cron via Claude Code's scheduled-tasks (other agent platforms have similar primitives). It walks the wiki, surfaces drift, suggests new ingests to close gaps. Read-only by default; you decide what to act on.
- When the wiki gets large. Past a hundred or so pages, the index file alone may not be enough. At that point, add a small markdown search tool (we like qmd, but a 30-line grep wrapper works too) and tell the LLM to use it. You'll know when you need this; until then, the index is fine.
- When new projects spawn. Each project gets its own wiki. They don't merge. Cross-project sources can be referenced from both wikis without duplication.
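For the large-wiki case, the grep wrapper mentioned above can be very small. The sketch below is a stand-in written for illustration, not qmd and not the tool we run: it does a case-insensitive match on any query word and ranks pages by how many lines match. The function name `search_wiki` and its parameters are hypothetical.

```python
import re
from pathlib import Path

def search_wiki(wiki_dir, query, max_hits=20):
    """Case-insensitive search; returns (relative_path, line_number, line) tuples."""
    wiki = Path(wiki_dir)
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return []
    pattern = re.compile("|".join(terms), re.IGNORECASE)
    scored = []
    for path in sorted(wiki.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        hits = [
            (number, line.strip())
            for number, line in enumerate(lines, start=1)
            if pattern.search(line)
        ]
        if hits:
            scored.append((len(hits), path.relative_to(wiki).as_posix(), hits))
    # Pages with the most matching lines come first; ties break alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    results = []
    for _, rel_path, hits in scored:
        for number, line in hits:
            results.append((rel_path, number, line))
    return results[:max_hits]
```

To use something like this, you would add a line to the schema file telling the LLM to run the search before opening pages, so it reads only the files the search points to instead of scanning a very long index.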
What this isn't
- Not search. It's a synthesised knowledge base. If you want full-text search across raw documents, that's a different tool (use one in addition).
- Not authentication. The wiki is a folder of files. It has whatever access controls your filesystem has. For team use, put it in a private git repo.
- Not compliance-grade. The wiki captures patterns and synthesis. Compliance records, signed audit trails, and regulated documents stay in the systems that already handle them. The wiki references them; it doesn't replace them.
- Not a replacement for your underlying tools. You still query BigQuery for live numbers, still use your CRM for current contact details, still hit your ticketing system for ticket status. The wiki is the synthesis layer, not the data layer.
- Not opinionated about which agent. Karpathy's gist works with Claude Code, Codex, OpenCode, and others. The schema file name differs (CLAUDE.md, AGENTS.md); the pattern is the same.
The bigger idea
The pattern echoes something Vannevar Bush proposed in 1945 in his "As We May Think" essay: the Memex, a personal, curated knowledge store with associative trails between documents. Bush's vision was always closer to this than to what the public web became. Private, actively curated, with the connections between documents as valuable as the documents themselves. The piece Bush couldn't solve was who does the maintenance. LLMs do.
For an operator running a 10 to 50 person business, this pattern is the closest thing we've seen to "institutional memory in a box." It's free. It's plain text. It survives vendor changes. And it stops being something you have to remember to maintain.
What WildBreeze does with this pattern
For most operators, the pattern is a self-serve build. The four files above are everything you need.
If you want it wired into your existing tools (Slack feeding ingests, transcripts auto-ingesting, the wiki driving your weekly reports, lint passes that trigger on signal rather than calendar), that's a custom engagement. Three to six weeks, fixed price, the wiki ends up living in your environment, and your team feeds and queries it the way they already work. We've built variants for revenue ops, customer support, supplier management, and prospect tracking. Tell us what you're trying to remember.
Reference: Andrej Karpathy, LLM Wiki, May 2026. The original pattern description.