
Markdown Wikis Are Quietly Replacing RAG. Here's Why.

· Save Team
rag · llm · knowledge-base · karpathy · mcp · ai · save-vault · markdown

For two years, the default answer to “how do I give an LLM my knowledge?” was RAG. Build a vector database. Chunk your documents. Embed them. Run nearest-neighbor search at query time. Stitch the results back into the prompt.

It worked. Sort of. Anyone who’s actually shipped a RAG system knows the failure modes: chunks that lose context, embeddings that retrieve the wrong passage, opaque rankings, no provenance, weird edge cases when the user asks something the index wasn’t tuned for.

In April 2026, Andrej Karpathy posted a workflow that does almost none of that and works better for personal knowledge. He calls it LLM Knowledge Bases. The architecture is just a folder of markdown files, an LLM with filesystem access, and a habit. VentureBeat called it “an evolving markdown library maintained by AI” — a description that captures what’s actually new.

The post-RAG pattern is here. This article explains what it is, why it works, and how Save Vault makes it accessible without any developer setup.

What RAG Was Trying to Solve

The original problem: an LLM has a fixed context window, and your knowledge base is bigger than that window, so you need a way to retrieve the relevant slice for each question.

In 2023, vectors were the obvious answer. Embed everything, search by similarity, inject the top-k chunks. It composed nicely with the small context windows of GPT-3.5 and Claude 1. The whole “AI startup” pattern was “RAG over X.”
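The core of that pattern fits in a few lines. Here is a deliberately toy sketch of "embed everything, search by similarity, inject the top-k" using a bag-of-words counter in place of a real embedding model, just to make the retrieval step concrete; real systems use neural embeddings and an approximate-nearest-neighbor index, but the shape is the same.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real RAG uses a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank pre-chunked documents against the query and return the top k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Vector databases store embeddings for similarity search.",
    "Markdown files are plain text and portable.",
    "Context windows limit how much an LLM can read at once.",
]
print(retrieve("how do embeddings and similarity search work", chunks, k=1))
```

Every failure mode in the previous paragraph lives in one of these steps: chunking decides what a "document" is, the embedding decides what "similar" means, and the top-k cutoff decides what the model never sees.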

Three things changed.

  1. Context windows exploded. Claude shipped 1M-token context this year. Gemini and GPT-5 are similar. A million tokens is roughly 750,000 words — enough to hold a small wiki entirely in memory.
  2. Filesystem MCP shipped. LLMs can now open files on disk directly. They don’t need pre-indexed chunks. They can navigate, read, and re-read like a human.
  3. LLMs got better at reading. Claude Opus 4 can ingest hundreds of files in one session and reason across them coherently. The bottleneck moved from “retrieval quality” to “what does the human actually need.”

Once those three things were true, RAG started looking like a workaround for limitations that no longer exist.

What the Markdown Wiki Pattern Looks Like

Karpathy’s setup, simplified:

  1. Raw folder. Every web page he wants to keep gets saved as a .md file in a raw/ directory. He uses Obsidian Web Clipper for this.
  2. Compile pass. Periodically, an LLM agent (Claude Code in his case) reads everything in raw/, generates concept pages, writes summaries, and creates backlinks. This produces a structured wiki on top of the raw material.
  3. Query loop. When he has a question, he asks the LLM. It searches the wiki, opens the relevant files, and answers using the contents.
  4. Lint pass. Occasionally the LLM scans the wiki for inconsistencies, missing data, or new connections worth recording.
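To make the file plumbing of the compile pass concrete, here is a minimal, deterministic stand-in: scan `raw/` and emit an index page linking every note. In Karpathy's workflow an LLM agent writes the summaries, concept pages, and backlinks; this sketch only shows the directory convention the agent operates on, and the `index.md` filename is an assumption, not something he specifies.

```python
from pathlib import Path

def compile_index(vault: Path) -> str:
    """Collect every note in raw/ into a single index page.
    (The real compile pass is an LLM agent; this shows only the file layout.)"""
    lines = ["# Wiki Index", ""]
    for note in sorted((vault / "raw").glob("*.md")):
        # Use the note's first heading as its title, falling back to the filename.
        first = (note.read_text(encoding="utf-8").splitlines() or [note.stem])[0]
        title = first.lstrip("# ").strip() or note.stem
        lines.append(f"- [{title}]({note.relative_to(vault)})")
    index = "\n".join(lines) + "\n"
    (vault / "index.md").write_text(index, encoding="utf-8")
    return index
```

The point of the sketch is what it *doesn't* contain: no schema, no database, no build step that can drift out of sync with the source files.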

His current research wiki is ~100 articles and ~400K words. He asks it complex questions and gets sourced answers back.

There’s no vector database. No embedding model. No chunking strategy. No retrieval ranking. Just markdown files, a folder structure, and an LLM that can read them.
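"Retrieval" in this pattern is nothing more exotic than reading files. A grep-style scan, like the one an LLM's search tool runs over the folder, can be sketched in a dozen lines; the function name and return shape here are illustrative, not any particular tool's API.

```python
from pathlib import Path

def search(vault: Path, term: str) -> list[tuple[str, str]]:
    """Case-insensitive full-text search over every .md file in the vault.
    Returns (filename, matching line) pairs: no embeddings, no index to rebuild."""
    hits = []
    for f in sorted(vault.rglob("*.md")):
        for line in f.read_text(encoding="utf-8").splitlines():
            if term.lower() in line.lower():
                hits.append((f.name, line.strip()))
    return hits
```

Edit a file and the next search reflects it instantly, which is exactly the property the next section leans on.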

Why It Works Better Than RAG (For This)

The wiki pattern has structural advantages that RAG can’t match without becoming a wiki itself.

Provenance is free. Every answer cites a file. You can open it, read it, edit it, delete it. No “the embedding said so.”

Editing is trivial. A markdown file is text. Open it in any editor. Fix a typo. Add a note. Delete a section. The next query reflects the change immediately. There’s no re-embedding step.

Structure compounds. When the LLM compiles the wiki, it builds backlinks and concept pages. The wiki gets better the more you save, because the LLM has more context to connect new entries to. A vector index just gets bigger.
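The backlink structure is also just text. Assuming Obsidian-style `[[wikilinks]]` inside the notes (the convention Karpathy's Web Clipper setup implies, though the post doesn't mandate it), a sketch of the reverse map the compile pass maintains looks like this:

```python
import re
from pathlib import Path

def backlinks(vault: Path) -> dict[str, list[str]]:
    """Map each [[wikilink]] target to the notes that mention it.
    Handles [[Target]] and aliased [[Target|shown text]] links."""
    links: dict[str, list[str]] = {}
    for note in sorted(vault.glob("*.md")):
        for target in re.findall(r"\[\[([^\]|]+)", note.read_text(encoding="utf-8")):
            links.setdefault(target.strip(), []).append(note.stem)
    return links
```

Because the map is derived from the files rather than stored beside them, it can never go stale: re-run it and it reflects whatever you saved or edited since.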

Portability is total. A folder of .md files works in Obsidian, VS Code, GitHub, Logseq, vim, or cat. A vector database is a black box you need a specific runtime to read.

You can read it yourself. This sounds obvious, but it’s the biggest advantage. You will sometimes want to know what’s in your knowledge base. With RAG, that’s a reporting query. With markdown, it’s ls.

The honest tradeoff: RAG still wins when you have millions of documents, multi-tenant access, or hard latency constraints (think customer support chatbots over a corpus of millions of help articles). For personal knowledge — your reading, your research, your domain — the wiki pattern is now strictly better.

The Missing Piece: Ingestion

Karpathy’s pattern has a quiet assumption: that getting clean markdown into the raw/ folder is easy. For developers who already use Obsidian Web Clipper, it sort of is. For everyone else, this is the step where the workflow dies.

Web Clipper can struggle with paywalled pages, JavaScript-heavy sites, video content, X threads, and anything dynamic. People save garbled HTML, give up, and conclude “the wiki thing isn’t for me.”

The Save extension exists specifically to fix this step. It uses Gemini to extract clean content from arbitrary pages, including:

  • Articles behind paywalls you have access to
  • YouTube videos (full transcript + AI summary)
  • X/Twitter threads
  • Instagram reels and TikTok captions (transcribed)
  • Reddit discussions
  • Documentation with code blocks intact
  • Dynamic SPAs that traditional clippers choke on

One click. Clean markdown out the other side. Drop it in the folder.

The Other Missing Piece: The MCP Setup

Karpathy’s pattern also assumes you can configure an MCP server. For Claude Code users, it’s as simple as a cd into the vault folder. For everyone on Claude Desktop, it means editing a JSON config file and restarting the app, getting the path exactly right, and remembering to redo it whenever the folder moves.
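For a sense of what that manual step involves, a Claude Desktop config stanza for a filesystem MCP server looks roughly like this. The server name, package, and path below are illustrative (this example uses the reference filesystem server; Save Vault ships its own server instead):

```json
{
  "mcpServers": {
    "save-vault": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/you/Documents/Save Vault"
      ]
    }
  }
}
```

Harmless for a developer; a dead end for everyone else.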

Save Vault collapses both missing pieces into one app:

  • The Save extension feeds clean markdown into Save Vault automatically
  • Save Vault writes to ~/Documents/Save Vault/ organized into knowledge bases (subfolders)
  • A built-in MCP server exposes list_knowledge_bases, list_files, read_file, and search to Claude
  • The “Connect to Claude” toggle in the menu bar wires the MCP server into Claude Desktop and Claude Code, no JSON editing

The result is the Karpathy pattern with the rough edges sanded off. Save a page → it lands in your vault → Claude can answer questions about it. No vector database, no chunking, no embeddings.

What This Looks Like in Practice

Imagine you’re researching a competitor.

Day 1. You save their pricing page, three blog posts, and a Hacker News thread about their seed round. Five files in your Competitors KB.

Day 5. You ask Claude: “What pricing changes has this company made in the last year, and how have customers reacted?” Claude searches your Competitors KB, reads the relevant files, quotes the pricing page, surfaces the HN thread sentiment, and answers — all sourced.

Day 30. You have 40 files across Competitors, Customers, and AI Research. You ask Claude to compile each KB into a wiki. It writes concept pages, links them, flags contradictions. You now have three living wikis you can query like search engines, but better — because they only contain what you curated.

Day 90. Your wikis are bigger than any analyst report you’d buy, more current than any consultant deck, and entirely yours. Every claim is sourced to a file you saved.

This is what a personal knowledge base actually feels like once the friction is gone. RAG was supposed to deliver this and didn’t. The Karpathy pattern does — once the ingestion and MCP pieces are wired together for you.

Try It

  1. Install the Save Chrome extension
  2. Install Save Vault from savemarkdown.co
  3. Toggle Connect to Claude in the menu bar
  4. Save 10 things you’ve been meaning to read
  5. Open Claude and ask a question that stitches them together

That’s the post-RAG workflow. It’s already replacing vector databases for personal knowledge. The only thing left is to start building yours.


Save Vault is free. The Save extension is free for 3 saves a month, $3.99/mo unlimited. savemarkdown.co.