Self-evolving knowledge base for Claude Code — fork of coleam00/claude-memory-compiler, hardened for production use.
Find a file
agent-admin 03296be47a fork: scaling fixes (index-only context + chunking + model wiring)
Fixes upstream issues #3/#5/#9 (whole-wiki in every prompt) and adds
large-log chunking. Addresses the audit's P1 scaling findings (C1),
the chunking requirement operator added on top, C8 explicit model
wiring across all LLM call sites, and D3 single-event-loop refactor.

## compile.py

- **Index-only context.** The `existing_articles_context` concatenation
  of every wiki article has been removed from the prompt. Instead the
  LLM receives only the index + schema + daily log and uses the Read
  tool (already in allowed_tools) to fetch specific articles it decides
  are relevant. Prompt size stays bounded regardless of KB growth —
  upstream's 250K-token prompts past ~100 articles are gone.

- **Chunking.** `_split_log_into_chunks()` splits oversized daily logs
  along `### ` section boundaries. Threshold MAX_LOG_CHARS_PER_CHUNK
  (default 100K chars ≈ 25K tokens, configurable via
  MEMORIA_MAX_LOG_CHARS). Chunks compile via separate LLM calls that
  naturally merge through Edit on shared files. Oversized single
  sections emit as their own chunks rather than splitting mid-thought.

- **Atomic state on chunked compile.** State is only written after
  ALL chunks succeed — partial-failure leaves the log flagged as
  uncompiled in state.json so the next run retries it cleanly. Was
  already correct for single-chunk logs (early return on SDK error)
  and now correct for multi-chunk too.

- **Explicit model.** `model=COMPILE_MODEL` passed to
  ClaudeAgentOptions. Default "sonnet"; override via
  MEMORIA_COMPILE_MODEL env var.

- **D3: single asyncio.run.** The per-file `asyncio.run()` in the
  compile loop is replaced with one outer call wrapping `_compile_all`.
  Avoids repeated event-loop setup/teardown and matches the pattern
  used for async resources in the SDK.

## query.py

- **Index-only context.** `read_all_wiki_content()` replaced with
  `read_wiki_index()`. The LLM reads the index and uses its Read tool
  to fetch specific articles. Same rationale as compile.py — keeps
  prompt size bounded and cost predictable.

- **Explicit model.** `model=QUERY_MODEL`, default "sonnet", override
  via MEMORIA_QUERY_MODEL.

## lint.py

- **C9: skip qa/sources in missing-backlink check.** Articles under
  qa/ or sources/ no longer trigger a suggestion that every referenced
  concept should backlink to them. Concepts aren't expected to link
  back to every Q&A that mentions them — doing so would drown real
  relationships.

- **Alias-aware backlink detection.** Uses `extract_wikilinks()` to
  parse the target's link list so `[[concepts/foo|Display]]` forms
  count as valid backlinks (previously required exact `[[foo]]` match,
  causing false positives on aliased forms).

- **Explicit model.** `model=LINT_MODEL` in check_contradictions call,
  default "sonnet", override via MEMORIA_LINT_MODEL.

## Verified

- Chunking: 120K-char 3-section log splits into 80K + 40K, reconstructs
  byte-exact. Oversized single section (150K) emits as its own chunk.
  Small log (<100K) returns as single chunk.
- All patched modules import cleanly with expected config values.
- compile_daily_log / query.run_query / flush.maybe_trigger_compilation
  / lint.check_missing_backlinks all callable post-patch.
2026-04-24 17:48:48 -04:00
.claude Claude Code Memory Compiler 2026-04-06 09:26:30 -05:00
hooks fork: MIT LICENSE + foundation patches (atomicity, locking, safety) 2026-04-24 17:44:07 -04:00
scripts fork: scaling fixes (index-only context + chunking + model wiring) 2026-04-24 17:48:48 -04:00
.gitignore URL change for repo in README 2026-04-06 14:46:55 -05:00
AGENTS.md Claude Code Memory Compiler 2026-04-06 09:26:30 -05:00
LICENSE fork: MIT LICENSE + foundation patches (atomicity, locking, safety) 2026-04-24 17:44:07 -04:00
pyproject.toml fork: MIT LICENSE + foundation patches (atomicity, locking, safety) 2026-04-24 17:44:07 -04:00
README.md URL change for repo in README 2026-04-06 14:46:55 -05:00
uv.lock Claude Code Memory Compiler 2026-04-06 09:26:30 -05:00

LLM Personal Knowledge Base

Your AI conversations compile themselves into a searchable knowledge base.

Adapted from Karpathy's LLM Knowledge Base architecture, but instead of clipping web articles, the raw data is your own conversations with Claude Code. When a session ends (or auto-compacts mid-session), Claude Code hooks capture the conversation transcript and spawn a background process that uses the Claude Agent SDK to extract the important stuff - decisions, lessons learned, patterns, gotchas - and appends it to a daily log. You then compile those daily logs into structured, cross-referenced knowledge articles organized by concept. Retrieval uses a simple index file instead of RAG - no vector database, no embeddings, just markdown.

Anthropic has clarified that personal use of the Claude Agent SDK is covered under your existing Claude subscription (Max, Team, or Enterprise) - no separate API credits needed. Unlike OpenClaw, which requires API billing for its memory flush, this runs on your subscription.

Quick Start

Tell your AI coding agent:

"Clone https://github.com/coleam00/claude-memory-compiler into this project. Set up the Claude Code hooks so my conversations automatically get captured into daily logs, compiled into a knowledge base, and injected back into future sessions. Read the AGENTS.md for the full technical reference on how everything works."

The agent will:

  1. Clone the repo and run uv sync to install dependencies
  2. Copy .claude/settings.json into your project (or merge the hooks into your existing settings)
  3. The hooks activate automatically next time you open Claude Code

From there, your conversations start accumulating. After 6 PM local time, the next session flush automatically triggers compilation of that day's logs into knowledge articles. You can also run uv run python scripts/compile.py manually at any time.

How It Works

Conversation -> SessionEnd/PreCompact hooks -> flush.py extracts knowledge
    -> daily/YYYY-MM-DD.md -> compile.py -> knowledge/concepts/, connections/, qa/
        -> SessionStart hook injects index into next session -> cycle repeats
  • Hooks capture conversations automatically (session end + pre-compaction safety net)
  • flush.py calls the Claude Agent SDK to decide what's worth saving, and after 6 PM triggers end-of-day compilation automatically
  • compile.py turns daily logs into organized concept articles with cross-references (triggered automatically or run manually)
  • query.py answers questions using index-guided retrieval (no RAG needed at personal scale)
  • lint.py runs 7 health checks (broken links, orphans, contradictions, staleness)

Key Commands

uv run python scripts/compile.py                    # compile new daily logs
uv run python scripts/query.py "question"            # ask the knowledge base
uv run python scripts/query.py "question" --file-back # ask + save answer back
uv run python scripts/lint.py                        # run health checks
uv run python scripts/lint.py --structural-only      # free structural checks only

Why No RAG?

Karpathy's insight: at personal scale (50-500 articles), the LLM reading a structured index.md outperforms vector similarity. The LLM understands what you're really asking; cosine similarity just finds similar words. RAG becomes necessary at ~2,000+ articles when the index exceeds the context window.

Technical Reference

See AGENTS.md for the complete technical reference: article formats, hook architecture, script internals, cross-platform details, costs, and customization options. AGENTS.md is designed to give an AI agent everything it needs to understand, modify, or rebuild the system.