Self-evolving knowledge base for Claude Code — fork of coleam00/claude-memory-compiler, hardened for production use.

Find a file

agent-admin 39ab2a8b6f fork: MIT LICENSE + foundation patches (atomicity, locking, safety) This is the initial fork commit for agent-admin/memoria, a production hardening of coleam00/claude-memory-compiler. It addresses all four P0 findings from the bug audit (atomic state writes, file locking on daily log appends, subprocess detachment, path-traversal guard) plus several P1s (aliased wikilinks, timezone wiring, staleness-based compile trigger, SDK retry with backoff, file-handle context manager). File-level changes: - LICENSE — MIT (fork is self-declared FOSS; upstream has no LICENSE file but author has stated FOSS intent). - pyproject.toml — renamed project to `memoria`, removed unused python-dotenv dependency, added optional `test` dep group. - scripts/fs_utils.py — NEW module containing the primitives that the other patches rely on: * atomic_write_text(path, content): tmp + fsync + os.replace; interrupted writes leave the target unchanged. * locked_append_text(path, content): fcntl.flock (POSIX) / msvcrt.locking (Windows) exclusive lock around the write so concurrent callers never interleave. * extract_wikilinks / parse_wikilink: strip [[target\|display]] aliases correctly (fixes upstream issues #7 and #8). * safe_article_path(link, base): resolves a wikilink slug inside a base dir or returns None (path traversal guard). * load_json_with_recovery(path, default): on corruption, moves the bad file aside with a timestamped .bak-YYYYMMDDTHHMMSSZ suffix, logs a warning, returns the default. Replaces the silent `{}` return that would otherwise cause full-recompile. - scripts/utils.py — save_state/load_state now use atomic writes and corruption recovery; wiki_article_exists + count_inbound_links now alias-aware via fs_utils helpers. - scripts/config.py — TIMEZONE is now wired via zoneinfo.ZoneInfo and used by now_iso/today_iso (previously defined but ignored). Overridable via MEMORIA_TZ env var. Unknown zones log a warning and fall back to system local time rather than crashing. - scripts/flush.py — * save_flush_state / load_flush_state use atomic + recovery. * append_to_daily_log uses locked_append_text; concurrent flush + pre-compact calls can no longer interleave log entries. * run_flush retries SDK failures up to MAX_SDK_ATTEMPTS=3 with exponential backoff (2s, 4s) before returning FLUSH_ERROR. * On FLUSH_ERROR, main() preserves the context file and does NOT update dedup state — the next flush retries cleanly instead of the failure being silently swallowed. * Explicit model="haiku" for flush (short summarization task). * maybe_trigger_compilation replaced: 6 PM wall-clock gate is gone; trigger is now staleness-based (hash changed AND COMPILE_INTERVAL_MIN elapsed since last compile). Configurable via MEMORIA_COMPILE_INTERVAL_MIN. Uses _now_local() from config so the clock respects the configured timezone. * compile.log handle uses a `with open()` context manager so the fd is always cleaned up, even if Popen throws. - hooks/session-end.py, hooks/pre-compact.py — subprocess.Popen now passes start_new_session=True on POSIX, detaching flush.py from the hook's process group so it survives post-hook SIGHUP. Fixes the intermittent-data-loss failure mode where flush subprocess was killed mid-LLM-call. Tests (formal acceptance suite still to come in this phase): each helper verified via unit exercise in scratch directories — atomic roundtrip, corruption recovery with .bak creation, alias parsing, path-traversal rejection. Upstream issue mapping: #3/#5/#9 addressed by the next commit (compile.py + query.py scaling fix). #7/#8 addressed here via alias-aware helpers. License (#11) resolved via MIT LICENSE.		2026-04-24 17:44:07 -04:00
.claude	Claude Code Memory Compiler	2026-04-06 09:26:30 -05:00
hooks	fork: MIT LICENSE + foundation patches (atomicity, locking, safety)	2026-04-24 17:44:07 -04:00
scripts	fork: MIT LICENSE + foundation patches (atomicity, locking, safety)	2026-04-24 17:44:07 -04:00
.gitignore	URL change for repo in README	2026-04-06 14:46:55 -05:00
AGENTS.md	Claude Code Memory Compiler	2026-04-06 09:26:30 -05:00
LICENSE	fork: MIT LICENSE + foundation patches (atomicity, locking, safety)	2026-04-24 17:44:07 -04:00
pyproject.toml	fork: MIT LICENSE + foundation patches (atomicity, locking, safety)	2026-04-24 17:44:07 -04:00
README.md	URL change for repo in README	2026-04-06 14:46:55 -05:00
uv.lock	Claude Code Memory Compiler	2026-04-06 09:26:30 -05:00

README.md

LLM Personal Knowledge Base

Your AI conversations compile themselves into a searchable knowledge base.

Adapted from Karpathy's LLM Knowledge Base architecture, but instead of clipping web articles, the raw data is your own conversations with Claude Code. When a session ends (or auto-compacts mid-session), Claude Code hooks capture the conversation transcript and spawn a background process that uses the Claude Agent SDK to extract the important stuff - decisions, lessons learned, patterns, gotchas - and appends it to a daily log. You then compile those daily logs into structured, cross-referenced knowledge articles organized by concept. Retrieval uses a simple index file instead of RAG - no vector database, no embeddings, just markdown.

Anthropic has clarified that personal use of the Claude Agent SDK is covered under your existing Claude subscription (Max, Team, or Enterprise) - no separate API credits needed. Unlike OpenClaw, which requires API billing for its memory flush, this runs on your subscription.

Quick Start

Tell your AI coding agent:

"Clone https://github.com/coleam00/claude-memory-compiler into this project. Set up the Claude Code hooks so my conversations automatically get captured into daily logs, compiled into a knowledge base, and injected back into future sessions. Read the AGENTS.md for the full technical reference on how everything works."

The agent will:

Clone the repo and run uv sync to install dependencies
Copy .claude/settings.json into your project (or merge the hooks into your existing settings)
The hooks activate automatically next time you open Claude Code

From there, your conversations start accumulating. After 6 PM local time, the next session flush automatically triggers compilation of that day's logs into knowledge articles. You can also run uv run python scripts/compile.py manually at any time.

How It Works

Conversation -> SessionEnd/PreCompact hooks -> flush.py extracts knowledge
    -> daily/YYYY-MM-DD.md -> compile.py -> knowledge/concepts/, connections/, qa/
        -> SessionStart hook injects index into next session -> cycle repeats

Hooks capture conversations automatically (session end + pre-compaction safety net)
flush.py calls the Claude Agent SDK to decide what's worth saving, and after 6 PM triggers end-of-day compilation automatically
compile.py turns daily logs into organized concept articles with cross-references (triggered automatically or run manually)
query.py answers questions using index-guided retrieval (no RAG needed at personal scale)
lint.py runs 7 health checks (broken links, orphans, contradictions, staleness)

Key Commands

uv run python scripts/compile.py                    # compile new daily logs
uv run python scripts/query.py "question"            # ask the knowledge base
uv run python scripts/query.py "question" --file-back # ask + save answer back
uv run python scripts/lint.py                        # run health checks
uv run python scripts/lint.py --structural-only      # free structural checks only

Why No RAG?

Karpathy's insight: at personal scale (50-500 articles), the LLM reading a structured index.md outperforms vector similarity. The LLM understands what you're really asking; cosine similarity just finds similar words. RAG becomes necessary at ~2,000+ articles when the index exceeds the context window.

Technical Reference

See AGENTS.md for the complete technical reference: article formats, hook architecture, script internals, cross-platform details, costs, and customization options. AGENTS.md is designed to give an AI agent everything it needs to understand, modify, or rebuild the system.