memoria/FORK.md
agent-admin 347d191935 fork: tests (29 green) + fork README + pytest config
Acceptance test suite under tests/ covers 8 of the 10 audit-defined
assertions directly (the 2 that require integration-level fixtures —
flush-subprocess-survives-hook-exit and whole-wiki-not-in-prompt
token-count — are documented as manual-test checks rather than
automated).

tests/test_fs_utils.py — 17 tests
  * Atomic write: roundtrip, overwrite, original-preserved-on-exception,
    parent-dir-creation.
  * Locked append: 4 concurrent workers × 25 entries each, asserts every
    entry appears exactly once and its body lines are contiguous. This
    is the acceptance criterion for "two concurrent flushes don't
    interleave writes."
  * JSON recovery: clean roundtrip, missing-file default, corruption
    produces timestamped .bak and returns default.
  * Wikilink parsing: bare / aliased / mixed; parse_wikilink strip.
  * Path safety: clean / traversal / absolute / empty / null-byte /
    aliased-but-safe.

tests/test_compile_chunking.py — 8 tests
  * Chunking: small log passthrough, byte-exact reconstruction,
    boundary respect, oversized-single-section, mixed-size packing.
  * State-on-failure: single-chunk SDK error does NOT update state;
    multi-chunk partial failure does NOT update state; all-chunks
    succeed DOES update state with hash + cost.

tests/test_lint_backlinks.py — 4 tests
  * Aliased wikilinks aren't flagged as broken links.
  * Aliased backlinks count as valid inbound references (the C9 fix).
  * QA articles referencing concepts don't trigger backlink suggestions.
  * Concept-to-concept asymmetry IS still reported (C9 scope is narrow).

FORK.md — fork-specific docs:
  * Summary of delta vs upstream (data-integrity, scaling, correctness,
    safety, configurability, hygiene categories)
  * Full env-var reference
  * Test invocation + coverage summary
  * Upstream sync guidance (cherry-pick, don't blind-pull)

Result: 29 passed in 0.07s. All patches in this fork verified via
automated test before any production use.
2026-04-24 17:54:00 -04:00


# Memoria — production fork of claude-memory-compiler
This repository is a hardened fork of [coleam00/claude-memory-compiler](https://github.com/coleam00/claude-memory-compiler).
Upstream is a fresh (2026-04-06) proof-of-concept; this fork adds the
patches needed to run it as the backing memory system for a production
Claude Code deployment.
Upstream's README still applies for the core architecture and workflow
(daily logs → LLM compiler → knowledge articles + index → SessionStart
injection). What's below is the delta.

---
## What this fork changes
### Data-integrity hardening
- **Atomic state writes.** `state.json` and `last-flush.json` are written
to a temporary file, fsynced, and then moved into place with `os.replace`.
A crash mid-write leaves the target unchanged instead of truncating it to
partial or empty JSON.
- **Corruption recovery.** On `json.JSONDecodeError`, the corrupt file is
moved aside to `<name>.bak-YYYYMMDDTHHMMSSZ`, a warning is logged, and
a default is returned. This prevents a corrupt state file from silently
triggering a full recompile.
- **File-locked appends.** Daily-log writes go through
`fcntl.flock`-guarded append. Concurrent flush and pre-compact calls
serialize through the lock; well-formed entries never interleave.
- **SDK retry with backoff.** `run_flush()` retries up to 3 times on SDK
exceptions (2s, 4s delays). On final failure the context file is NOT
deleted and dedup state is NOT updated — the next flush retries cleanly
instead of swallowing the loss.
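
The atomic-write and corruption-recovery behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the fork's actual code: the function names `atomic_write_json` and `load_json_or_default` and their signatures are hypothetical.

```python
import json
import logging
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write JSON to a temp file beside the target, fsync it, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # The target has not been touched; just remove the temp file.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def load_json_or_default(path: Path, default: dict) -> dict:
    """Return parsed JSON; on corruption, move the file aside and return a default."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return dict(default)
    except json.JSONDecodeError:
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        backup = path.with_name(f"{path.name}.bak-{stamp}")
        os.replace(path, backup)
        logging.warning("Corrupt JSON in %s; moved to %s", path, backup)
        return dict(default)
```

Creating the temp file in the target's own directory keeps the final rename on one filesystem, which is what makes the replace step atomic.
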
### Subprocess detachment
- `session-end.py` and `pre-compact.py` pass `start_new_session=True` to
`subprocess.Popen` on POSIX. `flush.py` then runs in its own session and
process group, so it survives the signals Claude Code sends after the
hook exits. Upstream omits this, causing intermittent silent data loss
when the flush subprocess is killed mid-LLM-call.
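
A rough sketch of the detachment pattern, assuming a hypothetical helper named `spawn_detached`. Redirecting the standard streams to `DEVNULL` is an assumption of this sketch, not something the fork is stated to do.

```python
import subprocess
import sys


def spawn_detached(args: list[str]) -> subprocess.Popen:
    """Start a child in its own session so signals sent to the hook's group miss it."""
    kwargs = {}
    if sys.platform != "win32":
        # On POSIX the child calls setsid() before exec, becoming a session leader.
        kwargs["start_new_session"] = True
    return subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,   # assumption: the child needs no inherited streams
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        **kwargs,
    )
```

A hook could call `spawn_detached([sys.executable, "flush.py"])` and return immediately; the child's session ID then equals its own PID rather than the hook's.
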
### Scaling / prompt size
- **Index-only context.** `compile.py` and `query.py` no longer inline
every existing wiki article into the LLM prompt. The compiler receives
the index and uses the `Read` tool to fetch specific articles. Fixes
upstream issues #3/#5/#9 (prompt-size / cost explosion past ~50
articles).
- **Daily-log chunking.** `compile.py` splits oversized daily logs along
`### ` section boundaries before invoking the LLM. Threshold
`MAX_LOG_CHARS_PER_CHUNK` (default 100_000; override via
`MEMORIA_MAX_LOG_CHARS`). Partial failure keeps the log uncompiled so
the next run retries.
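
The chunking rule can be illustrated with a short sketch. The function name `chunk_log` and the greedy packing strategy are assumptions; only the `### ` boundary and the 100_000 default come from the text above.

```python
import re

MAX_LOG_CHARS_PER_CHUNK = 100_000  # default named in the text above


def chunk_log(text: str, max_chars: int = MAX_LOG_CHARS_PER_CHUNK) -> list[str]:
    """Split a daily log into chunks that break only at '### ' section headings."""
    if len(text) <= max_chars:
        return [text]
    # Zero-width split before each line that starts with '### ': no characters are lost.
    sections = [s for s in re.split(r"(?m)^(?=### )", text) if s]
    chunks: list[str] = []
    current = ""
    for section in sections:
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        # A single section larger than max_chars is kept whole as its own chunk.
        current += section
    if current:
        chunks.append(current)
    return chunks
```

Because the split is zero-width, `"".join(chunk_log(text)) == text` holds for any input, which is the property a byte-exact reconstruction test would check.
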
### Correctness
- **Aliased wikilinks.** `extract_wikilinks()` and `count_inbound_links()`
strip `|display` suffixes. Lint's broken-link, orphan, and
missing-backlink checks no longer produce false positives on aliased
forms (fixes upstream issues #7/#8).
- **QA/sources excluded from missing-backlink check.** Q&A articles
reference concepts without requiring reciprocal links — previously
every Q&A that cited a concept would trigger a spurious suggestion.
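
Alias stripping can be sketched as below. The function names match those mentioned in this document, but the bodies, signatures, and regular expression are illustrative assumptions rather than the fork's code.

```python
import re

# Matches [[target]] and [[target|display]]; nested brackets are not handled.
WIKILINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


def parse_wikilink(raw: str) -> str:
    """Return the link target, dropping any '|display' alias and surrounding whitespace."""
    return raw.split("|", 1)[0].strip()


def extract_wikilinks(text: str) -> list[str]:
    """Return link targets in order of appearance, with aliases stripped."""
    return [parse_wikilink(inner) for inner in WIKILINK_RE.findall(text)]
```

With this normalization, `[[concepts/bar|Bar]]` and `[[concepts/bar]]` resolve to the same target, so a lint pass comparing targets against existing articles would treat them identically.
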
### Safety
- **Path-traversal guard.** `safe_article_path()` resolves a wikilink
slug inside `KNOWLEDGE_DIR` or returns `None`. `wiki_article_exists()`
uses this guard; LLM-authored slugs like `../../etc/passwd` cannot
escape the knowledge tree.
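
One way such a guard might look, as a sketch only. The `.md` suffix, the `root` parameter, the alias stripping, and the value of `KNOWLEDGE_DIR` are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

KNOWLEDGE_DIR = Path("knowledge")  # assumed location, for illustration only


def safe_article_path(slug: str, root: Path = KNOWLEDGE_DIR) -> Optional[Path]:
    """Resolve a wikilink slug to a path inside root, or return None if it escapes."""
    slug = slug.split("|", 1)[0].strip()  # tolerate aliased forms
    if not slug or "\x00" in slug:
        return None
    if Path(slug).is_absolute():
        return None
    root_resolved = root.resolve()
    # resolve() collapses '..' segments so the containment check sees the real target.
    candidate = (root_resolved / f"{slug}.md").resolve()
    if not candidate.is_relative_to(root_resolved):
        return None
    return candidate
```

Checking containment after resolution, rather than scanning the slug for `..`, also covers symlinks and mixed forms such as `concepts/../../x`.
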
### Configurability
- **Timezone.** `TIMEZONE` (default `America/Chicago`) is now wired
through `zoneinfo.ZoneInfo` and used by `now_iso()` / `today_iso()` /
`maybe_trigger_compilation()`. Override via `MEMORIA_TZ`. Unknown zones
log a warning and fall back to system local time.
- **Compile trigger.** The upstream 6 PM hardcoded gate is replaced with
a staleness-based trigger: compile fires if the daily log changed AND
`MEMORIA_COMPILE_INTERVAL_MIN` minutes (default 60) have elapsed since
the last compile of that log. No more "wrote a log at 5:59 PM, never
auto-compiles."
- **Model routing.** Per-call-site model env vars:
  - `MEMORIA_COMPILE_MODEL` (default `sonnet`)
  - `MEMORIA_QUERY_MODEL` (default `sonnet`)
  - `MEMORIA_LINT_MODEL` (default `sonnet`)
  - Flush uses Haiku unconditionally (short summarization).
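
The staleness-based trigger reduces to a small predicate. In this sketch the function name `should_compile`, its parameters, and the use of a content hash to detect a changed log are all assumptions.

```python
import os
from datetime import datetime, timedelta
from typing import Optional


def should_compile(
    log_hash: str,
    last_compiled_hash: Optional[str],
    last_compiled_at: Optional[datetime],
    now: datetime,
    interval_min: Optional[int] = None,
) -> bool:
    """Fire only if the log changed AND enough time has passed since its last compile."""
    if interval_min is None:
        interval_min = int(os.environ.get("MEMORIA_COMPILE_INTERVAL_MIN", "60"))
    if log_hash == last_compiled_hash:
        return False  # nothing new to compile
    if last_compiled_at is None:
        return True  # this log has never been compiled
    return now - last_compiled_at >= timedelta(minutes=interval_min)
```

Unlike a fixed time-of-day gate, this check gives the same answer whenever it runs, so a log written late in the day is picked up by the next hook that fires after the interval has elapsed.
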
### Hygiene
- File-handle context manager in `maybe_trigger_compilation()` so the
`compile.log` handle is always closed, even if `Popen` fails.
- Single `asyncio.run()` wrapping the compile loop (not per-file) to
avoid event-loop churn.
- `python-dotenv` removed from direct dependencies (was unused).
- MIT LICENSE added (upstream has none).
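
The single-`asyncio.run()` structure might look like the sketch below. `compile_one` is a hypothetical stand-in for the per-file SDK call and `compile_logs` is an illustrative entry point; neither name comes from the fork.

```python
import asyncio


async def compile_one(name: str) -> str:
    """Hypothetical stand-in for the per-file LLM compile call."""
    await asyncio.sleep(0)
    return f"compiled {name}"


async def compile_all(names: list[str]) -> list[str]:
    """Process every log inside one event loop, sequentially."""
    results = []
    for name in names:
        results.append(await compile_one(name))
    return results


def compile_logs(names: list[str]) -> list[str]:
    # One asyncio.run() for the whole batch: the loop is created and torn down once,
    # rather than once per file.
    return asyncio.run(compile_all(names))
```
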
---
## Environment variables
| Var | Default | Purpose |
|-----|---------|---------|
| `MEMORIA_TZ` | `America/Chicago` | Timezone for date/time operations |
| `MEMORIA_COMPILE_INTERVAL_MIN` | `60` | Minutes between auto-compile triggers |
| `MEMORIA_MAX_LOG_CHARS` | `100000` | Daily-log chunk threshold |
| `MEMORIA_COMPILE_MODEL` | `sonnet` | Model for `compile.py` |
| `MEMORIA_QUERY_MODEL` | `sonnet` | Model for `query.py` |
| `MEMORIA_LINT_MODEL` | `sonnet` | Model for `lint.py` contradiction check |
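
Reading these variables with their documented defaults could be sketched as follows. `MemoriaConfig`, `load_config`, and `resolve_timezone` are hypothetical names, and the signature shown for `now_iso` is an assumption; only the variable names, defaults, and the warn-and-fall-back behavior come from this document.

```python
import logging
import os
from dataclasses import dataclass
from datetime import datetime, tzinfo
from typing import Mapping, Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


@dataclass(frozen=True)
class MemoriaConfig:  # hypothetical container for the values in the table
    tz_name: str
    compile_interval_min: int
    max_log_chars: int
    compile_model: str
    query_model: str
    lint_model: str


def load_config(env: Mapping[str, str] = os.environ) -> MemoriaConfig:
    """Read each variable from the table, falling back to its documented default."""
    return MemoriaConfig(
        tz_name=env.get("MEMORIA_TZ", "America/Chicago"),
        compile_interval_min=int(env.get("MEMORIA_COMPILE_INTERVAL_MIN", "60")),
        max_log_chars=int(env.get("MEMORIA_MAX_LOG_CHARS", "100000")),
        compile_model=env.get("MEMORIA_COMPILE_MODEL", "sonnet"),
        query_model=env.get("MEMORIA_QUERY_MODEL", "sonnet"),
        lint_model=env.get("MEMORIA_LINT_MODEL", "sonnet"),
    )


def resolve_timezone(tz_name: str) -> Optional[tzinfo]:
    """Return the named zone, or None (meaning system local time) if it is unknown."""
    try:
        return ZoneInfo(tz_name)
    except (ZoneInfoNotFoundError, ValueError, OSError):
        logging.warning("Unknown timezone %r; falling back to system local time", tz_name)
        return None


def now_iso(tz: Optional[tzinfo]) -> str:
    """Timezone-aware ISO 8601 timestamp; astimezone() with no argument uses local time."""
    now = datetime.now(tz) if tz is not None else datetime.now().astimezone()
    return now.isoformat(timespec="seconds")
```
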
---
## Tests
```
uv sync --extra test
uv run pytest tests/ -v
```
The test suite covers:
- Atomic write behavior (including exception-path recovery)
- Concurrent locked append with 4 workers × 25 entries each
- JSON corruption recovery with `.bak` backup
- Wikilink parsing (bare + aliased forms)
- Path-traversal rejection (relative, absolute, null-byte, empty)
- Daily-log chunking (small, oversized, mixed-sizes, boundary-respect)
- `compile.py` state-on-failure (single-chunk failure and partial-chunk failure)
- Lint backlink rules (aliased forms, QA/sources exclusions, concept-to-concept asymmetry still reported)
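
To make the concurrent-append criterion concrete, here is a rough sketch of a locked append plus a contiguity check. It is POSIX-only (`fcntl`), uses threads rather than separate processes for brevity, and every name in it (`locked_append`, `run_workers`, `entries_are_contiguous`) is hypothetical.

```python
import fcntl
import os
import threading
from pathlib import Path


def locked_append(path: Path, entry: str) -> None:
    """Append one entry while holding an exclusive flock on the file."""
    with open(path, "a", encoding="utf-8") as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
        try:
            fh.write(entry)
            fh.flush()
            os.fsync(fh.fileno())
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)


def run_workers(path: Path, workers: int = 4, entries: int = 25) -> None:
    """Have several workers append three-line entries to the same file concurrently."""
    def work(w: int) -> None:
        for i in range(entries):
            tag = f"w{w}-e{i}"
            locked_append(path, f"### {tag}\n{tag} body 1\n{tag} body 2\n")

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def entries_are_contiguous(path: Path, expected: int) -> bool:
    """True if every entry appears exactly once with its body lines directly after it."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if len(lines) != expected * 3:
        return False
    seen = set()
    for i in range(0, len(lines), 3):
        header = lines[i]
        if not header.startswith("### "):
            return False
        tag = header[4:]
        if tag in seen:
            return False
        if lines[i + 1] != f"{tag} body 1" or lines[i + 2] != f"{tag} body 2":
            return False
        seen.add(tag)
    return len(seen) == expected
```

Each call opens its own file description, so `flock` serializes the writers even when they share a process.
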
All 29 tests pass as of the `fork:` commit series.

---
## Upstream sync
The `upstream` remote tracks `coleam00/claude-memory-compiler` for
reference. **Do not blindly `git pull upstream main`** — our patches
likely conflict. Review upstream changes, cherry-pick what's relevant,
re-test.

---
## License
MIT — see [LICENSE](LICENSE). Upstream has no license file; author has
stated FOSS-by-declaration intent.