# Memoria — production fork of claude-memory-compiler

This repository is a hardened fork of [coleam00/claude-memory-compiler](https://github.com/coleam00/claude-memory-compiler).
Upstream is a fresh (2026-04-06) proof-of-concept; this fork adds the
patches needed to run it as the backing memory system for a production
Claude Code deployment.

Upstream's README still applies for the core architecture and workflow
(daily logs → LLM compiler → knowledge articles + index → SessionStart
injection). What's below is the delta.

---

## What this fork changes

### Data-integrity hardening

- **Atomic state writes.** `state.json` and `last-flush.json` are written
  tmp-then-fsync-then-`os.replace`. A crash mid-write leaves the target
  unchanged instead of truncating to a partial or empty JSON.
- **Corruption recovery.** On `json.JSONDecodeError`, the corrupt file is
  moved aside to `<name>.bak-YYYYMMDDTHHMMSSZ`, a warning is logged, and
  a default is returned. Prevents the silent full-recompile failure mode.
- **File-locked appends.** Daily-log writes go through
  `fcntl.flock`-guarded append. Concurrent flush and pre-compact calls
  serialize through the lock; well-formed entries never interleave.
- **SDK retry with backoff.** `run_flush()` makes up to 3 attempts on SDK
  exceptions, waiting 2s and then 4s between them. On final failure the
  context file is NOT deleted and dedup state is NOT updated, so the next
  flush retries cleanly instead of silently losing the context.
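
The write, recovery, and retry patterns above can be sketched as follows. This is a minimal illustration, not the fork's actual code: `atomic_write_json`, `load_json_or_default`, `call_with_retry`, and the logger name are hypothetical, and the retry helper assumes that "2s, 4s delays" means three attempts in total.

```python
import json
import logging
import os
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path

log = logging.getLogger("memoria.sketch")  # hypothetical logger name


def atomic_write_json(path: Path, data: dict) -> None:
    """Write to a temp file beside the target, fsync it, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        # The target is untouched here; only the temp file needs cleanup.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def load_json_or_default(path: Path, default: dict) -> dict:
    """Return parsed JSON; on corruption, move the file aside and return a default."""
    if not path.exists():
        return dict(default)
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        backup = path.with_name(f"{path.name}.bak-{stamp}")
        os.replace(path, backup)
        log.warning("corrupt JSON in %s; moved aside to %s", path, backup)
        return dict(default)


def call_with_retry(fn, attempts: int = 3, base_delay: float = 2.0, sleep=time.sleep):
    """Call fn(); on an exception wait 2s, then 4s, and re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # caller keeps the context file and dedup state untouched
            sleep(base_delay * 2 ** (attempt - 1))
```

Creating the temp file in the target's own directory is what keeps `os.replace` a same-filesystem rename. Re-raising on the final attempt lets the caller skip its cleanup step, which is how the context file would survive for the next flush.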

### Subprocess detachment

- `session-end.py` and `pre-compact.py` pass `start_new_session=True` to
  `subprocess.Popen` on POSIX, so `flush.py` runs in its own session and
  process group and survives the signals Claude Code sends after a hook
  returns. Upstream omits this, causing intermittent silent data loss
  when the flush subprocess is killed mid-LLM-call.
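
A minimal sketch of the detached spawn, assuming the hooks only need to fire and forget. `spawn_detached` is a hypothetical helper name, and redirecting the standard streams to `DEVNULL` is an assumption rather than something the fork is known to do.

```python
import os
import subprocess


def spawn_detached(argv: list[str]) -> subprocess.Popen:
    """Start argv without tying its lifetime to the caller's session."""
    kwargs = {
        "stdin": subprocess.DEVNULL,
        "stdout": subprocess.DEVNULL,
        "stderr": subprocess.DEVNULL,
    }
    if os.name == "posix":
        # The child calls setsid(): new session, new process group,
        # so signals aimed at the parent's group do not reach it.
        kwargs["start_new_session"] = True
    return subprocess.Popen(argv, **kwargs)
```

On POSIX the child becomes a session leader, so `os.getsid(proc.pid)` equals `proc.pid` once `Popen` returns.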

### Scaling / prompt size

- **Index-only context.** `compile.py` and `query.py` no longer inline
  every existing wiki article into the LLM prompt. The compiler receives
  the index and uses the `Read` tool to fetch specific articles. Fixes
  upstream issues #3/#5/#9 (prompt-size / cost explosion past ~50
  articles).
- **Daily-log chunking.** `compile.py` splits oversized daily logs along
  `### ` section boundaries before invoking the LLM. The threshold is
  `MAX_LOG_CHARS_PER_CHUNK` (default 100_000 characters; override via
  `MEMORIA_MAX_LOG_CHARS`). If any chunk fails, the whole log stays
  uncompiled so the next run retries it.
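
One way to chunk along `### ` boundaries is sketched below. `chunk_daily_log` is a hypothetical name, and the greedy packing strategy is an assumption; the fork's splitter may group sections differently. A single section larger than the threshold is left as one oversized chunk here rather than being cut mid-entry.

```python
import os
import re

# Same default and override variable as described above.
MAX_LOG_CHARS_PER_CHUNK = int(os.environ.get("MEMORIA_MAX_LOG_CHARS", "100000"))


def chunk_daily_log(text: str, max_chars: int = MAX_LOG_CHARS_PER_CHUNK) -> list[str]:
    """Greedily pack whole '### ' sections into chunks of at most max_chars."""
    if len(text) <= max_chars:
        return [text]
    # Zero-width split before each line starting with "### ", so every
    # heading stays attached to the section body that follows it.
    sections = [s for s in re.split(r"(?m)^(?=### )", text) if s]
    chunks: list[str] = []
    current = ""
    for section in sections:
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        current += section
    if current:
        chunks.append(current)
    return chunks
```

Because the split is zero-width, joining the chunks reproduces the original log exactly, which makes the splitter easy to test.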

### Correctness

- **Aliased wikilinks.** `extract_wikilinks()` and `count_inbound_links()`
  strip `|display` suffixes. Lint's broken-link, orphan, and
  missing-backlink checks no longer produce false positives on aliased
  forms (fixes upstream issues #7/#8).
- **QA/sources excluded from missing-backlink check.** Q&A articles
  reference concepts without requiring reciprocal links. Previously,
  every Q&A that cited a concept triggered a spurious missing-backlink
  suggestion.
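
The alias handling can be illustrated with a short sketch. `extract_link_targets` and the regular expression are hypothetical stand-ins, not the fork's `extract_wikilinks()` implementation; the example slugs are made up.

```python
import re

# Matches [[target]] and [[target|display]]; nested brackets are not supported.
WIKILINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


def extract_link_targets(text: str) -> list[str]:
    """Return link targets with any '|display' alias suffix removed."""
    targets = []
    for raw in WIKILINK_RE.findall(text):
        target = raw.split("|", 1)[0].strip()
        if target:
            targets.append(target)
    return targets
```

With this normalization, `[[concepts/foo|Foo]]` and `[[concepts/foo]]` both count as a link to `concepts/foo`, so the lint checks compare like with like.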

### Safety

- **Path-traversal guard.** `safe_article_path()` resolves a wikilink
  slug inside `KNOWLEDGE_DIR` or returns `None`. `wiki_article_exists()`
  uses this guard; LLM-authored slugs like `../../etc/passwd` cannot
  escape the knowledge tree.
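
A containment check of this kind might look like the sketch below. `safe_article_path_sketch` is a hypothetical stand-in for `safe_article_path()`; passing the knowledge directory as a parameter and appending a `.md` extension are both assumptions. `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path
from typing import Optional


def safe_article_path_sketch(slug: str, knowledge_dir: Path) -> Optional[Path]:
    """Resolve slug under knowledge_dir, or return None if it would escape."""
    if not slug or "\x00" in slug:
        return None
    root = knowledge_dir.resolve()
    # Joining an absolute slug discards root, and ".." segments are collapsed
    # by resolve(); both cases then fail the containment check below.
    candidate = (root / f"{slug}.md").resolve()
    if not candidate.is_relative_to(root):
        return None
    return candidate
```

Resolving before checking matters: a purely textual prefix test would let `../` segments and symlinks slip through.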

### Configurability

- **Timezone.** `TIMEZONE` (default `America/Chicago`) is now wired
  through `zoneinfo.ZoneInfo` and used by `now_iso()` / `today_iso()` /
  `maybe_trigger_compilation()`. Override via `MEMORIA_TZ`. Unknown zones
  log a warning and fall back to system local time.
- **Compile trigger.** The upstream 6 PM hardcoded gate is replaced with
  a staleness-based trigger: compile fires if the daily log changed AND
  `MEMORIA_COMPILE_INTERVAL_MIN` minutes (default 60) have elapsed since
  the last compile of that log. No more "wrote a log at 5:59 PM, never
  auto-compiles."
- **Model routing.** Per-call-site model env vars:
  - `MEMORIA_COMPILE_MODEL` (default `sonnet`)
  - `MEMORIA_QUERY_MODEL`   (default `sonnet`)
  - `MEMORIA_LINT_MODEL`    (default `sonnet`)
  - Flush uses Haiku unconditionally (short summarization).
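
The timezone fallback and the staleness gate can be sketched as follows. `resolve_timezone`, `now_in`, and `should_compile` are hypothetical names, and treating "elapsed" as "greater than or equal to the interval" is an assumption about the boundary case.

```python
import logging
from datetime import datetime, timedelta, tzinfo
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

log = logging.getLogger("memoria.sketch")  # hypothetical logger name


def resolve_timezone(name: str) -> Optional[tzinfo]:
    """Return the named zone, or None (meaning system local time) if unknown."""
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        log.warning("unknown timezone %r; falling back to system local time", name)
        return None


def now_in(tz: Optional[tzinfo]) -> datetime:
    """Timezone-aware 'now' in tz, or in system local time when tz is None."""
    return datetime.now(tz) if tz is not None else datetime.now().astimezone()


def should_compile(
    log_changed: bool,
    last_compiled: Optional[datetime],
    now: datetime,
    interval_min: int = 60,
) -> bool:
    """Staleness gate: the log changed AND the interval has elapsed."""
    if not log_changed:
        return False
    if last_compiled is None:
        return True  # never compiled, so nothing to wait for
    return now - last_compiled >= timedelta(minutes=interval_min)
```

Unlike a fixed time-of-day gate, this check depends only on elapsed time, so a log written at 5:59 PM becomes eligible as soon as the interval has passed.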

### Hygiene

- `maybe_trigger_compilation()` opens `compile.log` in a context manager,
  so the handle is always closed, even if `Popen` fails.
- Single `asyncio.run()` wrapping the compile loop (not per-file) to
  avoid event-loop churn.
- `python-dotenv` removed from direct dependencies (was unused).
- MIT LICENSE added (upstream has none).

---

## Environment variables

| Var | Default | Purpose |
|-----|---------|---------|
| `MEMORIA_TZ` | `America/Chicago` | Timezone for date/time operations |
| `MEMORIA_COMPILE_INTERVAL_MIN` | `60` | Minimum minutes since a log's last compile before it auto-compiles again |
| `MEMORIA_MAX_LOG_CHARS` | `100000` | Daily-log chunk threshold, in characters |
| `MEMORIA_COMPILE_MODEL` | `sonnet` | Model for `compile.py` |
| `MEMORIA_QUERY_MODEL` | `sonnet` | Model for `query.py` |
| `MEMORIA_LINT_MODEL` | `sonnet` | Model for `lint.py` contradiction check |

---

## Tests

```
uv sync --extra test
uv run pytest tests/ -v
```

The test suite covers:
- Atomic write behavior (including exception-path recovery)
- Concurrent locked append with 4 workers × 25 entries each
- JSON corruption recovery with `.bak` backup
- Wikilink parsing (bare + aliased forms)
- Path-traversal rejection (relative, absolute, null-byte, empty)
- Daily-log chunking (small, oversized, mixed-sizes, boundary-respect)
- `compile.py` state-on-failure (single-chunk failure and partial-chunk failure)
- Lint backlink rules (aliased forms, QA/sources exclusions, concept-to-concept symmetry)
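
As an illustration of the concurrent-append case, the sketch below pairs a `fcntl.flock`-guarded append with a 4 × 25 worker run. It is POSIX-only, `locked_append` and `run_workers` are hypothetical names, and using threads rather than separate processes is an assumption made for brevity. Each `open()` creates its own open file description, so the locks still contend between threads.

```python
import fcntl
import threading
from pathlib import Path


def locked_append(path: Path, entry: str) -> None:
    """Append entry while holding an exclusive advisory lock on the file."""
    with open(path, "a", encoding="utf-8") as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
        try:
            fh.write(entry)
            fh.flush()  # push the bytes out before the lock is released
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)


def run_workers(path: Path, workers: int = 4, per_worker: int = 25) -> None:
    """Have several workers append multi-line entries to the same file."""

    def work(worker_id: int) -> None:
        for i in range(per_worker):
            locked_append(path, f"### worker {worker_id} entry {i}\nbody line\n\n")

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

A test can then split the file on blank lines and assert that all 100 entries are present and that each one is intact.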

All 29 tests pass as of the `fork:` commit series.

---

## Upstream sync

The `upstream` remote tracks `coleam00/claude-memory-compiler` for
reference. **Do not blindly `git pull upstream main`**: our patches are
likely to conflict. Instead, review upstream changes, cherry-pick what's
relevant, and re-run the tests.

---

## License

MIT. See [LICENSE](LICENSE). Upstream has no license file, but its
author has stated a FOSS-by-declaration intent.