Fixes upstream issues #3/#5/#9 (whole wiki in every prompt) and adds
large-log chunking. Addresses the audit's P1 scaling findings (C1),
the chunking requirement the operator added on top, C8 (explicit model
wiring across all LLM call sites), and D3 (single-event-loop refactor).
## compile.py
- **Index-only context.** The `existing_articles_context` concatenation
of every wiki article has been removed from the prompt. Instead the
LLM receives only the index + schema + daily log and uses the Read
tool (already in allowed_tools) to fetch specific articles it decides
are relevant. Prompt size stays bounded regardless of KB growth —
upstream's 250K-token prompts past ~100 articles are gone.
- **Chunking.** `_split_log_into_chunks()` splits oversized daily logs
along `### ` section boundaries. The threshold is
MAX_LOG_CHARS_PER_CHUNK (default 100K chars ≈ 25K tokens, configurable
via MEMORIA_MAX_LOG_CHARS). Each chunk compiles in its own LLM call,
and the results merge naturally because every call applies Edit to the
same shared files. A single section larger than the threshold is
emitted as its own chunk rather than split mid-thought.
- **Atomic state on chunked compile.** State is written only after
ALL chunks succeed; a partial failure leaves the log flagged as
uncompiled in state.json so the next run retries it cleanly. This was
already correct for single-chunk logs (early return on SDK error)
and is now correct for multi-chunk logs too.
- **Explicit model.** `model=COMPILE_MODEL` passed to
ClaudeAgentOptions. Default "sonnet"; override via
MEMORIA_COMPILE_MODEL env var.
- **D3: single asyncio.run.** The per-file `asyncio.run()` in the
compile loop is replaced with one outer call wrapping `_compile_all`.
Avoids repeated event-loop setup/teardown and matches the pattern
used for async resources in the SDK.
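
The chunking, atomic-state, and single-event-loop behaviour described
above can be sketched roughly as follows. This is an illustration under
stated assumptions, not the code in compile.py: every name ending in
`_sketch`, the `compile_chunk` callback, and the in-memory `state` dict
are hypothetical stand-ins for the real helpers, the SDK call, and
state.json.

```python
import asyncio
import re
from typing import Awaitable, Callable

# Hypothetical callback type standing in for the real per-chunk LLM call.
ChunkCompiler = Callable[[str], Awaitable[bool]]


def split_log_into_chunks_sketch(text: str, max_chars: int = 100_000) -> list[str]:
    """Split a log along '### ' boundaries without dropping any characters."""
    if len(text) <= max_chars:
        return [text]
    # Zero-width split just before each line that starts with '### '.
    sections = re.split(r"(?m)^(?=### )", text)
    chunks: list[str] = []
    current = ""
    for section in sections:
        if not section:
            continue
        # Start a new chunk when adding this section would cross the limit.
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        # An oversized section lands alone in `current` and is never cut.
        current += section
    if current:
        chunks.append(current)
    return chunks


async def compile_log_sketch(
    text: str, compile_chunk: ChunkCompiler, max_chars: int = 100_000
) -> bool:
    """Return True only if every chunk of the log compiled."""
    for chunk in split_log_into_chunks_sketch(text, max_chars):
        if not await compile_chunk(chunk):
            return False  # stop early; the caller leaves state untouched
    return True


async def compile_all_sketch(
    logs: dict[str, str],
    compile_chunk: ChunkCompiler,
    state: dict[str, bool],
    max_chars: int = 100_000,
) -> None:
    """Compile every log inside one event loop, recording only full successes."""
    for name, text in logs.items():
        if await compile_log_sketch(text, compile_chunk, max_chars):
            state[name] = True  # written only after ALL chunks succeeded
```

A caller would enter the event loop once, for example
`asyncio.run(compile_all_sketch(logs, compile_chunk, state))`, instead
of calling `asyncio.run()` per file. A log that fails partway never
appears in `state`, so the next run picks it up again.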
## query.py
- **Index-only context.** `read_all_wiki_content()` replaced with
`read_wiki_index()`. The LLM reads the index and uses its Read tool
to fetch specific articles. Same rationale as compile.py — keeps
prompt size bounded and cost predictable.
- **Explicit model.** `model=QUERY_MODEL`, default "sonnet", override
via MEMORIA_QUERY_MODEL.
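
The model override could be resolved along these lines. This is a
minimal sketch: `resolve_model` is a hypothetical helper, and treating
a blank value as unset is an assumption, not something the patch is
known to do.

```python
import os


def resolve_model(env_var: str, default: str = "sonnet") -> str:
    """Return the model named in env_var, falling back to the default."""
    # Assumption: an empty or whitespace-only value counts as "not set".
    value = os.environ.get(env_var, "").strip()
    return value or default


# Resolved once at import time, then passed as model=QUERY_MODEL.
QUERY_MODEL = resolve_model("MEMORIA_QUERY_MODEL")
```

The same helper would cover MEMORIA_COMPILE_MODEL and
MEMORIA_LINT_MODEL, keeping the default in one place.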
## lint.py
- **C9: skip qa/sources in missing-backlink check.** Articles under
qa/ or sources/ no longer trigger a suggestion that every referenced
concept should backlink to them. Concepts aren't expected to link
back to every Q&A that mentions them — doing so would drown real
relationships.
- **Alias-aware backlink detection.** Uses `extract_wikilinks()` to
parse the target's link list so `[[concepts/foo|Display]]` forms
count as valid backlinks (previously required exact `[[foo]]` match,
causing false positives on aliased forms).
- **Explicit model.** `model=LINT_MODEL` in the `check_contradictions`
call, default "sonnet", override via MEMORIA_LINT_MODEL.
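
The two backlink changes above might look roughly like this. It is a
sketch only: the regex, `extract_wikilinks_sketch`, and
`missing_backlink_sketch` are hypothetical, and the real
`extract_wikilinks()` may parse links differently.

```python
import re

# Matches [[target]] and [[target|Display]]; captures only the target part.
WIKILINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")

# Assumption: article slugs are wiki-relative paths such as "qa/some-question".
SKIPPED_PREFIXES = ("qa/", "sources/")


def extract_wikilinks_sketch(text: str) -> list[str]:
    """Return link targets in order of appearance, with any alias stripped."""
    return [m.group(1).strip() for m in WIKILINK_RE.finditer(text)]


def missing_backlink_sketch(source_slug: str, target_text: str) -> bool:
    """True if target_text should link back to source_slug but does not."""
    # Q&A and source articles are exempt from the backlink suggestion.
    if source_slug.startswith(SKIPPED_PREFIXES):
        return False
    return source_slug not in extract_wikilinks_sketch(target_text)
```

Comparing parsed targets rather than searching for a literal `[[foo]]`
string is what lets aliased links count as backlinks.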
## Verified
- Chunking: 120K-char 3-section log splits into 80K + 40K, reconstructs
byte-exact. Oversized single section (150K) emits as its own chunk.
Small log (<100K) returns as single chunk.
- All patched modules import cleanly with expected config values.
- compile_daily_log / query.run_query / flush.maybe_trigger_compilation
/ lint.check_missing_backlinks all callable post-patch.