fork: scaling fixes (index-only context + chunking + model wiring)

Fixes upstream issues #3/#5/#9 (whole-wiki in every prompt) and adds large-log chunking. Addresses the audit's P1 scaling findings (C1), the chunking requirement operator added on top, C8 explicit model wiring across all LLM call sites, and D3 single-event-loop refactor. ## compile.py - **Index-only context.** The `existing_articles_context` concatenation of every wiki article has been removed from the prompt. Instead the LLM receives only the index + schema + daily log and uses the Read tool (already in allowed_tools) to fetch specific articles it decides are relevant. Prompt size stays bounded regardless of KB growth — upstream's 250K-token prompts past ~100 articles are gone. - **Chunking.** `_split_log_into_chunks()` splits oversized daily logs along `### ` section boundaries. Threshold MAX_LOG_CHARS_PER_CHUNK (default 100K chars ≈ 25K tokens, configurable via MEMORIA_MAX_LOG_CHARS). Chunks compile via separate LLM calls that naturally merge through Edit on shared files. Oversized single sections emit as their own chunks rather than splitting mid-thought. - **Atomic state on chunked compile.** State is only written after ALL chunks succeed — partial-failure leaves the log flagged as uncompiled in state.json so the next run retries it cleanly. Was already correct for single-chunk logs (early return on SDK error) and now correct for multi-chunk too. - **Explicit model.** `model=COMPILE_MODEL` passed to ClaudeAgentOptions. Default "sonnet"; override via MEMORIA_COMPILE_MODEL env var. - **D3: single asyncio.run.** The per-file `asyncio.run()` in the compile loop is replaced with one outer call wrapping `_compile_all`. Avoids repeated event-loop setup/teardown and matches the pattern used for async resources in the SDK. ## query.py - **Index-only context.** `read_all_wiki_content()` replaced with `read_wiki_index()`. The LLM reads the index and uses its Read tool to fetch specific articles. Same rationale as compile.py — keeps prompt size bounded and cost predictable. - **Explicit model.** `model=QUERY_MODEL`, default "sonnet", override via MEMORIA_QUERY_MODEL. ## lint.py - **C9: skip qa/sources in missing-backlink check.** Articles under qa/ or sources/ no longer trigger a suggestion that every referenced concept should backlink to them. Concepts aren't expected to link back to every Q&A that mentions them — doing so would drown real relationships. - **Alias-aware backlink detection.** Uses `extract_wikilinks()` to parse the target's link list so `[[concepts/foo|Display]]` forms count as valid backlinks (previously required exact `[[foo]]` match, causing false positives on aliased forms). - **Explicit model.** `model=LINT_MODEL` in check_contradictions call, default "sonnet", override via MEMORIA_LINT_MODEL. ## Verified - Chunking: 120K-char 3-section log splits into 80K + 40K, reconstructs byte-exact. Oversized single section (150K) emits as its own chunk. Small log (<100K) returns as single chunk. - All patched modules import cleanly with expected config values. - compile_daily_log / query.run_query / flush.maybe_trigger_compilation / lint.check_missing_backlinks all callable post-patch.
2026-04-24 17:48:48 -04:00 · 2026-04-24 17:48:48 -04:00 · 03296be47a
commit 03296be47a
parent 39ab2a8b6f
3 changed files with 213 additions and 68 deletions
--- a/scripts/compile.py
+++ b/scripts/compile.py
@ -15,6 +15,8 @@ from __future__ import annotations
 import argparse
 import asyncio
 import os
 import re
 import sys
 from pathlib import Path
@ -31,41 +33,81 @@ from utils import (
 # ── Paths for the LLM to use ──────────────────────────────────────────
 ROOT_DIR = Path(__file__).resolve().parent.parent
 # Compilation model (Sonnet by default — knowledge extraction benefits from
 # strong reasoning; override via MEMORIA_COMPILE_MODEL for experiments).
 COMPILE_MODEL = os.environ.get("MEMORIA_COMPILE_MODEL", "sonnet")
-async def compile_daily_log(log_path: Path, state: dict) -> float:
+# Chunk threshold for large daily logs. Anything above ~100K chars gets
-    """Compile a single daily log into knowledge articles.
+# split along `### ` section boundaries so a single LLM call never
 # receives the whole log when it's oversized. Each chunk compiles via a
 # fresh Claude invocation; they merge naturally because all writes go
 # through Edit on shared files (index.md, existing concept articles).
 #
 # 100K chars ≈ 25K tokens — well under Claude's context window even
 # after schema + index + instructions + headroom.
 MAX_LOG_CHARS_PER_CHUNK = int(os.environ.get("MEMORIA_MAX_LOG_CHARS", "100000"))
-    Returns the API cost of the compilation.
+
 def _split_log_into_chunks(log_content: str, max_chars: int) -> list[str]:
    """Split a daily log by ### section headers if it exceeds max_chars.
    Returns a list of chunk strings where each chunk is <= max_chars (unless
    a single section itself exceeds max_chars, in which case the section is
    emitted as its own oversized chunk — preferable to splitting mid-thought).
    If the whole log is <= max_chars, returns a single-element list.
    """
-    from claude_agent_sdk import (
+    if len(log_content) <= max_chars:
-        AssistantMessage,
+        return [log_content]
        ClaudeAgentOptions,
        ResultMessage,
        TextBlock,
        query,
    )
-    log_content = log_path.read_text(encoding="utf-8")
+    # Split at ### boundaries, keeping the header attached to its body.
    parts = re.split(r"(?m)(?=^### )", log_content)
    chunks: list[str] = []
    current = ""
    for part in parts:
        if not part:
            continue
        # If this part alone exceeds max_chars, emit it as its own chunk.
        if len(part) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(part)
            continue
        # If appending would overflow, close out current and start new.
        if current and len(current) + len(part) > max_chars:
            chunks.append(current)
            current = part
        else:
            current += part
    if current:
        chunks.append(current)
    return chunks
 def _build_prompt(log_name: str, chunk_body: str, *, chunk_info: str = "") -> str:
    """Assemble the compile prompt.
    Unlike upstream, we do NOT inline every existing article into the prompt
    — that would send the whole wiki on every call, exploding cost and
    hitting context limits past ~50 articles (upstream issues #3/#5/#9).
    Instead, we provide:
      * the schema (AGENTS.md) — stable structural rules
      * the current index — lets the compiler identify which concepts exist
      * the daily log — the new material to compile
    The compiler uses its Read tool to fetch specific existing articles
    it deems relevant (index has paths + summaries), keeping prompt size
    bounded regardless of knowledge-base size.
    """
    schema = AGENTS_FILE.read_text(encoding="utf-8")
    wiki_index = read_wiki_index()
    # Read existing articles for context
    existing_articles_context = ""
    existing = {}
    for article_path in list_wiki_articles():
        rel = article_path.relative_to(KNOWLEDGE_DIR)
        existing[str(rel)] = article_path.read_text(encoding="utf-8")
    if existing:
        parts = []
        for rel_path, content in existing.items():
            parts.append(f"### {rel_path}\n```markdown\n{content}\n```")
        existing_articles_context = "\n\n".join(parts)
    timestamp = now_iso()
-    prompt = f"""You are a knowledge compiler. Your job is to read a daily conversation log
+    return f"""You are a knowledge compiler. Your job is to read a daily conversation log
-and extract knowledge into structured wiki articles.
+and extract knowledge into structured wiki articles.{chunk_info}
 ## Schema (AGENTS.md)
@ -73,17 +115,19 @@ and extract knowledge into structured wiki articles.
 ## Current Wiki Index
 The index below lists every existing wiki article with a one-line summary.
 When extracting concepts, check this index first. If a concept already
 exists, use the Read tool to fetch its current content and update it
 rather than duplicating. Only fetch articles you actually need — do not
 read the entire wiki.
 {wiki_index}
 ## Existing Wiki Articles
 {existing_articles_context if existing_articles_context else "(No existing articles yet)"}
 ## Daily Log to Compile
-**File:** {log_path.name}
+**File:** {log_name}
-{log_content}
+{chunk_body}
 ## Your Task
@ -91,22 +135,25 @@ Read the daily log above and compile it into wiki articles following the schema
 ### Rules:
-1. **Extract key concepts** - Identify 3-7 distinct concepts worth their own article
+1. **Consult the index first.** Identify which concepts in the daily log
-2. **Create concept articles** in `knowledge/concepts/` - One .md file per concept
+   already have articles (use the Read tool to fetch them) and which are
   new. Do not list or read the whole wiki — only what's relevant.
 2. **Extract key concepts** - Identify 3-7 distinct concepts worth their own article
 3. **Create concept articles** in `knowledge/concepts/` - One .md file per concept
   - Use the exact article format from AGENTS.md (YAML frontmatter + sections)
   - Include `sources:` in frontmatter pointing to the daily log file
   - Use `[[concepts/slug]]` wikilinks to link to related concepts
   - Write in encyclopedia style - neutral, comprehensive
-3. **Create connection articles** in `knowledge/connections/` if this log reveals non-obvious
+4. **Create connection articles** in `knowledge/connections/` if this log reveals non-obvious
   relationships between 2+ existing concepts
-4. **Update existing articles** if this log adds new information to concepts already in the wiki
+5. **Update existing articles** if this log adds new information to concepts already in the wiki
   - Read the existing article, add the new information, add the source to frontmatter
-5. **Update knowledge/index.md** - Add new entries to the table
+6. **Update knowledge/index.md** - Add new entries to the table
   - Each entry: `| [[path/slug]] | One-line summary | source-file | {timestamp[:10]} |`
-6. **Append to knowledge/log.md** - Add a timestamped entry:
+7. **Append to knowledge/log.md** - Add a timestamped entry:
   ```
-   ## [{timestamp}] compile | {log_path.name}
+   ## [{timestamp}] compile | {log_name}
-   - Source: daily/{log_path.name}
+   - Source: daily/{log_name}
   - Articles created: [[concepts/x]], [[concepts/y]]
   - Articles updated: [[concepts/z]] (if any)
   ```
@ -126,13 +173,29 @@ Read the daily log above and compile it into wiki articles following the schema
 - Sources section should cite the daily log with specific claims extracted
 """
    cost = 0.0
 async def _invoke_llm(prompt: str) -> tuple[float, bool]:
    """Run one LLM compile pass. Returns (cost_usd, success).
    success=False means the SDK raised an exception — the caller must NOT
    mark the daily log as compiled in state.json, so the log is retried on
    the next run rather than silently dropped.
    """
    from claude_agent_sdk import (
        AssistantMessage,
        ClaudeAgentOptions,
        ResultMessage,
        TextBlock,
        query,
    )
    cost = 0.0
    try:
        async for message in query(
            prompt=prompt,
            options=ClaudeAgentOptions(
                cwd=str(ROOT_DIR),
                model=COMPILE_MODEL,
                system_prompt={"type": "preset", "preset": "claude_code"},
                allowed_tools=["Read", "Write", "Edit", "Glob", "Grep"],
                permission_mode="acceptEdits",
@ -142,25 +205,63 @@ Read the daily log above and compile it into wiki articles following the schema
            if isinstance(message, AssistantMessage):
                for block in message.content:
                    if isinstance(block, TextBlock):
-                        pass  # compilation output - LLM writes files directly
+                        pass  # LLM writes files directly via tools
            elif isinstance(message, ResultMessage):
                cost = message.total_cost_usd or 0.0
                print(f"    Cost: ${cost:.4f}")
        return cost, True
    except Exception as e:
-        print(f"  Error: {e}")
+        print(f"    SDK error: {e}")
-        return 0.0
+        return cost, False
-    # Update state
+
 async def compile_daily_log(log_path: Path, state: dict) -> float:
    """Compile a single daily log into knowledge articles.
    Splits large logs into `### `-bounded chunks before invoking the LLM,
    so a single call never receives an oversized daily log. State is only
    updated when ALL chunks succeed — partial failure leaves the log
    flagged as uncompiled so the next run retries it.
    Returns total API cost of the compilation (sum across chunks).
    """
    log_content = log_path.read_text(encoding="utf-8")
    chunks = _split_log_into_chunks(log_content, MAX_LOG_CHARS_PER_CHUNK)
    total_cost = 0.0
    all_succeeded = True
    for i, chunk in enumerate(chunks, 1):
        chunk_info = (
            f"\n\n(Chunk {i} of {len(chunks)} — compile the sections in this chunk; "
            "remaining chunks of the same log follow in subsequent calls.)"
            if len(chunks) > 1
            else ""
        )
        prompt = _build_prompt(log_path.name, chunk, chunk_info=chunk_info)
        print(f"  Chunk {i}/{len(chunks)} ({len(chunk):,} chars)...")
        cost, ok = await _invoke_llm(prompt)
        total_cost += cost
        if not ok:
            all_succeeded = False
            break
    if not all_succeeded:
        print(f"  FAILED: log not marked compiled; will retry on next run.")
        return total_cost
    # All chunks succeeded — atomically update state.
    rel_path = log_path.name
    state.setdefault("ingested", {})[rel_path] = {
        "hash": file_hash(log_path),
        "compiled_at": now_iso(),
-        "cost_usd": cost,
+        "cost_usd": total_cost,
        "chunks": len(chunks),
    }
-    state["total_cost"] = state.get("total_cost", 0.0) + cost
+    state["total_cost"] = state.get("total_cost", 0.0) + total_cost
    save_state(state)
-    return cost
+    return total_cost
 def main():
@ -207,13 +308,18 @@ def main():
    if args.dry_run:
        return
-    # Compile each file sequentially
+    async def _compile_all() -> float:
-    total_cost = 0.0
+        total = 0.0
        for i, log_path in enumerate(to_compile, 1):
            print(f"\n[{i}/{len(to_compile)}] Compiling {log_path.name}...")
-        cost = asyncio.run(compile_daily_log(log_path, state))
+            cost = await compile_daily_log(log_path, state)
-        total_cost += cost
+            total += cost
            print(f"  Done.")
        return total
    # Single event-loop lifecycle for the whole batch — avoids reinit overhead
    # and lets any async resources in the SDK settle predictably.
    total_cost = asyncio.run(_compile_all())
    articles = list_wiki_articles()
    print(f"\nCompilation complete. Total cost: ${total_cost:.2f}")
--- a/scripts/lint.py
+++ b/scripts/lint.py
@ -13,9 +13,15 @@ from __future__ import annotations
 import argparse
 import asyncio
 import os
 from pathlib import Path
 from config import KNOWLEDGE_DIR, REPORTS_DIR, now_iso, today_iso
 # Contradiction-check model. Kept as Sonnet for reasoning quality; override
 # via MEMORIA_LINT_MODEL (e.g. to use a cheaper model for structural runs
 # that happen to include the LLM check).
 LINT_MODEL = os.environ.get("MEMORIA_LINT_MODEL", "sonnet")
 from utils import (
    count_inbound_links,
    extract_wikilinks,
@ -105,20 +111,36 @@ def check_stale_articles() -> list[dict]:
 def check_missing_backlinks() -> list[dict]:
-    """Check for asymmetric links: A links to B but B doesn't link to A."""
+    """Check for asymmetric links: A links to B but B doesn't link to A.
    Skips any source or target under `qa/` or `sources/`: Q&A articles
    intentionally reference concepts without requiring a reciprocal link
    (concepts would otherwise accumulate a backlink per question, which
    drowns real relationships). Also handles pipe-aliased wikilinks via
    the alias-aware extract_wikilinks helper.
    """
    issues = []
    # Aliased/nested variants of the source link that should count as a
    # valid backlink on the target side: bare slug, and pipe-aliased form.
    for article in list_wiki_articles():
        content = article.read_text(encoding="utf-8")
        rel = article.relative_to(KNOWLEDGE_DIR)
-        source_link = str(rel).replace(".md", "").replace("\\", "/")
+        rel_str = str(rel).replace("\\", "/")
        # Skip one-way source categories.
        if rel_str.startswith("qa/") or rel_str.startswith("sources/"):
            continue
        source_link = rel_str.replace(".md", "")
        for link in extract_wikilinks(content):
-            if link.startswith("daily/"):
+            if link.startswith("daily/") or link.startswith("qa/") or link.startswith("sources/"):
                continue
            target_path = KNOWLEDGE_DIR / f"{link}.md"
            if target_path.exists():
                target_content = target_path.read_text(encoding="utf-8")
-                if f"[[{source_link}]]" not in target_content:
+                target_backlinks = extract_wikilinks(target_content)
                if source_link not in target_backlinks:
                    issues.append({
                        "severity": "suggestion",
                        "check": "missing_backlink",
@ -185,6 +207,7 @@ Do NOT output anything else - no preamble, no explanation, just the formatted li
            prompt=prompt,
            options=ClaudeAgentOptions(
                cwd=str(ROOT_DIR),
                model=LINT_MODEL,
                allowed_tools=[],
                max_turns=2,
            ),
--- a/scripts/query.py
+++ b/scripts/query.py
@ -14,16 +14,28 @@ from __future__ import annotations
 import argparse
 import asyncio
 import os
 from pathlib import Path
 from config import KNOWLEDGE_DIR, QA_DIR, now_iso
-from utils import load_state, read_all_wiki_content, save_state
+from utils import load_state, read_wiki_index, save_state
 ROOT_DIR = Path(__file__).resolve().parent.parent
 # Query model (Sonnet by default — synthesis over the retrieved articles
 # benefits from strong reasoning; override via MEMORIA_QUERY_MODEL).
 QUERY_MODEL = os.environ.get("MEMORIA_QUERY_MODEL", "sonnet")
 async def run_query(question: str, file_back: bool = False) -> str:
-    """Query the knowledge base and optionally file the answer back."""
+    """Query the knowledge base and optionally file the answer back.
    Unlike upstream, we do NOT inline the entire wiki into the prompt — the
    LLM receives the index only and uses its Read tool to fetch articles
    it decides are relevant. Keeps prompt size bounded regardless of
    knowledge-base size and avoids the whole-wiki-in-prompt cost wall
    documented in upstream issues #3/#5/#9.
    """
    from claude_agent_sdk import (
        AssistantMessage,
        ClaudeAgentOptions,
@ -32,7 +44,7 @@ async def run_query(question: str, file_back: bool = False) -> str:
        query,
    )
-    wiki_content = read_all_wiki_content()
+    wiki_index = read_wiki_index()
    tools = ["Read", "Glob", "Grep"]
    if file_back:
@ -59,20 +71,23 @@ After answering, do the following:
 """
    prompt = f"""You are a knowledge base query engine. Answer the user's question by
-consulting the knowledge base below.
+consulting the knowledge base.
 ## How to Answer
 1. Read the INDEX section first - it lists every article with a one-line summary
 2. Identify 3-10 articles that are relevant to the question
-3. Read those articles carefully (they're included below)
+3. Use the Read tool to fetch those articles (they live at
   {KNOWLEDGE_DIR}/concepts/, {KNOWLEDGE_DIR}/connections/, and
   {KNOWLEDGE_DIR}/qa/). Only read articles you actually need — do not
   read the entire wiki.
 4. Synthesize a clear, thorough answer
 5. Cite your sources using [[wikilinks]] (e.g., [[concepts/supabase-auth]])
 6. If the knowledge base doesn't contain relevant information, say so honestly
-## Knowledge Base
+## Knowledge Base Index
-{wiki_content}
+{wiki_index}
 ## Question
@ -87,6 +102,7 @@ consulting the knowledge base below.
            prompt=prompt,
            options=ClaudeAgentOptions(
                cwd=str(ROOT_DIR),
                model=QUERY_MODEL,
                system_prompt={"type": "preset", "preset": "claude_code"},
                allowed_tools=tools,
                permission_mode="acceptEdits",