fork: scaling fixes (index-only context + chunking + model wiring)

Fixes upstream issues #3/#5/#9 (whole-wiki in every prompt) and adds large-log chunking. Addresses the audit's P1 scaling findings (C1), the chunking requirement the operator added on top, C8 explicit model wiring across all LLM call sites, and the D3 single-event-loop refactor.

## compile.py

- **Index-only context.** The `existing_articles_context` concatenation of every wiki article has been removed from the prompt. Instead the LLM receives only the index + schema + daily log and uses the Read tool (already in allowed_tools) to fetch the specific articles it decides are relevant. Prompt size stays bounded regardless of KB growth; upstream's 250K-token prompts past ~100 articles are gone.
- **Chunking.** `_split_log_into_chunks()` splits oversized daily logs along `### ` section boundaries. The threshold is MAX_LOG_CHARS_PER_CHUNK (default 100K chars ≈ 25K tokens, configurable via MEMORIA_MAX_LOG_CHARS). Chunks compile via separate LLM calls that naturally merge through Edit on shared files. An oversized single section is emitted as its own chunk rather than being split mid-thought.
- **Atomic state on chunked compile.** State is only written after ALL chunks succeed. A partial failure leaves the log flagged as uncompiled in state.json, so the next run retries it cleanly. This was already correct for single-chunk logs (early return on SDK error) and is now correct for multi-chunk logs too.
- **Explicit model.** `model=COMPILE_MODEL` is passed to ClaudeAgentOptions. Default "sonnet"; override via the MEMORIA_COMPILE_MODEL env var.
- **D3: single asyncio.run.** The per-file `asyncio.run()` in the compile loop is replaced with one outer call wrapping `_compile_all`. This avoids repeated event-loop setup/teardown and matches the pattern used for async resources in the SDK.

## query.py

- **Index-only context.** `read_all_wiki_content()` is replaced with `read_wiki_index()`. The LLM reads the index and uses its Read tool to fetch specific articles. Same rationale as compile.py: it keeps prompt size bounded and cost predictable.
- **Explicit model.** `model=QUERY_MODEL`, default "sonnet", override via MEMORIA_QUERY_MODEL.

## lint.py

- **C9: skip qa/sources in missing-backlink check.** Articles under qa/ or sources/ no longer trigger a suggestion that every referenced concept should backlink to them. Concepts aren't expected to link back to every Q&A that mentions them; doing so would drown real relationships.
- **Alias-aware backlink detection.** Uses `extract_wikilinks()` to parse the target's link list so that `[[concepts/foo|Display]]` forms count as valid backlinks (previously an exact `[[foo]]` match was required, causing false positives on aliased forms).
- **Explicit model.** `model=LINT_MODEL` in the check_contradictions call, default "sonnet", override via MEMORIA_LINT_MODEL.

## Verified

- Chunking: a 120K-char 3-section log splits into 80K + 40K and reconstructs byte-exact. An oversized single section (150K) is emitted as its own chunk. A small log (<100K) returns as a single chunk.
- All patched modules import cleanly with expected config values.
- compile_daily_log / query.run_query / flush.maybe_trigger_compilation / lint.check_missing_backlinks are all callable post-patch.
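
The chunking behaviour described for compile.py can be illustrated with a minimal sketch. This is not the commit's `_split_log_into_chunks()` (its body is not shown here); the function name `split_log_into_chunks` and the greedy packing strategy are assumptions chosen to reproduce the cases listed under "Verified".

```python
MAX_LOG_CHARS_PER_CHUNK = 100_000  # default named in the commit message


def split_log_into_chunks(log: str, max_chars: int = MAX_LOG_CHARS_PER_CHUNK) -> list[str]:
    """Hypothetical sketch: greedily pack '### ' sections into chunks."""
    if len(log) <= max_chars:
        return [log]  # small logs stay as a single chunk

    # Cut the log into sections, keeping each '### ' heading with its body.
    sections: list[str] = []
    current: list[str] = []
    for line in log.splitlines(keepends=True):
        if line.startswith("### ") and current:
            sections.append("".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("".join(current))

    # Pack whole sections into chunks; never split inside a section, so an
    # oversized section simply becomes a chunk of its own.
    chunks: list[str] = []
    buf = ""
    for section in sections:
        if buf and len(buf) + len(section) > max_chars:
            chunks.append(buf)
            buf = ""
        buf += section
    if buf:
        chunks.append(buf)
    return chunks
```

Because sections are only ever concatenated, never trimmed, `"".join(chunks)` equals the original log, which is the byte-exact reconstruction property the commit message reports.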
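
The alias-aware backlink check in lint.py can be sketched as follows. The real `extract_wikilinks()` lives in the project and is not shown in this commit view; the regex and the helper names `extract_wikilink_targets` and `has_backlink` below are assumptions for illustration only.

```python
import re

# Matches [[target]], [[target|Alias]] and [[target#heading|Alias]];
# group 1 is the link target without alias or heading.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_wikilink_targets(text: str) -> set[str]:
    """Hypothetical helper: return the set of link targets found in text."""
    return {m.group(1).strip() for m in WIKILINK_RE.finditer(text)}


def has_backlink(target_text: str, source_slug: str) -> bool:
    """True if target_text links to source_slug, aliased or not."""
    return source_slug in extract_wikilink_targets(target_text)
```

With the text `See [[concepts/foo|Display]]`, a plain substring test for `[[concepts/foo]]` fails, while `has_backlink(text, "concepts/foo")` succeeds. That difference is the false positive the commit message describes.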
parent 39ab2a8b6f
commit 03296be47a
3 changed files with 213 additions and 68 deletions

@@ -14,16 +14,28 @@ from __future__ import annotations
 import argparse
 import asyncio
+import os
 from pathlib import Path

 from config import KNOWLEDGE_DIR, QA_DIR, now_iso
-from utils import load_state, read_all_wiki_content, save_state
+from utils import load_state, read_wiki_index, save_state

 ROOT_DIR = Path(__file__).resolve().parent.parent

+# Query model (Sonnet by default — synthesis over the retrieved articles
+# benefits from strong reasoning; override via MEMORIA_QUERY_MODEL).
+QUERY_MODEL = os.environ.get("MEMORIA_QUERY_MODEL", "sonnet")
+

 async def run_query(question: str, file_back: bool = False) -> str:
-    """Query the knowledge base and optionally file the answer back."""
+    """Query the knowledge base and optionally file the answer back.
+
+    Unlike upstream, we do NOT inline the entire wiki into the prompt — the
+    LLM receives the index only and uses its Read tool to fetch articles
+    it decides are relevant. Keeps prompt size bounded regardless of
+    knowledge-base size and avoids the whole-wiki-in-prompt cost wall
+    documented in upstream issues #3/#5/#9.
+    """
     from claude_agent_sdk import (
         AssistantMessage,
         ClaudeAgentOptions,
@@ -32,7 +44,7 @@ async def run_query(question: str, file_back: bool = False) -> str:
         query,
     )

-    wiki_content = read_all_wiki_content()
+    wiki_index = read_wiki_index()

     tools = ["Read", "Glob", "Grep"]
     if file_back:
@@ -59,20 +71,23 @@ After answering, do the following:
 """

     prompt = f"""You are a knowledge base query engine. Answer the user's question by
-consulting the knowledge base below.
+consulting the knowledge base.

 ## How to Answer

 1. Read the INDEX section first - it lists every article with a one-line summary
 2. Identify 3-10 articles that are relevant to the question
-3. Read those articles carefully (they're included below)
+3. Use the Read tool to fetch those articles (they live at
+{KNOWLEDGE_DIR}/concepts/, {KNOWLEDGE_DIR}/connections/, and
+{KNOWLEDGE_DIR}/qa/). Only read articles you actually need — do not
+read the entire wiki.
 4. Synthesize a clear, thorough answer
 5. Cite your sources using [[wikilinks]] (e.g., [[concepts/supabase-auth]])
 6. If the knowledge base doesn't contain relevant information, say so honestly

-## Knowledge Base
+## Knowledge Base Index

-{wiki_content}
+{wiki_index}

 ## Question

@@ -87,6 +102,7 @@ consulting the knowledge base below.
         prompt=prompt,
         options=ClaudeAgentOptions(
             cwd=str(ROOT_DIR),
+            model=QUERY_MODEL,
             system_prompt={"type": "preset", "preset": "claude_code"},
             allowed_tools=tools,
             permission_mode="acceptEdits",
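
To make the effect of the query.py change concrete, here is a small sketch of index-only prompt assembly. It is an illustration under stated assumptions, not the project's code: `read_index_only`, `build_query_prompt`, the `index.md` file name, and the placeholder string are all hypothetical, and the real `read_wiki_index()` in utils may behave differently.

```python
from pathlib import Path


def read_index_only(knowledge_dir: Path, index_name: str = "index.md") -> str:
    """Hypothetical stand-in for read_wiki_index(): load just the index file."""
    index_path = knowledge_dir / index_name
    if not index_path.exists():
        return "(index is empty)"
    return index_path.read_text(encoding="utf-8")


def build_query_prompt(index_text: str, question: str, knowledge_dir: Path) -> str:
    """Assemble a prompt whose size depends on the index, not on the articles."""
    return (
        "Answer the question by consulting the knowledge base.\n\n"
        f"Use the Read tool to fetch only the articles you need from {knowledge_dir}.\n\n"
        f"## Knowledge Base Index\n\n{index_text}\n\n"
        f"## Question\n\n{question}\n"
    )
```

Under this sketch, adding articles grows the prompt only by their one-line index entries, while article bodies are fetched on demand through the Read tool already present in the `tools` list.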