This is the initial fork commit for agent-admin/memoria, a production
hardening of coleam00/claude-memory-compiler. It addresses all four P0
findings from the bug audit (atomic state writes, file locking on
daily log appends, subprocess detachment, path-traversal guard) plus
several P1s (aliased wikilinks, timezone wiring, staleness-based
compile trigger, SDK retry with backoff, file-handle context manager).

File-level changes:
- LICENSE — MIT (fork is self-declared FOSS; upstream has no LICENSE
  file but author has stated FOSS intent).
- pyproject.toml — renamed project to `memoria`, removed unused
  python-dotenv dependency, added optional `test` dep group.
- scripts/fs_utils.py — NEW module containing the primitives that
  the other patches rely on:
    * atomic_write_text(path, content): tmp + fsync + os.replace;
      interrupted writes leave the target unchanged.
    * locked_append_text(path, content): fcntl.flock (POSIX) /
      msvcrt.locking (Windows) exclusive lock around the write so
      concurrent callers never interleave.
    * extract_wikilinks / parse_wikilink: strip [[target|display]]
      aliases correctly (fixes upstream issues #7 and #8).
    * safe_article_path(link, base): resolves a wikilink slug inside
      a base dir or returns None (path traversal guard).
    * load_json_with_recovery(path, default): on corruption, moves
      the bad file aside with a timestamped .bak-YYYYMMDDTHHMMSSZ
      suffix, logs a warning, returns the default. Replaces the
      silent `{}` return that would otherwise cause a full recompile.
- scripts/utils.py — save_state/load_state now use atomic writes and
  corruption recovery; wiki_article_exists + count_inbound_links now
  alias-aware via fs_utils helpers.
- scripts/config.py — TIMEZONE is now wired via zoneinfo.ZoneInfo
  and used by now_iso/today_iso (previously defined but ignored).
  Overridable via MEMORIA_TZ env var. Unknown zones log a warning
  and fall back to system local time rather than crashing.
- scripts/flush.py —
    * save_flush_state / load_flush_state use atomic writes and
      corruption recovery.
    * append_to_daily_log uses locked_append_text; concurrent flush
      + pre-compact calls can no longer interleave log entries.
    * run_flush retries SDK failures up to MAX_SDK_ATTEMPTS=3 with
      exponential backoff (2s, 4s) before returning FLUSH_ERROR.
    * On FLUSH_ERROR, main() preserves the context file and does NOT
      update dedup state, so the next flush retries cleanly instead
      of the failure being silently swallowed.
    * Explicit model="haiku" for flush (short summarization task).
    * maybe_trigger_compilation replaced: 6 PM wall-clock gate is
      gone; trigger is now staleness-based (hash changed AND
      COMPILE_INTERVAL_MIN elapsed since last compile). Configurable
      via MEMORIA_COMPILE_INTERVAL_MIN. Uses _now_local() from
      config so the clock respects the configured timezone.
    * compile.log handle uses a `with open()` context manager so the
      fd is always cleaned up, even if Popen raises.
- hooks/session-end.py, hooks/pre-compact.py — subprocess.Popen now
  passes start_new_session=True on POSIX, detaching flush.py from
  the hook's session and process group so it survives post-hook
  SIGHUP. Fixes the intermittent-data-loss failure mode where the
  flush subprocess was killed mid-LLM-call.
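Illustration of the tmp + fsync + os.replace pattern that the state-file changes above rely on. This is a minimal sketch and not the code in scripts/fs_utils.py: the `encoding` parameter, the temp-file naming, and the cleanup strategy are assumptions made for the example.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, content: str, encoding: str = "utf-8") -> None:
    """Write content so readers see the old file or the new one, never a partial one."""
    path = Path(path)
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)  # swap the finished file into place
    except BaseException:
        # A failure before the rename leaves the original target untouched;
        # only the temp file needs cleaning up.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Writing the temp file in the same directory is the important design choice here: a rename across filesystems is not atomic, so a temp file under /tmp would defeat the purpose.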

Tests (formal acceptance suite still to come in this phase): each
helper verified via unit exercise in scratch directories — atomic
roundtrip, corruption recovery with .bak creation, alias parsing,
path-traversal rejection.
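Illustration of the kind of path-traversal rejection exercised above. This is a sketch under stated assumptions, not the fs_utils implementation: the `.md` suffix and the use of `Path.is_relative_to` (Python 3.9+) are choices made for the example.

```python
from pathlib import Path
from typing import Optional


def safe_article_path(link: str, base: Path) -> Optional[Path]:
    """Resolve a wikilink slug under base, or return None if it would escape base."""
    base = Path(base).resolve()
    # Assumption for this sketch: articles are stored as <slug>.md under base.
    candidate = (base / f"{link}.md").resolve()
    # After resolve(), any ".." segments or absolute slugs are normalised, so a
    # simple containment check is enough to reject escapes.
    if candidate.is_relative_to(base):
        return candidate
    return None
```

With this shape, a slug such as `concepts/foo` resolves inside the base directory, while `../outside` and `/etc/passwd` both yield None.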

Upstream issue mapping: #3/#5/#9 addressed by the next commit
(compile.py + query.py scaling fix). #7/#8 addressed here via
alias-aware helpers. License (#11) resolved via MIT LICENSE.
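Illustration of alias-aware wikilink handling as described above. This is a hedged sketch: the regex, the return types, and the choice to default the display text to the target are assumptions, not the signatures in scripts/fs_utils.py.

```python
import re

# Matches [[...]] and captures the inner text without crossing other brackets.
WIKILINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


def parse_wikilink(inner: str) -> tuple[str, str]:
    """Split 'target|display' into (target, display); display defaults to target."""
    target, _, display = inner.partition("|")
    target = target.strip()
    return target, (display.strip() or target)


def extract_wikilinks(text: str) -> list[str]:
    """Return link targets in order of appearance, with any alias removed."""
    return [parse_wikilink(m.group(1))[0] for m in WIKILINK_RE.finditer(text)]
```

The point of stripping the alias before any lookup is that `[[concepts/foo|Foo]]` and `[[concepts/foo]]` should count as links to the same article.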
55 lines
2.2 KiB
Python
"""Path constants and configuration for the personal knowledge base."""

import os
from pathlib import Path
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# ── Paths ──────────────────────────────────────────────────────────────
ROOT_DIR = Path(__file__).resolve().parent.parent
DAILY_DIR = ROOT_DIR / "daily"
KNOWLEDGE_DIR = ROOT_DIR / "knowledge"
CONCEPTS_DIR = KNOWLEDGE_DIR / "concepts"
CONNECTIONS_DIR = KNOWLEDGE_DIR / "connections"
QA_DIR = KNOWLEDGE_DIR / "qa"
REPORTS_DIR = ROOT_DIR / "reports"
SCRIPTS_DIR = ROOT_DIR / "scripts"
HOOKS_DIR = ROOT_DIR / "hooks"
AGENTS_FILE = ROOT_DIR / "AGENTS.md"

INDEX_FILE = KNOWLEDGE_DIR / "index.md"
LOG_FILE = KNOWLEDGE_DIR / "log.md"
STATE_FILE = SCRIPTS_DIR / "state.json"

# ── Timezone ───────────────────────────────────────────────────────────
# Configurable via the MEMORIA_TZ environment variable (falls back to
# America/Chicago to preserve upstream's default for users who don't set it).
# If the zone name is unknown (missing tzdata, typo), log a warning and fall
# back to the system local timezone via astimezone() with no argument.
TIMEZONE = os.environ.get("MEMORIA_TZ", "America/Chicago")

try:
    TZ: ZoneInfo | None = ZoneInfo(TIMEZONE)
except ZoneInfoNotFoundError:
    import logging
    logging.getLogger(__name__).warning(
        "Timezone %r not found; falling back to system local time.", TIMEZONE
    )
    TZ = None


def _now_local() -> datetime:
    """Current datetime in the configured TIMEZONE (or system local as fallback)."""
    if TZ is not None:
        return datetime.now(TZ)
    return datetime.now(timezone.utc).astimezone()


def now_iso() -> str:
    """Current time in ISO 8601 format, in the configured TIMEZONE."""
    return _now_local().isoformat(timespec="seconds")


def today_iso() -> str:
    """Current date in ISO 8601 format, in the configured TIMEZONE."""
    return _now_local().strftime("%Y-%m-%d")
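Because the module resolves the zone once at import time, the fallback path is awkward to exercise directly. The sketch below restates the same logic as two parameterised functions so the unknown-zone branch can be checked in isolation; `resolve_tz` and `now_local` are hypothetical names for this example and do not exist in the module.

```python
import logging
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

log = logging.getLogger(__name__)


def resolve_tz(name: str) -> Optional[ZoneInfo]:
    """Return the named zone, or None (with a warning) if it cannot be found."""
    try:
        return ZoneInfo(name)
    except ZoneInfoNotFoundError:
        log.warning("Timezone %r not found; falling back to system local time.", name)
        return None


def now_local(tz: Optional[ZoneInfo]) -> datetime:
    """Aware 'now' in tz, or in the system local zone when tz is None."""
    if tz is not None:
        return datetime.now(tz)
    # astimezone() with no argument attaches the system local zone.
    return datetime.now(timezone.utc).astimezone()
```

Either branch returns a timezone-aware datetime, so callers such as now_iso() always emit a UTC offset in the ISO string.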