Skip to content

Memory Priority Enforcement

Intermediate

Pattern for structuring agent memory into always-load (critical) and on-demand (reference) tiers, with mechanical enforcement for the most critical rules. Convergence of 8 independent domains onto the same principle.

Core Principle

If forgetting a rule causes irreversible consequences → always-load.
Everything else → on-demand.

This is not a performance optimization. It is a safety property: critical context must survive context compaction, long sessions, and attention degradation.

Always vs On-Demand Criteria

Always-load (qualify for any one): 1. Irreversible consequences — forgetting causes data loss, financial damage, production downtime, security breach 2. Cross-session invariants — rules that must not be violated regardless of task ("never delete without confirmation", "never use production GPU without checking allocation") 3. User profile / preferences — communication style, expertise level, tool preferences

On-demand (load when topic becomes relevant): - Active project details (loaded when working on that project) - Reference materials and methodologies - Historical information and past incidents - Specific tool integrations not in daily use

Boeing Rule — Keep Always-Load Short

Aviation "memory items" (emergency procedures done from memory, no checklist): - Typical: 3-4 items per emergency procedure - Maximum: ~10 items (considered high) - Above 10: checklist fatigue → critical items forgotten under stress

For agent memory: Always-load section should be 10-15 entries maximum. Above this, important items fall into the "lost in the middle" zone.

Lost in the Middle (Liu et al. 2023, arxiv 2307.03172): LLM performance follows a U-shape based on context position. Items at the beginning or end receive highest attention. Items in the middle degrade significantly. With 36 always-load entries, critical rules rank 5-35 in a long context — exactly the degraded zone.

IAEA Two-Layer Principle

From IAEA INSAG-10 defense-in-depth: critical safety barriers must be independent and redundant. Relying on a single layer (rule) to enforce critical behavior fails under common-cause failure (context pressure, compaction, distraction).

Applied: - Rule in CLAUDE.md/memory (describes WHY and HOW) - Hook/script mechanical enforcement (executes regardless of context state)

Examples of working pairs:

Documentation integrity:
  Rule → "validate file references"
  Hook → SessionStart → validate_config.py (runs every session, blocks on drift)

Handoff reminder:
  Rule → "write handoff before ending long sessions"
  Hook → Stop → remind_handoff.py (blocks session end if no recent handoff)

Supply chain defense:
  Rule → "delay fresh packages"
  Enforcement → .npmrc min-release-age=7 (mechanically applied to every install)

Check when adding a new critical rule: "Can this rule be bypassed by forgetting to read memory?"
If yes → it needs a hook, not just a rule.

MemGPT Two-Tier Architecture

Core memory    ← in-context, fixed-size, agent can read/write via tool calls
               ← User profile, current task state, active invariants

Archival memory ← searchable database, paged in on-demand
                ← Historical context, past projects, reference materials

Key property: Agent explicitly decides what to page in from archival to core. This keeps core small and focused while preserving full history.

For file-based systems:
MEMORY.md Always Load section = core memory
MEMORY.md On Demand section = archival index
Actual on-demand files = archival storage

Domain Convergence Table

Domain Critical / Always Reference / On-demand Enforcement mechanism
LLM context (Liu 2023) Beginning / end of context Middle (attention degrades) Position in prompt
MemGPT Core memory (fixed) Archival (paged) Tool calls for page-in
Aviation Memory items (3-10) Checklists (documented) Training + repetition
Neuroscience Procedural memory (striatum) Episodic (hippocampus) Neural architecture
Cognitive psych Working memory (4±1 chunks) Long-term (unlimited) Attention / rehearsal
Nuclear safety (INSAG-10) 5-layer defense barriers Procedures / manuals Physical redundancy
Linux syslog EMERG/ALERT → console DEBUG → ring buffer console_loglevel
Kubernetes Admission webhook policies API server logs Webhook interception

All 8 domains converge on: small critical set + large reference set + mechanical enforcement for the critical set.

Memory Staleness Management

Memory records should include a created: YYYY-MM-DD field. Staleness by type:

Type Stale after Check on use
project 30 days Verify before use if older
reference (IPs, paths, ports) 7 days Always verify if infrastructure-related
feedback Stable (user preferences rarely change) Low priority
user Very stable Only check if major context shift suspected

Automatic Memory Extraction Pattern

After sessions, extract to memory: 1. Capabilities — new workflows, tools, integrations discovered 2. Corrections — user-corrected approaches (→ feedback memory) 3. Permissions — newly allowed commands (→ settings) 4. Bug/fix pairs — symptom → root cause → solution (→ gotchas) 5. Decisions — architectural choices with rationale (→ DECISIONS.md or memory)

Tagged for triage: [DECISION], [GOTCHA], [REUSE], [DEFER]

Gotchas

  • Issue: Always-load section grows unbounded as new rules are added without removing old ones. -> Fix: Apply Boeing rule at review time: if section exceeds 15 entries, conduct a triage pass using irreversible-consequence criterion. Most "nice to remember" items belong in on-demand.
  • Issue: Rule in memory prevents an action at session start but is forgotten after context compaction mid-session. -> Fix: For any rule that must hold for an entire session (e.g., "don't modify production"), create a hook that checks at every relevant tool call — not just at session start.
  • Issue: On-demand memory becomes a flat unstructured dump, making retrieval unreliable. -> Fix: Use wiki-links [[filename]] to create a navigable graph. Memory about project X should link to the infrastructure it runs on, the tools it uses, and the methodology applied.
  • Issue: Memory records created years ago reference file paths, server IPs, or ports that no longer exist. -> Fix: SessionStart hook (e.g., validate_config.py) scans memory for absolute paths and checks existence. Reports drift before the session acts on stale data.

See Also