Memory System¶

Morphix uses a three-layer self-healing memory system with FAISS-based semantic search, LLM-driven quality critique, and a background daemon for continuous maintenance.

Architecture Overview¶

MemoryManager (singleton)
    ├── FAISS IndexFlatL2 (1024-dim vectors)
    ├── EmbeddingProvider (SentenceTransformers)
    ├── Per-workspace isolation (memory/{workspace}/)
    └── autoDream daemon (self_healing_check)

MemoryManager¶

File: core/memory/manager.py

Singleton that manages all memory operations — store, search, update, delete — with quality validation and automatic correction.

Core API¶

# Write (with quality validation)
await memory.write("key", value, validated=False, content_hint="analytical")

# Read by key
value = memory.read("key")

# Semantic search (FAISS)
results = memory.search("query", k=5, min_similarity=0.0)
# Returns: [{"key": "...", "value": ..., "distance": 0.34, "similarity": 0.7463}, ...]

# Write to system-global space (not workspace-scoped)
await memory.write_system("key", value)

# User profile management
profile = memory.get_user_profile()
await memory.update_user_profile({"name": "Alice", "country": "ES"})

# Correction tracking
await memory.save_user_correction(original_task, correction)
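
The sample search result above pairs a distance of 0.34 with a similarity of 0.7463, which is what 1 / (1 + distance) produces. The sketch below assumes that mapping. The helper names `distance_to_similarity` and `filter_by_similarity` are hypothetical, and the actual conversion in core/memory/manager.py may differ.

```python
def distance_to_similarity(distance: float) -> float:
    """Map a non-negative distance to a score in (0, 1] (assumed formula)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + distance)


def filter_by_similarity(results: list, min_similarity: float = 0.0) -> list:
    """Keep only hits whose similarity meets the cutoff (hypothetical helper)."""
    return [r for r in results if r["similarity"] >= min_similarity]
```

With the default `min_similarity=0.0`, every hit passes the filter, so only `k` limits the result count.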

Workspace isolation¶

Memory is isolated per workspace. Each workspace has its own subdirectory under memory/:

memory/
├── main/           # Default workspace
│   ├── user_profile.md
│   ├── last_creative_output.md
│   └── ...
├── system/         # Global (cross-workspace)
│   └── ...
└── myworkspace/    # Custom workspace
    └── ...

Switching workspaces:

await memory.switch_workspace("myworkspace")

On switch, the FAISS index and document list are atomically swapped. Embedding computation runs in a thread pool to avoid blocking the event loop.
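
A minimal sketch of the switch pattern described above, assuming an asyncio application. `WorkspaceState` and `toy_embed` are hypothetical stand-ins and not the MemoryManager implementation: the slow embedding step runs in a worker thread, and the references are swapped together under a lock.

```python
import asyncio
import threading


class WorkspaceState:
    """Holds the active workspace's documents and vectors (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.name = "main"
        self.documents = []
        self.vectors = []

    async def switch(self, name, documents, embed) -> None:
        # Slow step: embed in a worker thread, touching no shared state.
        vectors = await asyncio.to_thread(lambda: [embed(d) for d in documents])
        # Fast step: swap all references together so readers never see
        # documents from one workspace paired with vectors from another.
        with self._lock:
            self.name, self.documents, self.vectors = name, list(documents), vectors


def toy_embed(text: str) -> list:
    """Stand-in for the real embedding model."""
    return [float(len(text)), float(text.count(" "))]
```

Because the new vectors are built before the lock is taken, the lock is held only for the assignment itself.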

Quality validation pipeline¶

Every write() that is not pre-validated goes through LLM critique:

async def write(self, key, value, validated=False, content_hint=None) -> bool:
    if not validated:
        critique = await self._llm_critique(key, value, content_hint)
        score = critique["quality_score"]  # 0-100

        # threshold is selected from key and content_hint (see the table below)
        if score < threshold:
            return False  # Rejected

        if critique.get("suggested_fix"):
            value = critique["suggested_fix"]  # Auto-corrected

    # Compute embedding → persist to disk → add to FAISS index
    ...
    return True  # Accepted and stored

Quality thresholds vary by content type:

Content type	Threshold
`user_profile_last_update`	15
`workflow_subtask_*`	20
`creative` hint	30
`analytical` hint	50
Default	40
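
One way the table above could be turned into a lookup is sketched here. The precedence (exact key, then key prefix, then content hint, then default) is an assumption, and `threshold_for` is a hypothetical name that does not come from the source.

```python
from typing import Optional

DEFAULT_THRESHOLD = 40
KEY_THRESHOLDS = {"user_profile_last_update": 15}
PREFIX_THRESHOLDS = {"workflow_subtask_": 20}
HINT_THRESHOLDS = {"creative": 30, "analytical": 50}


def threshold_for(key: str, content_hint: Optional[str] = None) -> int:
    """Resolve the minimum quality score for a write (assumed precedence)."""
    if key in KEY_THRESHOLDS:
        return KEY_THRESHOLDS[key]
    for prefix, value in PREFIX_THRESHOLDS.items():
        if key.startswith(prefix):
            return value
    # Unknown or missing hints fall back to the default threshold.
    return HINT_THRESHOLDS.get(content_hint, DEFAULT_THRESHOLD)
```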

Write rollback¶

Writes use a transactional pattern: if the file write or index update fails after removing the old entry, the old entry is restored (both in the index and on disk).
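
The rollback behaviour can be illustrated with an in-memory stand-in. `RollbackStore`, its `persist` callback, and `failing_persist` are hypothetical; the real code restores both the file on disk and the FAISS entry.

```python
class RollbackStore:
    """In-memory stand-in for the on-disk file plus index entry (illustrative)."""

    def __init__(self):
        self.entries = {}

    def write(self, key: str, value: str, persist) -> bool:
        missing = object()
        old = self.entries.pop(key, missing)   # remove the old entry first
        try:
            persist(key, value)                # may raise (disk or index failure)
            self.entries[key] = value
            return True
        except Exception:
            if old is not missing:
                self.entries[key] = old        # restore the previous entry
            return False


def failing_persist(key: str, value: str) -> None:
    """Simulates a failed disk write for demonstration purposes."""
    raise OSError("disk full")
```

After a failed write, the store holds the same value it held before the call, so a reader never finds the key missing because of a partial update.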

Protected keys¶

Certain keys are protected from automatic modification/deletion:

_PROTECTED_EXACT = {
    "kairos_daemon_heartbeat",
    "user_profile",
    "user_profile_last_update",
    "security_private",
    "last_creative_output",
    "last_analysis",
    "last_plan",
    "last_connection",
    "last_successful_code",
}
_PROTECTED_PREFIXES = ("workflow_subtask_", "last_", "merged_")
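
A check against these two collections might look like the sketch below. `is_protected` is a hypothetical helper name; the sets are copied from the listing above. Note that the "last_" prefix already covers the exact `last_*` entries.

```python
_PROTECTED_EXACT = {
    "kairos_daemon_heartbeat", "user_profile", "user_profile_last_update",
    "security_private", "last_creative_output", "last_analysis",
    "last_plan", "last_connection", "last_successful_code",
}
_PROTECTED_PREFIXES = ("workflow_subtask_", "last_", "merged_")


def is_protected(key: str) -> bool:
    """True if the daemon should never auto-modify or delete this key."""
    # str.startswith accepts a tuple and matches any of its prefixes.
    return key in _PROTECTED_EXACT or key.startswith(_PROTECTED_PREFIXES)
```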

FAISS Indexer¶

File: core/faiss_indexer.py

Reusable FAISS indexer with save/load support.

FAISS_DIMENSION = 1024  # Matches multilingual-e5-large output dimension

class FAISSIndexer:
    def __init__(self, dimension=1024, embedder=None): ...
    def add(self, key: str, value: object) -> None: ...
    def search(self, query: str, k: int = 5) -> list[dict]: ...
    def remove(self, key: str) -> None: ...
    def rebuild_index(self) -> None: ...
    def clear(self) -> None: ...
    def save(self, directory: Path) -> None: ...              # Persist to faiss.index + documents.pkl
    def load(self, directory: Path) -> "FAISSIndexer": ...    # Restore from disk

Embedding computation delegates to EmbeddingProvider. The index uses IndexFlatL2 (exact L2 search — no approximation, suitable for small-to-medium document counts).
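
To show what "exact" means here, the following pure-Python sketch compares the query against every stored vector, which is the brute-force behaviour a flat index provides. It is a stand-in and does not call FAISS; `flat_l2_search` is a hypothetical name. Like FAISS's IndexFlatL2, it reports squared L2 distances.

```python
def flat_l2_search(vectors: list, query: list, k: int = 5) -> list:
    """Return (squared_distance, row) pairs for the k nearest stored vectors."""
    scored = [
        (sum((a - b) ** 2 for a, b in zip(vec, query)), row)
        for row, vec in enumerate(vectors)
    ]
    # Sorting every candidate is O(n log n); acceptable for small collections.
    return sorted(scored)[:k]
```

The cost grows linearly with the number of stored vectors, which is why this approach suits small-to-medium document counts.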

Embedding Provider¶

File: core/embedding_provider.py

Lazy-loading embedding provider using SentenceTransformers.

class EmbeddingProvider:
    _model_name = "intfloat/multilingual-e5-large"  # 1024-dim

    @classmethod
    def get_instance(cls): ...    # Returns model or None if not yet loaded
    @classmethod
    def encode(cls, text): ...     # Encode text (returns None if model not ready)
    @classmethod
    def wait_until_ready(cls, timeout=60) -> bool: ...  # Block until model loaded

The model is loaded in a background daemon thread, so the application starts immediately. Until the model is ready, encode() returns None and memory operations degrade gracefully.
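
The background-loading pattern can be sketched with a thread and an event. `LazyModel` is a hypothetical, instance-based stand-in for the classmethod-based EmbeddingProvider, and the `loader` argument replaces the real SentenceTransformers model load.

```python
import threading


class LazyModel:
    """Loads a model in a daemon thread; callers get None until it is ready."""

    def __init__(self, loader):
        self._model = None
        self._ready = threading.Event()
        thread = threading.Thread(target=self._load, args=(loader,), daemon=True)
        thread.start()

    def _load(self, loader) -> None:
        self._model = loader()      # slow: e.g. reading model weights
        self._ready.set()

    def encode(self, text: str):
        if not self._ready.is_set():
            return None             # not ready yet: the caller decides how to degrade
        return self._model(text)

    def wait_until_ready(self, timeout: float = 60) -> bool:
        # Event.wait returns True once set, or False if the timeout expires.
        return self._ready.wait(timeout)
```

Callers that cannot proceed without embeddings can block on `wait_until_ready()`; everything else can check for `None` and continue.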

Model choice

multilingual-e5-large was chosen for its strong multilingual support (English + Spanish), 1024-dimensional embeddings, and permissive license.

autoDream Daemon — Self-Healing¶

File: core/memory/manager.py:538

The MemoryManager.self_healing_check() method implements an autonomous memory maintenance daemon. It runs periodically when DAEMON_MODE=true.

Configuration¶

Setting	Default	Description
`SELF_HEAL_INTERVAL`	120s	Time between self-healing cycles
`DAEMON_MODE`	`false`	Enables the background daemon task
`DAEMON_MODE=false` (CI)	—	Skips daemon entirely
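
A sketch of reading these settings, under the assumption that both are environment variables and that the interval is a plain number of seconds. `load_daemon_config` is a hypothetical helper, not part of the documented API.

```python
import os


def load_daemon_config(env=None) -> dict:
    """Read daemon settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Only the literal string "true" (case-insensitive) enables the daemon.
        "daemon_mode": env.get("DAEMON_MODE", "false").strip().lower() == "true",
        "self_heal_interval": float(env.get("SELF_HEAL_INTERVAL", "120")),
    }
```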

Self-healing phases¶

The daemon runs four sequential phases:

Phase 1: Quality critique¶

Checks the 20 most recent documents. Any with quality_score < 60 are:

- Auto-corrected if the LLM suggests a fix
- Deleted if no fix can be generated

Protected keys (see above) are skipped.
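
The phase 1 rules can be sketched as a pure function. Everything here is illustrative: `quality_pass` is a hypothetical name, documents are assumed to be dicts with a `timestamp` field that defines recency, and `critique` is a synchronous stand-in for the LLM call.

```python
def quality_pass(documents, critique, is_protected, limit=20, min_score=60):
    """Return the documents that survive one quality pass, in original order."""
    recent = sorted(documents, key=lambda d: d["timestamp"], reverse=True)[:limit]
    recent_keys = {d["key"] for d in recent}
    kept = []
    for doc in documents:
        # Older documents and protected keys are not critiqued at all.
        if doc["key"] not in recent_keys or is_protected(doc["key"]):
            kept.append(doc)
            continue
        result = critique(doc["key"], doc["value"])
        if result["quality_score"] >= min_score:
            kept.append(doc)
        elif result.get("suggested_fix"):
            kept.append({**doc, "value": result["suggested_fix"]})
        # Otherwise no fix is available and the document is dropped.
    return kept
```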

Phase 2: Duplicate detection¶

# core/memory/manager.py:324 — _detect_duplicates

Uses FAISS similarity search to find document pairs with similarity > 0.92. When found:

1. Both documents are sent through LLM critique
2. The higher-quality document is kept
3. The lower-quality document is deleted from disk and index
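
A minimal sketch of the selection logic, assuming pairwise similarity and quality scores are available as callables. `find_duplicates` is a hypothetical name, and keeping the first document on a quality tie is an assumption; the real method queries FAISS and the LLM.

```python
from itertools import combinations


def find_duplicates(keys, similarity, quality, threshold=0.92):
    """Return keys to delete: the lower-quality member of each near-duplicate pair."""
    to_delete = set()
    for a, b in combinations(keys, 2):
        if a in to_delete or b in to_delete:
            continue  # one side is already scheduled for deletion
        if similarity(a, b) > threshold:
            to_delete.add(b if quality(a) >= quality(b) else a)
    return to_delete
```

Comparing all pairs is quadratic in the number of keys; a nearest-neighbour query per document, as the FAISS-based version performs, avoids that cost.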

Phase 3: Contradiction resolution¶

# core/memory/manager.py:392 — _resolve_contradictions

Detects document pairs with similarity 0.65–0.92 (similar enough to be related, different enough to potentially contradict). These are sent to an LLM arbitrator:

async def _arbitrate_contradiction(key_a, val_a, key_b, val_b) -> str | None:
    """Ask LLM: 'Do these contradict? If so, produce a single consolidated fact.'"""

If the LLM detects a contradiction, both originals are deleted and replaced with a merged_ document. If no contradiction is found (LLM returns SKIP), both documents are preserved.
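
How a verdict might be applied is sketched below on a plain dict. The merged key format, treating `None` the same as SKIP, and the inclusive band boundaries are all assumptions; `resolve_pair` and `in_contradiction_band` are hypothetical helpers.

```python
def in_contradiction_band(similarity: float, low: float = 0.65, high: float = 0.92) -> bool:
    """True if a pair is related enough to compare but not a near-duplicate."""
    return low <= similarity <= high


def resolve_pair(store: dict, key_a: str, key_b: str, verdict) -> None:
    """Apply an arbitration verdict to an in-memory store (illustrative)."""
    if verdict is None or verdict.strip() == "SKIP":
        return  # no contradiction: keep both documents
    del store[key_a], store[key_b]
    # The "merged_" prefix makes the consolidated fact a protected key.
    store[f"merged_{key_a}_{key_b}"] = verdict
```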

Phase 4: Stale pruning¶

# core/memory/manager.py:487 — _prune_stale

Removes documents not accessed in 30+ days (skipping protected keys). The _access_log dictionary tracks last access timestamps, updated on every read() and search() call.
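
The pruning rule can be sketched as follows, assuming `_access_log` maps each key to a Unix timestamp. `stale_keys` is a hypothetical helper, and how documents with no log entry are treated is not specified in the source, so the sketch only considers logged keys.

```python
import time

STALE_AFTER_SECONDS = 30 * 24 * 60 * 60  # 30 days


def stale_keys(access_log: dict, is_protected, now=None) -> list:
    """Return unprotected keys whose last access is at least 30 days old."""
    now = time.time() if now is None else now
    return [
        key
        for key, last_access in access_log.items()
        if now - last_access >= STALE_AFTER_SECONDS and not is_protected(key)
    ]
```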

Index rebuild¶

After any phase that modifies the document set, the FAISS index is rebuilt atomically:

1. Document snapshot taken under lock
2. Embeddings pre-computed outside lock (slow operation)
3. New IndexFlatL2 created and populated under lock
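
The three steps above can be sketched with a plain dict standing in for the FAISS index. `RebuildableIndex` is hypothetical; dropping vectors for documents deleted during step 2 is an added assumption, and documents added during step 2 would only be picked up by the next rebuild.

```python
import threading


class RebuildableIndex:
    """Dict-backed stand-in for the FAISS index (illustrative only)."""

    def __init__(self, embed):
        self._lock = threading.Lock()
        self._embed = embed
        self.documents = {}
        self.index = {}

    def rebuild(self) -> None:
        with self._lock:                      # 1. snapshot under lock
            snapshot = dict(self.documents)
        # 2. slow embedding work happens without holding the lock
        vectors = {key: self._embed(value) for key, value in snapshot.items()}
        with self._lock:                      # 3. build and install under lock
            self.index = {k: v for k, v in vectors.items() if k in self.documents}
```

Readers holding the same lock see either the old index or the new one, never a partially populated index.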

Integration with Conversations¶

Memory integrates with the conversation system in two ways:

- Context augmentation: Before sending messages to the LLM, get_user_summary() and get_long_context_summary() extract relevant memory facts and inject them into the prompt context.
- Security logging: Blocked distillation attempts are written to memory key "security_private" via undercover._block_attempt(), creating a persistent audit trail that survives application restarts.

Directory Structure¶

memory/
├── system/                    # Global, cross-workspace
│   └── kairos_daemon_heartbeat.md
├── main/                      # Default workspace
│   ├── user_profile.md
│   ├── last_creative_output.md
│   ├── security_private.md
│   └── ...
├── myproject/                 # Custom workspace
│   └── ...
└── ...