Memory System
FERAL’s memory is a four-tier architecture stored in a single SQLite database (~/.feral/memory.db). Each tier serves a different retention and retrieval pattern. On top of the tiers sit hybrid search, diversity reranking, session compaction, wiki compilation, and P2P sync.
Four Memory Tiers
Working Memory
In-RAM context for the current session. Holds the conversation history, tool results, and scratch state. Cleared when the session ends.Episodic Memory
Auto-generated summaries of past conversations. Each episode captures the key facts, decisions, and outcomes from a session.Semantic Memory / Knowledge Graph
Persistent facts stored as subject-predicate-object triples. Extracted automatically from conversations or added explicitly via “remember X” commands.Execution Log
An append-only log of every tool invocation, including arguments, results, latency, and success/failure status.Hybrid Search
Memory retrieval combines SQLite FTS5 (keyword) and vector similarity (semantic) to get the best of both worlds.- FTS5 returns top-N by BM25 score, normalized to
[0, 1]. - Vector search returns top-N by cosine similarity, already in
[0, 1]. - Scores are combined:
final = alpha * vector_score + (1 - alpha) * fts_score. - Results are merged and deduplicated by ID.
all-MiniLM-L6-v2 (384 dimensions) by default, computed locally via sentence-transformers. For larger deployments, swap in OpenAI text-embedding-3-small via config.
MMR Diversity Reranking
After hybrid search, Maximal Marginal Relevance reranks results to reduce redundancy. Without MMR, the top-5 results might all describe the same event from different angles.lambda * relevance - (1 - lambda) * max_similarity_to_already_selected.
Session Compaction
When working memory exceeds its token budget, the compactor summarizes older messages into an episode and evicts them from the active context.- Select messages beyond the budget.
- Prompt the LLM to summarize them into a structured episode.
- Insert the episode into
episodestable with embedding. - Replace the compacted messages with a system note:
[Session compacted — N messages summarized].
Wiki Compilation
The Memory Wiki compiles episodes, notes, and knowledge graph entries into durable, human-readable wiki pages organized by topic.~/.feral/wiki/ as Markdown files with YAML frontmatter tracking provenance:
P2P Sync
For multi-device setups (laptop + phone + home server), FERAL supports peer-to-peer memory synchronization over the/sync WebSocket endpoint.
API Reference
| Endpoint | Method | Description |
|---|---|---|
/api/memory/search | POST | Hybrid search across all tiers |
/api/memory/remember | POST | Store a fact in the knowledge graph |
/api/memory/episodes | GET | List recent episodes |
/api/memory/wiki | GET | List wiki pages |
/api/memory/wiki/{topic} | GET | Read a wiki page |
/api/memory/stats | GET | Memory size, tier counts, index health |
/sync | WebSocket | P2P memory sync between nodes |
