# Entity Memory
Long-term entity-graph memory system that persists knowledge across sessions using SQLite and FTS5.
Supyagent includes a built-in entity-graph memory system that extracts structured knowledge from conversations and persists it across sessions. This gives agents the ability to remember facts about people, projects, preferences, and relationships over time.
## How It Works
The memory system operates in three phases:
- Signal Detection -- During conversation, the system watches for messages that likely contain extractable information (preference statements, decisions, names, long messages, tool calls).
- Extraction -- When enough signal-flagged exchanges accumulate (controlled by extraction_threshold), the LLM extracts entities, relationships, and episode summaries into structured JSON.
- Retrieval -- On each new user message, the system searches the memory database for relevant entities and facts, then injects them into the system prompt as context.
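The three phases can be sketched as a single loop. This is an illustrative reimplementation, not Supyagent's actual API; the class and method names are hypothetical, and the real system backs `retrieve` and `extract` with SQLite/FTS5 and an LLM call respectively.

```python
class MemoryLoop:
    """Hypothetical sketch of the signal -> extract -> retrieve cycle."""

    def __init__(self, extraction_threshold=5):
        self.extraction_threshold = extraction_threshold
        self.pending = []     # signal-flagged exchanges awaiting extraction
        self.extractions = 0  # how many extraction passes have run

    def has_signal(self, message):
        # Phase 1: cheap heuristic (see "Signal Detection" below)
        return len(message) > 500 or "i prefer" in message.lower()

    def extract(self, messages):
        # Phase 2: in the real system this is an LLM call returning JSON
        self.extractions += 1

    def retrieve(self, message):
        # Phase 3: in the real system this searches the memory database
        return []

    def on_user_message(self, message):
        context = self.retrieve(message)
        if self.has_signal(message):
            self.pending.append(message)
        if len(self.pending) >= self.extraction_threshold:
            self.extract(self.pending)
            self.pending.clear()
        return context
```

Note that retrieval runs on every turn, while extraction only fires once enough signal messages have accumulated.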
## Storage Architecture
Memory is stored in a SQLite database with FTS5 (Full-Text Search) at ~/.supyagent/memory/{agent_name}/memory.db. The schema contains:
| Table | Purpose |
|---|---|
| entities | Named entities (people, projects, technologies, etc.) with type, summary, and properties |
| edges | Relationships between entities with confidence scores and temporal validity |
| episodes | Session-level summaries with observations and outcomes |
| entity_types | Ontology of entity types (seeded + LLM-proposed) |
| edge_types | Ontology of relationship types (seeded + LLM-proposed) |
| entities_fts | FTS5 index over entity names and summaries |
| edges_fts | FTS5 index over relationship facts |
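A minimal sketch of what this schema might look like in SQLite. The column names beyond those described in the table above are assumptions, and the real schema is internal to Supyagent; the point is the pairing of regular tables with FTS5 virtual tables for search:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real DB lives under ~/.supyagent/memory/
conn.executescript("""
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL,
    summary TEXT,
    properties TEXT            -- JSON blob of extra attributes
);
CREATE TABLE edges (
    id INTEGER PRIMARY KEY,
    source_id INTEGER REFERENCES entities(id),
    target_id INTEGER REFERENCES entities(id),
    type TEXT NOT NULL,
    fact TEXT,
    confidence REAL,
    valid_from TEXT,
    valid_until TEXT           -- NULL while the fact still holds
);
-- FTS5 index over entity names and summaries
CREATE VIRTUAL TABLE entities_fts USING fts5(name, summary);
""")

conn.execute("INSERT INTO entities (name, type, summary) VALUES (?, ?, ?)",
             ("Alice", "Person", "Prefers TypeScript over JavaScript"))
conn.execute("INSERT INTO entities_fts (name, summary) "
             "SELECT name, summary FROM entities")

# Full-text search is case-insensitive under FTS5's default tokenizer
rows = conn.execute(
    "SELECT name FROM entities_fts WHERE entities_fts MATCH 'typescript'"
).fetchall()
```

The FTS5 tables let retrieval match on words anywhere in a name or summary, rather than requiring exact string equality.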
## Seed Ontology
The system comes pre-seeded with common entity types and relationship types:
Entity Types: Person, Organization, Project, Technology, Concept, Location, Artifact, Event
Relationship Types: prefers, works_on, depends_on, created, relates_to, occurred_at, caused_by, resolved_by, knows, part_of
The LLM can propose new types during extraction. These are tracked with a source field (seed vs llm_proposed) and a usage_count for monitoring.
## Configuration
Enable and configure memory in your agent YAML under the memory key:
```yaml
name: myagent
model:
  provider: anthropic/claude-sonnet-4-5-20250929
memory:
  enabled: true              # Enable entity-graph memory
  extraction_threshold: 5    # Extract after N signal-flagged exchanges
  retrieval_limit: 10        # Max memories injected per turn
  auto_extract: true         # Automatically extract from conversation
system_prompt: |
  You are a helpful assistant with long-term memory.
```

### Configuration Fields
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable entity-graph memory across sessions |
| extraction_threshold | int | 5 | Number of signal-flagged exchanges before triggering extraction |
| retrieval_limit | int | 10 | Maximum number of memories to inject into context per turn |
| auto_extract | bool | true | Automatically extract memories from conversation |
## Signal Detection
Not every message warrants memory extraction. The system uses a lightweight heuristic to detect "signal" messages that likely contain extractable information. A message is flagged if it:
- Contains keywords like "I prefer", "I use", "remember", "we decided", "my name is", "important", "always", "never"
- Is longer than 500 characters (long messages often contain rich context)
- Contains tool calls (which suggest actionable context)
When enough signal messages accumulate (reaching the extraction_threshold), extraction is triggered.
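The heuristic above amounts to a few cheap checks. Here is a rough sketch under the stated rules; the function name and exact keyword list are illustrative, not Supyagent's internals:

```python
# Keyword triggers described above; the real list may differ
SIGNAL_KEYWORDS = ("i prefer", "i use", "remember", "we decided",
                   "my name is", "important", "always", "never")

def is_signal(message: str, has_tool_calls: bool = False) -> bool:
    """Flag a message as likely containing extractable information."""
    text = message.lower()
    return (has_tool_calls                 # tool calls imply actionable context
            or len(message) > 500          # long messages are information-rich
            or any(kw in text for kw in SIGNAL_KEYWORDS))
```

Because this runs on every message, it deliberately avoids any LLM call; only the later extraction step pays for model inference.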
## Memory Injection
On each turn, the memory system searches for entities and facts relevant to the user's current message. The results are formatted and injected into the system prompt:
```text
[Agent Memory]
Known entities: Alice (Person), Project Alpha (Project), React (Technology)
Key facts:
- Alice prefers TypeScript over JavaScript
- Alice works on Project Alpha
- Project Alpha depends on React
Recent episodes:
- Discussed migration plan for Project Alpha from JavaScript to TypeScript
```

This context appears before the conversation messages, giving the agent access to accumulated knowledge without consuming the full conversation history.
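Producing that block is straightforward string assembly. A hypothetical formatter, assuming retrieval has already returned entities as (name, type) pairs and facts/episodes as plain strings:

```python
def format_memory_block(entities, facts, episodes):
    """Render retrieved memories in the [Agent Memory] layout shown above.

    entities: list of (name, type) tuples
    facts, episodes: lists of strings
    """
    lines = ["[Agent Memory]"]
    lines.append("Known entities: "
                 + ", ".join(f"{name} ({etype})" for name, etype in entities))
    lines.append("Key facts:")
    lines += [f"- {fact}" for fact in facts]
    lines.append("Recent episodes:")
    lines += [f"- {episode}" for episode in episodes]
    return "\n".join(lines)
```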
## Extraction Pipeline
When extraction triggers, the system:
1. Formats pending messages as a conversation transcript
2. Retrieves the current ontology (known types) for context
3. Sends an extraction prompt to the LLM requesting structured JSON
4. Parses the response into entities, relationships, and an episode summary
5. Resolves entities against existing records (deduplication by name)
6. Stores new/updated entities, edges, and episodes in SQLite
7. Updates FTS5 indexes for future search
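Step 4 hinges on the LLM returning parseable JSON. A hypothetical example of the shape such a response might take (the exact schema and field names are internal to Supyagent and assumed here):

```python
import json

# Example of a structured extraction response; field names are illustrative
raw = """{
  "entities": [
    {"name": "Alice", "type": "Person",
     "summary": "Engineer who prefers TypeScript"}
  ],
  "relationships": [
    {"source": "Alice", "target": "Project Alpha", "type": "works_on",
     "fact": "Alice works on Project Alpha", "confidence": 0.9}
  ],
  "episode": {
    "summary": "Discussed migration plan for Project Alpha",
    "outcome": "Decided to migrate from JavaScript to TypeScript"
  }
}"""

extracted = json.loads(raw)
```

Each relationship carries a confidence score and uses types from the ontology (here, the seeded works_on), though the LLM may also propose new types.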
## Entity Resolution
When the LLM extracts an entity named "Alice", the system first checks if an entity with that name already exists (case-insensitive). If found, the existing entity is updated with new information. If not found, a new entity is created. This prevents duplicate entries across sessions.
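The resolution rule can be sketched as a case-insensitive upsert. This is a minimal illustration against a simplified schema, not Supyagent's actual implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, summary TEXT)"
)

def resolve_entity(conn, name, summary):
    """Return the id of an existing entity with this name (any casing),
    updating its summary; otherwise insert a new row."""
    row = conn.execute(
        "SELECT id FROM entities WHERE lower(name) = lower(?)", (name,)
    ).fetchone()
    if row is not None:
        conn.execute("UPDATE entities SET summary = ? WHERE id = ?",
                     (summary, row[0]))
        return row[0]
    cur = conn.execute("INSERT INTO entities (name, summary) VALUES (?, ?)",
                       (name, summary))
    return cur.lastrowid
```

Resolving "Alice" and later "alice" thus touches the same row instead of creating a duplicate.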
## Temporal Validity
Relationships (edges) support temporal validity through valid_from and valid_until fields. When a fact becomes outdated, the edge is invalidated by setting valid_until rather than being deleted. This preserves the history of knowledge changes.
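Invalidation is then a single UPDATE rather than a DELETE. A sketch against a simplified edges table (the real schema has more columns; the timestamp format is an assumption):

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE edges (
    id INTEGER PRIMARY KEY,
    fact TEXT,
    valid_from TEXT,
    valid_until TEXT)""")
conn.execute("INSERT INTO edges (fact, valid_from) VALUES (?, ?)",
             ("Alice prefers JavaScript", "2024-01-01T00:00:00+00:00"))

def invalidate_edge(conn, edge_id):
    """Mark a fact as outdated instead of deleting it."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("UPDATE edges SET valid_until = ? WHERE id = ?",
                 (now, edge_id))

invalidate_edge(conn, 1)
```

Queries for current knowledge would filter on `valid_until IS NULL`, while the full row history remains available for auditing how facts changed.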
## Daemon Agents and Memory
Memory is particularly useful for daemon agents that process events over time. A daemon can accumulate knowledge about recurring patterns, user preferences, and project states:
```yaml
name: inbox-monitor
type: daemon
model:
  provider: anthropic/claude-sonnet-4-5-20250929
memory:
  enabled: true
  extraction_threshold: 3   # Extract more frequently for event processing
schedule:
  interval: 5m
  max_events_per_cycle: 10
system_prompt: |
  You monitor inbox events and take appropriate action.
  Use your memory to track patterns and important context.
```

## Viewing Memory Stats
You can inspect the memory system's state via the ontology stats:
```python
from supyagent.core.memory import MemoryManager
from supyagent.core.llm import LLMClient

llm = LLMClient("anthropic/claude-sonnet-4-5-20250929")
memory = MemoryManager("myagent", llm)

stats = memory.get_ontology_stats()
print(f"Entities: {stats['entity_count']}")
print(f"Active edges: {stats['edge_count']}")
print(f"Episodes: {stats['episode_count']}")
```

## Related
- Context Management -- How summaries and memory work together
- Configuration -- Global and per-agent configuration layers
- Telemetry -- Tracking memory extraction calls and costs