Local-First · Stdlib-Only · Hermes-Validated

Mnemosyne

A Cognitive OS for Local-First AI Agents

A drop-in memory provider for Hermes Agent with a six-tier cognitive memory model, a four-layer identity lock, and offline dream consolidation. Zero runtime dependencies. One pip install away.

0 runtime deps · 19 model providers · 6-tier ICMS memory · 8/8 Hermes checks PASS
The Premise

The problem isn't that agents don't remember. The problem is that memory only flows in one direction.

Every other system retrieves upward from stored text. Mnemosyne adds the return path — higher-level reflection distilled back down into fast, user-specific instinct. The agent's first response is shaped by what it has learned about you, not just base-model priors.

↑ Retrieval (everyone)

Raw turn — "what happened"
Vector store — embed + retrieve
Summary — compress to abstract
Response — context injected




↓ Reflection → Instinct (Mnemosyne)

L5 identity + L4 patterns
Distilled into L0 reflex cache
Fast-path shapes first token
Agent acts like it knows you
Eval-Gated · Reproducible

Benchmarks, not vibes.

Every number ships with the command that produced it and a regression gate that fails any change which drops a metric. To our knowledge, no other Hermes memory provider publishes eval-gated baselines.
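A regression gate of this kind can be sketched in a few lines. This is an illustration only: the function name `check_regression`, the metric names, the tolerance parameter, and every value below are hypothetical and are not taken from the project's own gate script.

```python
def check_regression(baseline, current, tolerance=0.0, lower_is_better=()):
    """Return a list of failure messages; an empty list means the gate passes.

    Hypothetical sketch, not the project's check_regression.py.
    """
    failures = []
    for name, base in baseline.items():
        if name not in current:
            failures.append(f"{name}: missing from current run")
            continue
        value = current[name]
        if name in lower_is_better:
            # Latency-style metrics regress when they go up.
            regressed = value > base + tolerance
        else:
            # Quality-style metrics regress when they go down.
            regressed = value < base - tolerance
        if regressed:
            failures.append(f"{name}: {base} -> {value}")
    return failures


# Made-up illustrative values, not measured results.
baseline = {"recall_at_5": 0.80, "search_p50_ms": 5.0}
current = {"recall_at_5": 0.78, "search_p50_ms": 4.0}
failures = check_regression(baseline, current, lower_is_better=("search_p50_ms",))
```

In CI, a non-empty `failures` list would typically be printed and turned into a non-zero exit code so the change cannot merge.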

8/8
Hermes runtime checks PASS
Live Hermes v0.16.0 — discovery, routing, persistence, fresh-session recall.
0.0000
Retrieval recall@5
Deterministic probe set — recall@5 / MRR / hit@1 by category.
0.0000
LOCOMO retrieval-only
Across 1,986 questions · snap-research/locomo runner.
0.00
Continuity score
1.00 cross-session · 50 scenarios, 6 categories (v0.7.1).
0.21ms
Per-memory write
INSERT + FTS5 sync · 10,000 writes in 2.13 s.
0.00ms
Search p50 · 10K corpus
FTS5 BM25, 2-token query · p95 18.4 ms.
0.00ms
Wrapper overhead
Just 0.24% at realistic model latency, so observability is effectively free.
0/6
Identity slips rewritten
4-layer lock vs. a 40-prompt jailbreak suite.

Reproduce any of these yourself: python3 tests/test_all.py · bash test-harness.sh · mnemosyne-continuity run. Throughput measured single-thread on a Linux sandbox; your hardware will vary. Full methodology in docs/BENCHMARKS.md.
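For readers who want the retrieval metrics pinned down, here is a minimal sketch of recall@k, MRR, and hit@1 using their standard definitions. The function names and the toy queries are assumptions made for illustration; the project's probe-set runner may compute or aggregate them differently.

```python
def retrieval_metrics(ranked_ids, relevant_ids, k=5):
    """Compute recall@k, reciprocal rank, and hit@1 for a single query."""
    relevant = set(relevant_ids)
    top_k = set(ranked_ids[:k])
    recall_at_k = len(relevant & top_k) / len(relevant) if relevant else 0.0
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            reciprocal_rank = 1.0 / rank  # rank of the first relevant result
            break
    hit_at_1 = 1.0 if ranked_ids and ranked_ids[0] in relevant else 0.0
    return {"recall@k": recall_at_k, "rr": reciprocal_rank, "hit@1": hit_at_1}


def mean_metrics(per_query):
    """Average per-query metrics; the mean of reciprocal ranks is MRR."""
    n = len(per_query)
    return {key: sum(m[key] for m in per_query) / n for key in per_query[0]}


# Toy queries with invented document ids.
q1 = retrieval_metrics(["a", "b", "c"], ["b"])
q2 = retrieval_metrics(["x", "y"], ["x", "z"])
summary = mean_metrics([q1, q2])
```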

The Hermes Plugin

A drop-in memory provider for Hermes Agent.

Plug the full six-tier ICMS into any Hermes Agent (Nous Research) as its persistent backend. One SQLite file. No API keys, no vector DB, no cloud. Demonstrated, not aspirational — validated end-to-end on a live runtime.

hermes · v0.16.0 · runtime validation

Three tools. Zero ceremony.

Turns persist automatically through a single-writer SQLite queue — the agent loop is never blocked. Session-end hooks run salient extraction, writing facts, preferences, and goals up-tier.
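The single-writer pattern described above can be sketched with the standard library alone. This is one possible shape, not the plugin's actual code: the class name `SingleWriter`, the two-column `memories` schema, and the sentinel-based shutdown are all assumptions.

```python
import os
import queue
import sqlite3
import tempfile
import threading


class SingleWriter:
    """One background thread owns the SQLite connection; callers only enqueue."""

    def __init__(self, path):
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(path,), daemon=True)
        self._thread.start()

    def _run(self, path):
        conn = sqlite3.connect(path)  # created and used only on the writer thread
        conn.execute("CREATE TABLE IF NOT EXISTS memories (content TEXT, tier INTEGER)")
        conn.commit()
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop after draining earlier items
                break
            conn.execute("INSERT INTO memories (content, tier) VALUES (?, ?)", item)
            conn.commit()
        conn.close()

    def write(self, content, tier=2):
        """Enqueue and return immediately; the caller never waits on disk I/O."""
        self._queue.put((content, tier))

    def close(self):
        """Flush pending writes and stop the writer thread."""
        self._queue.put(None)
        self._thread.join()


# Demo against a throwaway database file.
db_path = os.path.join(tempfile.mkdtemp(), "memory.db")
writer = SingleWriter(db_path)
writer.write("user prefers concise answers")
writer.write("project deadline is Friday", tier=3)
writer.close()

reader = sqlite3.connect(db_path)
rows = reader.execute("SELECT content, tier FROM memories ORDER BY rowid").fetchall()
reader.close()
```

Because only one thread ever touches the write connection, concurrent callers cannot hit SQLite lock contention with each other; they pay only the cost of a queue put.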

memory_search(query, limit): FTS5 BM25 + strength-weighted retrieval across all tiers
memory_write(content, kind, tier): store fact / preference / goal / pattern at tier 2–5
memory_stats(): direct SQLite per-tier row counts
✓ Local SQLite core ✓ ACT-R decay ✓ Hebbian strength ✓ Offline dreams
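To make the three tools concrete, here is a rough sketch of how they could sit on top of SQLite FTS5. Everything beyond the tool names is assumed: the schema, the extra `conn` and `strength` parameters, and in particular the scoring rule (negated BM25 multiplied by strength) are illustrative guesses, not the plugin's documented behavior. It requires a Python build whose SQLite includes FTS5.

```python
import sqlite3

# Hypothetical schema: the project's real tables and columns may differ.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    kind TEXT NOT NULL,
    tier INTEGER NOT NULL,
    strength REAL NOT NULL DEFAULT 1.0
);
CREATE VIRTUAL TABLE memories_fts USING fts5(content);
"""


def memory_write(conn, content, kind="fact", tier=2, strength=1.0):
    """Insert a row and mirror its text into the FTS5 index under the same rowid."""
    cur = conn.execute(
        "INSERT INTO memories (content, kind, tier, strength) VALUES (?, ?, ?, ?)",
        (content, kind, tier, strength),
    )
    conn.execute(
        "INSERT INTO memories_fts (rowid, content) VALUES (?, ?)",
        (cur.lastrowid, content),
    )
    return cur.lastrowid


def memory_search(conn, query, limit=5):
    """Rank matches by BM25 scaled by strength (an assumed weighting).

    FTS5's bm25() is lower-is-better, so it is negated before scaling.
    """
    sql = """
        SELECT m.id, m.content, m.tier, -bm25(memories_fts) * m.strength AS score
        FROM memories_fts
        JOIN memories AS m ON m.id = memories_fts.rowid
        WHERE memories_fts MATCH ?
        ORDER BY score DESC
        LIMIT ?
    """
    return conn.execute(sql, (query, limit)).fetchall()


def memory_stats(conn):
    """Per-tier row counts straight from SQLite."""
    return dict(conn.execute("SELECT tier, COUNT(*) FROM memories GROUP BY tier"))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
weak = memory_write(conn, "user prefers dark mode", kind="preference", strength=1.0)
strong = memory_write(conn, "user prefers dark mode", kind="preference", tier=3, strength=3.0)
for text in ("deadline is friday", "likes short replies", "works in python"):
    memory_write(conn, text)

results = memory_search(conn, "dark mode")
stats = memory_stats(conn)
```

The two matching rows have identical text, so BM25 alone cannot separate them; the stronger memory ranks first only because of the strength multiplier. Note that the query string is passed to FTS5 as-is, so user input containing FTS5 operators or punctuation would need quoting in a real tool.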
ICMS · Six-Tier Cognitive Memory

Not a flat fact store. A mind with tiers.

Memories carry tier semantics (working context vs. long-term vs. consolidated patterns vs. human-approved identity), with promotion, decay, and a distilled fast-path reflex cache.

L0 INSTINCT: Distilled fast-path reflex cache, always checked first
L1 HOT: Working context for the current session (selected slice)
L2 WARM: Short-term; the default for new writes, feeds dream consolidation
L3 COLD: Long-term; demoted from L2, TF-IDF clustering target
L4 PATTERN: Consolidated patterns promoted by the compactor, plus the user_instinct overlay
L5 IDENTITY: Human-approved core values; the 4-layer lock the agent never breaks
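The tier ladder and its decay rule can be sketched as follows. The tier names and numbers come from the list above, and the activation formula is the standard ACT-R base-level learning equation, B = ln(Σ t_j^(-d)). The demotion threshold, the default decay of 0.5, and the `should_demote` helper are assumptions for illustration; the project's per-kind decay parameters are not shown here.

```python
import math
from enum import IntEnum


class Tier(IntEnum):
    INSTINCT = 0
    HOT = 1
    WARM = 2
    COLD = 3
    PATTERN = 4
    IDENTITY = 5


def base_level_activation(ages, decay=0.5):
    """ACT-R base-level activation: ln(sum(t ** -decay)).

    `ages` holds the positive time since each past access, in any consistent unit.
    """
    return math.log(sum(age ** -decay for age in ages))


def should_demote(tier, ages, threshold=-3.0):
    """Hypothetical rule: a WARM memory drops to COLD once activation falls too low."""
    return tier == Tier.WARM and base_level_activation(ages) < threshold


recent = base_level_activation([1.0, 4.0])   # two recent accesses
stale = base_level_activation([10000.0])     # one access long ago
```

Frequent or recent access keeps activation high, while a single old access lets it sink, which is the behavior a decay-driven demotion from L2 to L3 would rely on.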
[Figure: Mnemosyne system architecture: Channels to Brain to Tool Executor, six-tier ICMS, inner dialogue, dream consolidation, and the Meta-Harness self-improvement loop]
Channels → Brain (context assembly + identity lock) → Tool Executor + 19-provider backend · Inner Dialogue (Planner / Critic / Doer / Evaluator) · Dream Consolidation · the closed Meta-Harness loop.
Observability

An avatar that visualizes its own mind.

[Figure: Mnemosyne live dashboard: avatar, chat, event stream, goals, and a memory browser across all tiers]
  • 29 derived traits, all observable: Every visual property of the SVG avatar maps to one number; there is no opaque personality engine.
  • Memory browser across all tiers: FTS5 search over L1–L5, live per-tier row counts, goals that persist across sessions.
  • Real-time event stream: memory_write, tool_call, persona_call; every action is logged and inspectable.
  • Survivable data: sqlite3 memory.db "SELECT content FROM memories" works with the framework gone.
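The same survivability check works from Python with nothing but the standard library. The table and column names are taken from the command above; the helper name `dump_memories` is made up for this sketch.

```python
import sqlite3


def dump_memories(db_path):
    """Read every memory's content without importing the framework."""
    conn = sqlite3.connect(db_path)
    try:
        return [row[0] for row in conn.execute("SELECT content FROM memories")]
    finally:
        conn.close()
```

Calling `dump_memories("memory.db")` returns a plain list of strings that can be exported, grepped, or migrated into another store.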
The avatar's visual contract ↗
Foundations

Standing on real research.

Mnemosyne is engineering, not mysticism — every mechanism traces to a paper. The honest split of shipped vs. experimental vs. research lives in ROADMAP.md.

Preprint · Mnemosyne

A Cognitive OS for Local-First Agents

The architecture writeup: ICMS tiers, the Reflection → Instinct loop, the Meta-Harness, and the eval contract behind every number on this page.

Read the docs
Benchmark · 2024

LOCOMO — Very Long-Term Conversational Memory

Maharana et al. The 10-conversation, ~1,986-question benchmark Mnemosyne's retrieval runner targets, with recorded baselines.

arXiv:2402.17753
Cognitive Science

ACT-R: Adaptive Control of Thought-Rational

Anderson et al. The activation-decay model behind Mnemosyne's per-kind memory decay and Hebbian strength reinforcement.

act-r.psy.cmu.edu
Interpretability · Roadmap

Neural Geometry & Concept Manifolds

Goodfire's manifold work guides the eval-gated roadmap for L4 pattern memory — preserving relational structure, not just flat abstracts.

goodfire.ai/research
Runtime · Nous Research

Hermes Agent

The agent framework Mnemosyne validates against as a drop-in MemoryProvider — discovery, tool routing, and the plugin manifest.

hermes-agent.nousresearch.com
Reproduce

The Eval Harness

Retrieval probe set, the full LOCOMO runner with LM Studio + Mem0 adapters, and the check_regression.py gate — in the lab repo.

atxgreene/mnemosyne-lab
From the Field

Written up on 𝕏.

Mnemosyne
@atxgreene
"I built a brain for my AI agent — a six-tier memory that flows both directions, runs entirely local, and survives as plain SQLite. Here's what I learned."
Quickstart

Ten lines to a first conversation.

bash · ~/projects/mnemosyne
# zero runtime deps: no dependencies are pulled from PyPI
$ pip install mnemosyne-harness
$ mnemosyne-serve &                  # daemon + dashboard
$ open http://127.0.0.1:8484/ui      # avatar evolves in real time

# or drop it into Hermes as a memory provider:
$ hermes plugins enable mnemosyne
$ hermes plugins list
→ enabled user 0.1.0 mnemosyne
Verify: python3 tests/test_all.py
E2E: bash test-harness.sh
Demo: ./demo.sh