Found a ripper memory layer for AI agents — Mnemosyne (AxDSan/mnemosyne, 959⭐). Zero deps, single SQLite file, sub-millisecond recall. Built for Hermes first but works everywhere: Cursor, Claude Code, Codex, OpenWebUI, OpenClaw, any MCP client. One pip install and you're off.
Architecture is the standout: BEAM (Bilevel Episodic-Associative Memory) with three tiers. Working Memory holds hot context that gets auto-injected before LLM calls, Episodic Memory handles long-term storage with sqlite-vec + FTS5 hybrid search (scored 50% vector, 30% keyword, 20% importance), and a TripleStore keeps temporal knowledge graphs with version chains. The clever bit: they binarise 384-dim float32 embeddings (1,536 bytes each) down to 384 bits, i.e. 48 bytes, via MIB (Information-Theoretic Binarisation). That's 32x compression, with Hamming distance computed entirely inside SQLite. No external vector DB, no ANN indices, just one file.
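To make the binarise-then-Hamming idea concrete, here's a rough sketch using only the Python stdlib. It is not Mnemosyne's actual code: it assumes plain sign binarisation (one bit per dimension), registers a Python function as the SQLite `hamming` UDF rather than whatever the project does internally, and the `memories` table, `nearest` helper and `hybrid_score` helper are all made-up names. Only the 384 dims, the 48 bytes and the 50/30/20 weights come from the description above.

```python
import sqlite3

DIMS = 384  # embedding width quoted in the post


def binarise(vec):
    """Pack one sign bit per dimension: 384 floats -> 48 bytes.
    Assumption: plain sign thresholding, not the project's MIB scheme."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits.to_bytes((len(vec) + 7) // 8, "big")


def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")


def hybrid_score(dist, keyword, importance, dims=DIMS):
    """Blend three signals with the 50/30/20 weights from the post.
    Assumption: keyword and importance are already normalised to [0, 1]."""
    vector_sim = 1.0 - dist / dims
    return 0.5 * vector_sim + 0.3 * keyword + 0.2 * importance


def nearest(conn, query_vec, k=5):
    """Rank rows by Hamming distance, evaluated inside the SQL query."""
    return conn.execute(
        "SELECT text, hamming(code, ?) AS dist FROM memories "
        "ORDER BY dist LIMIT ?",
        (binarise(query_vec), k),
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.create_function("hamming", 2, hamming)  # hypothetical UDF name
    conn.execute("CREATE TABLE memories (text TEXT, code BLOB)")
    rows = [
        ("likes flat whites", [1.0] * DIMS),
        ("works in Rust", [1.0] * 192 + [-1.0] * 192),
    ]
    conn.executemany(
        "INSERT INTO memories VALUES (?, ?)",
        [(text, binarise(vec)) for text, vec in rows],
    )
    # Query differs from the first row in exactly 10 sign bits.
    query = [-1.0] * 10 + [1.0] * (DIMS - 10)
    return nearest(conn, query)
```

Calling `demo()` ranks the first row ahead of the second because only 10 of its 384 bits differ from the query. A brute-force scan like this is the trade-off the post is describing: no index to build or maintain, just cheap XOR-and-popcount over small blobs.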
Benchmarks back it up: 98.9% Recall@All@5 on LongMemEval (ICLR 2025), 65.2% on BEAM end-to-end QA at 100K scale (beating Honcho, Hindsight, LIGHT, RAG). Recall holds flat at 20% even at 10M items with 35ms latency and 7.2MB storage. 100% abstention accuracy — it just says "dunno" instead of hallucinating.
For Hermes users there's a dedicated plugin (mnemosyne-hermes) exposing 23 tools across core memory (remember, recall, sleep, stats), knowledge graph (triple_add, triple_query, graph_query, graph_link), multi-agent shared memory, working notes scratchpad, and ops (export/import/diagnose). Three lifecycle hooks (pre_llm_call, on_session_start, post_tool_call) inject context automatically. Install: pip install mnemosyne-hermes && hermes config set memory.provider mnemosyne && hermes memory setup.
Also runs as a standalone MCP server (mnemosyne mcp) for any MCP-compatible client. The embedding endpoint is configurable (anything OpenAI-compatible): it defaults to bge-small-en-v1.5, but you can swap in a multilingual model for non-English text. MIT licensed, active development, Discord community. If you're building agents that need to actually remember stuff across sessions without wrestling with Postgres/Qdrant/Docker stacks, this is the one.
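For anyone unsure what "OpenAI-compatible embedding endpoint" means in practice, here's a small sketch of the request and response shape such a server expects. It says nothing about how Mnemosyne itself is configured. The helper names are invented for illustration, and the base URL in the usage note is a placeholder for whatever server you run.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, texts, api_key=""):
    """Build (but do not send) a POST to <base_url>/embeddings."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/embeddings",
        data=body,
        headers=headers,
        method="POST",
    )


def parse_embedding_response(raw):
    """Return the vectors from an OpenAI-style response, in input order."""
    items = sorted(json.loads(raw)["data"], key=lambda d: d["index"])
    return [item["embedding"] for item in items]
```

Usage would look something like `req = build_embedding_request("http://localhost:8080/v1", "bge-small-en-v1.5", ["hello"])`, followed by `parse_embedding_response(urllib.request.urlopen(req).read())`. The exact model string depends on the server you point it at.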