honcho: what it is, what problem it solves & why it's gaining traction
honcho: what it is, what problem it solves & why it's gaining traction
What it solves
Honcho provides memory infrastructure for stateful AI agents, allowing them to maintain a persistent understanding of people, other agents, groups, and projects over time. It moves beyond simple chunk-matching (RAG) by extracting conclusions and reasoning about entities as they evolve.
How it works
Honcho operates as a FastAPI server (managed or self-hosted) that organizes data around "peers" (humans or AI). The system follows a specific loop:
- Store: Conversations, events, and documents are saved as messages within sessions.
- Reason: A background process (the deriver) asynchronously analyzes these messages to update peer representations.
- Query: Users can query these representations via a chat endpoint, search for specific information using hybrid search (BM25 + vector), or pull prompt-ready context.
- Inject: The resulting context or insights are injected into any LLM call or agent framework.
Who it’s for
- Developers building AI agents that require long-term memory and high retention.
- Teams creating multi-agent systems where agents need to understand the relationships and perspectives between different peers.
- Users of MCP-compatible clients (like Claude Code, Cursor, or Windsurf) who want persistent memory for their coding agents.
Highlights
- Reasoning-first memory: Extracts deductive and inductive conclusions rather than just retrieving text chunks.
- Peer-centric model: Tracks entities (users, agents, ideas) and how they change over time.
- Multi-peer perspective: Can model what one specific peer knows about another.
- Broad Integration: Supports MCP, Claude Code, OpenCode, OpenClaw, and Hermes.
- Flexible Deployment: Available as a managed service or self-hosted via Docker.
Sources
- undefinedplastic-labs/honcho