semble: what it is, what problem it solves & why it's gaining traction

What it solves

Semble provides fast, token-efficient code search for AI agents. Agents exploring a large codebase tend to burn tokens by reading entire files or by running imprecise grep-style searches. Semble instead lets an agent retrieve only the code snippets relevant to a natural-language query, which reduces token usage by up to 98% compared to the traditional grep-and-read approach.

How it works

Semble uses a hybrid retrieval system that runs entirely on CPU without requiring GPUs or API keys. It processes codebases by splitting files into code-aware chunks using tree-sitter and then employs two complementary search methods:

Semantic Search: Uses static Model2Vec embeddings (via the potion-code-16M model) for semantic similarity.
Lexical Search: Uses BM25 for exact matches on identifiers and API names.

These results are fused using Reciprocal Rank Fusion (RRF) and then refined by a set of code-aware ranking signals, such as boosting definitions over references and penalizing noise (e.g., test files or legacy shims).
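
The fusion step described above can be sketched in a few lines of Python. This is a minimal illustration, not Semble's actual code: the constant k=60 is the value commonly used for RRF and is an assumption here, and the function names, the boost and penalty multipliers, and the "tests/" prefix rule are all hypothetical stand-ins for the code-aware ranking signals.

```python
from collections import defaultdict


def rrf_scores(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per item.

    k=60 is the commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return dict(scores)


def apply_signals(scores, definitions, noise_prefixes=("tests/",),
                  boost=1.5, penalty=0.5):
    """Hypothetical re-ranking: boost definitions, penalize noisy paths."""
    adjusted = {}
    for chunk_id, score in scores.items():
        if chunk_id in definitions:
            score *= boost
        if chunk_id.startswith(noise_prefixes):
            score *= penalty
        adjusted[chunk_id] = score
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(adjusted, key=lambda c: (-adjusted[c], c))


# Made-up chunk IDs standing in for the two retrievers' ranked output.
semantic = ["auth.py:login", "auth.py:logout", "tests/test_auth.py:test_login"]
lexical = ["auth.py:login", "tests/test_auth.py:test_login", "session.py:Session"]

scores = rrf_scores([semantic, lexical])
fused_only = sorted(scores, key=lambda c: (-scores[c], c))
ranked = apply_signals(scores, definitions={"auth.py:login", "session.py:Session"})
```

In this toy input the test chunk appears in both lists, so plain RRF places it second. Once the hypothetical penalty and definition boost are applied, it falls to last place and the Session definition moves up.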

Who it’s for

Semble is aimed at developers who build or use AI coding agents (such as Claude Code, Cursor, Codex, or OpenCode) and want to give those agents instant, low-cost access to any local or remote repository.

Highlights

Extreme Speed: Indexes repositories in ~250ms and answers queries in ~1.5ms on CPU.
Token Efficiency: Returns only relevant chunks, significantly reducing the context window load for agents.
Zero Setup: No API keys, GPUs, or external services required.
MCP Server Support: Integrates as a Model Context Protocol (MCP) server, allowing agents to call search as a native tool.
Flexible Indexing: Supports local paths and git URLs, with automatic cache invalidation based on file changes.
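
One way to picture cache invalidation based on file changes is a fingerprint over file metadata. The sketch below is an assumption about how such a check could work, not a description of Semble's internals: the functions repo_fingerprint and needs_reindex are hypothetical, and hashing path, size, and modification time is just one plausible change signal.

```python
import hashlib
import tempfile
from pathlib import Path


def repo_fingerprint(root):
    """Hash each file's relative path, size, and mtime into a single digest."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            stat = path.stat()
            entry = f"{path.relative_to(root)}|{stat.st_size}|{stat.st_mtime_ns}"
            digest.update(entry.encode("utf-8"))
    return digest.hexdigest()


def needs_reindex(root, cached_fingerprint):
    """True when the tree no longer matches the fingerprint stored with the index."""
    return repo_fingerprint(root) != cached_fingerprint


# Demo on a throwaway directory: editing a file changes the fingerprint.
with tempfile.TemporaryDirectory() as repo:
    source = Path(repo) / "main.py"
    source.write_text("print('hello')\n")
    cached = repo_fingerprint(repo)
    unchanged = needs_reindex(repo, cached)

    source.write_text("print('hello, world')\n")  # size differs, so the digest differs
    changed = needs_reindex(repo, cached)
```

Metadata hashing avoids reading file contents, which keeps the check cheap. A stricter variant could hash the contents themselves at the cost of more I/O.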
