code-review-graph: what it is, what problem it solves & why it's gaining traction

What it solves

AI coding assistants often waste tokens and lose precision by re-reading large portions of a codebase during review tasks. code-review-graph reduces this waste by giving AI tools a precise, structural map of the code, so the assistant reads only the files and functions that are relevant to a specific change.

How it works

The tool uses Tree-sitter to parse each source file in a repository into an abstract syntax tree (AST). From those trees it extracts a graph of nodes (functions, classes, imports) and edges (calls, inheritance, test coverage), which it stores in a SQLite database.
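To make the nodes-and-edges idea concrete, here is a minimal sketch of such a graph in SQLite. It is not the project's actual schema or parser: the table layout, the `index_source` and `build_demo` helpers, and the demo source are all hypothetical, and Python's built-in `ast` module stands in for Tree-sitter so the example runs with the standard library alone.

```python
import ast
import sqlite3

# Hypothetical schema: one table for code entities, one for relationships.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,      -- 'function' or 'class'
    name TEXT NOT NULL,
    file TEXT NOT NULL,
    UNIQUE (kind, name, file)
);
CREATE TABLE IF NOT EXISTS edges (
    src  TEXT NOT NULL,      -- name of the calling function
    dst  TEXT NOT NULL,      -- name of the called function
    kind TEXT NOT NULL,      -- only 'calls' in this sketch
    UNIQUE (src, dst, kind)
);
"""

DEMO_SOURCE = '''
def load():
    return 1

def process():
    return load() + 1

class Report:
    pass
'''


def index_source(conn, file, source):
    """Parse one file and record its definitions and direct calls."""
    tree = ast.parse(source)  # stand-in for a Tree-sitter parse
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            conn.execute(
                "INSERT OR IGNORE INTO nodes (kind, name, file) VALUES (?, ?, ?)",
                (kind, node.name, file),
            )
        if isinstance(node, ast.FunctionDef):
            # Record calls of the form name(...) made inside this function.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    conn.execute(
                        "INSERT OR IGNORE INTO edges (src, dst, kind) "
                        "VALUES (?, ?, 'calls')",
                        (node.name, inner.func.id),
                    )
    conn.commit()


def build_demo():
    """Build an in-memory graph from the demo source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    index_source(conn, "demo.py", DEMO_SOURCE)
    return conn


if __name__ == "__main__":
    demo = build_demo()
    for row in demo.execute("SELECT src, dst, kind FROM edges"):
        print(row)
```

Keeping the graph in ordinary relational tables means any later question ("who calls this function?") becomes a SQL query rather than a re-read of the source files.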

Key mechanisms include:

  • Blast-radius analysis: When a file changes, the graph traces all callers, dependents, and tests affected by that change, creating a "minimal review set" for the AI.
  • Incremental updates: The system diffs changed files and re-parses only what is necessary, so that re-indexing a large project after a change reportedly takes under 2 seconds.
  • MCP integration: It uses the Model Context Protocol (MCP) to serve this graph data to AI assistants (like Cursor, Claude Code, or Zed), allowing them to query the graph instead of scanning the entire codebase.
  • CI integration: A GitHub Action can build the graph on a CI runner to post risk-scored PR reviews and identify test gaps without sending source code to external services.
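The first two mechanisms can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's implementation: `blast_radius` and `files_to_reparse` are hypothetical names, the edge list is invented, and detecting changes by comparing SHA-256 content hashes is an assumed strategy (the project may diff files differently).

```python
import hashlib
from collections import deque


def blast_radius(edges, changed):
    """Return every node that transitively depends on a changed node.

    edges: iterable of (src, dst) pairs meaning "src depends on dst"
           (a call, an import, or a test covering dst).
    changed: iterable of node names that were modified.
    """
    # Invert the edges so we can walk from a node to its dependents.
    dependents = {}
    for src, dst in edges:
        dependents.setdefault(dst, set()).add(src)

    seen = set(changed)
    queue = deque(seen)
    while queue:  # breadth-first walk over reverse edges
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def files_to_reparse(stored_hashes, current_files):
    """Return paths whose content hash differs from the stored one.

    stored_hashes: {path: sha256 hex digest} from the previous index run.
    current_files: {path: file content as str} for the working tree.
    """
    changed = []
    for path, content in current_files.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if stored_hashes.get(path) != digest:
            changed.append(path)
    return sorted(changed)


if __name__ == "__main__":
    demo_edges = [
        ("api.handler", "core.parse"),
        ("cli.main", "api.handler"),
        ("test_parse", "core.parse"),
        ("utils.fmt", "utils.pad"),  # unrelated to core.parse
    ]
    print(sorted(blast_radius(demo_edges, ["core.parse"])))
```

The review set is the changed node plus everything reachable over reverse edges, so unrelated code such as `utils.fmt` in the demo never enters the assistant's context.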

Who it’s for

  • Developers using AI coding tools who want to reduce token costs and improve review accuracy.
  • Maintainers of large monorepos where full-context windows are insufficient or too expensive.
  • DevOps/CI engineers looking to automate risk-scored pull request reviews.

Highlights

  • Massive Token Reduction: Reported benchmarks show a median per-question token reduction of about 82x.
  • Broad Language Support: Covers Python, JS/TS, Go, Rust, Java, and C++, among others, as well as Jupyter and Databricks notebooks.
  • Extensible: New languages can be added through a .toml configuration file, without forking the project.
  • Local-First: The knowledge graph is stored locally in SQLite, so the graph data stays on the developer's machine.
  • Advanced Analysis: Includes community detection (Leiden algorithm), hub/bridge detection, and execution flow tracing.
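As a rough intuition for hub detection, one simple approach is to rank nodes by how many other nodes depend on them. The sketch below assumes that in-degree ranking; `top_hubs` and the edge list are hypothetical, and the project's own hub, bridge, and Leiden-based community analyses may work quite differently.

```python
from collections import Counter


def top_hubs(edges, n=3):
    """Rank nodes by in-degree: how many edges point at them.

    edges: iterable of (src, dst) pairs meaning "src depends on dst".
    Returns up to n (node, in_degree) pairs, highest first.
    """
    in_degree = Counter(dst for _src, dst in edges)
    return in_degree.most_common(n)


if __name__ == "__main__":
    demo_edges = [
        ("a", "log"), ("b", "log"), ("c", "log"),
        ("a", "db"), ("b", "db"),
        ("c", "util"),
    ]
    print(top_hubs(demo_edges, 2))
```

A node with a high in-degree is a natural candidate for extra scrutiny, since a change to it reaches many callers.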
