code-graph-rag: what it is, what problem it solves & why it's gaining traction

What it solves

Code-Graph-RAG lets developers analyze and query large, multi-language codebases in natural language. It targets a common pain point: structural relationships in a project, such as function calls and class hierarchies, are hard to trace by hand or with plain text search.

How it works

The system parses source files into ASTs (Abstract Syntax Trees) with Tree-sitter, which gives it one parsing layer across multiple languages. From those trees it builds a knowledge graph of the codebase structure and stores it in Memgraph. Users ask questions in plain English, and an AI model (from providers such as Google Gemini or OpenAI, or a local model via Ollama) translates each question into a query in Cypher, the graph query language used to query Memgraph. The system can also retrieve specific source code snippets and perform surgical code replacements based on the AST.
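The parse-then-graph idea can be illustrated with a minimal sketch. It is not the project's implementation: it swaps in Python's standard-library ast module for Tree-sitter and a plain dictionary for Memgraph, and the names build_call_graph, callers_of, and the Function/CALLS labels in the sample Cypher string are assumptions made for illustration only.

```python
import ast

# Sample source to analyze; stands in for files in a real codebase.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain names it calls directly.

    Method calls such as text.split() are ignored to keep the sketch small.
    """
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


def callers_of(graph: dict[str, set[str]], target: str) -> list[str]:
    """Answer 'who calls target?' by scanning the edges of the graph."""
    return sorted(name for name, callees in graph.items() if target in callees)


graph = build_call_graph(SOURCE)

# In a graph database, the same question might be phrased in Cypher roughly
# like this. The Function label and CALLS relationship are hypothetical and
# not taken from the project's actual schema.
HYPOTHETICAL_CYPHER = (
    "MATCH (caller:Function)-[:CALLS]->(:Function {name: 'load'}) "
    "RETURN caller.name"
)

print(callers_of(graph, "load"))
```

Storing the edges in a graph database rather than a dictionary is what makes multi-hop questions (for example, everything reachable from one entry point) practical to express as a single query.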

Who it’s for

This tool is designed for developers who need to navigate, understand, and optimize large existing codebases, especially those spanning multiple languages.

Highlights

  • Multi-Language Support: Full support for C, C++, Java, JavaScript, TypeScript, Python, Rust, PHP, and Lua, with Go and Scala in development.
  • C-Family Support: Deep analysis of C++ templates, operator overloading, and C preprocessor includes.
  • AI-Powered Querying: Natural language to Cypher translation for querying codebase structure.
  • Surgical Editing: AST-based function targeting for precise code modifications.
  • Real-time Updates: A watcher that automatically synchronizes the knowledge graph as code changes.
  • Flexible AI Backend: Supports cloud (Gemini, OpenAI) and local (Ollama) LLMs.
  • Shell Integration: Ability to execute terminal commands for testing or CLI tool usage.
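
The surgical editing highlight can be sketched in the same spirit. This is an illustration under stated assumptions, not the tool's own code: it uses Python's standard-library ast module instead of Tree-sitter, handles only top-level functions without decorators, and replace_function is a hypothetical helper name.

```python
import ast


def replace_function(source: str, name: str, new_code: str) -> str:
    """Replace one top-level function, located via the AST, and keep the rest.

    The function's exact line span comes from the parsed tree, so text that
    merely mentions the same name elsewhere in the file is never touched.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            lines = source.splitlines()
            start = node.lineno - 1   # lineno is 1-based
            end = node.end_lineno     # inclusive 1-based, so usable as a slice end
            new_lines = new_code.rstrip("\n").splitlines()
            return "\n".join(lines[:start] + new_lines + lines[end:]) + "\n"
    raise KeyError(f"no top-level function named {name!r}")


original = "def a():\n    return 1\n\ndef b():\n    return 2\n"
updated = replace_function(original, "a", "def a():\n    return 10\n")
print(updated)
```

Targeting by syntax node rather than by search-and-replace is the point of the technique: the edit is bounded by the function's start and end lines, so neighbouring code stays byte-for-byte the same.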
