How to Turn Your Repo into a Graph

Create a knowledge graph from your codebase

Difficulty: Beginner

Overview

Cognee offers a simple way to build a code graph from your repositories. Once generated, this graph makes it easier to navigate and query your code.


Below, you’ll learn how to install cognee, analyze a codebase using a code graph pipeline, and run a code-based search query in just a few easy steps.

Here is a quick-start guide.

Step 1: Install cognee
pip install 'cognee[codegraph]'

We begin by installing cognee with the codegraph extra, which pulls in all the dependencies for generating and analyzing code graphs. (The quotes keep shells like zsh from expanding the brackets.)
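A quick way to confirm the install before moving on is to try the imports the rest of this guide relies on (nothing here beyond cognee itself):

# Both imports should succeed if cognee[codegraph] was installed correctly.
import cognee
from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline

print("cognee is ready")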

Step 2: Set Environment Variables
import os
os.environ["LLM_API_KEY"] = "sk-"  # Replace with your actual API key

Remember to replace “sk-” with your actual key before running the notebook. If you don’t have a key yet, you can create one from your OpenAI account settings.

If you want to use another provider (Mistral, for example), set that provider’s environment variables instead.
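For illustration only, a Mistral-style setup might look like the sketch below; the exact variable names (LLM_PROVIDER, LLM_MODEL) are assumptions here, so check cognee’s provider configuration docs for your version:

import os

# Hypothetical provider setup; confirm the variable names against cognee's configuration docs.
os.environ["LLM_PROVIDER"] = "mistral"
os.environ["LLM_MODEL"] = "mistral-large-latest"
os.environ["LLM_API_KEY"] = "your-mistral-api-key"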

Step 3: Clone a Repo to Analyze
git clone https://github.com/hande-k/simple-repo.git # Replace it with your preferred repo
repo_path = "/{path-to-your-repo}/simple-repo" 

Cognee will use this sample repo to build a code graph. Run the clone command in your terminal (or prefix it with ! in a notebook cell), and adjust repo_path if you clone to a different location.
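Before moving on, it can save a failed run to confirm that repo_path really points at the clone; this small check uses only the standard library:

from pathlib import Path

# Fail early with a clear message if repo_path does not point at the cloned repo.
assert Path(repo_path).is_dir(), f"Repo not found at {repo_path}; adjust repo_path first."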

Step 4: Create & Run the Code Graph Pipeline
import cognee
from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline
 
async def codify(repo_path: str):
    print("\nStarting code graph pipeline...")
    # The second argument (include_docs) toggles processing of non-code docs; False keeps this run code-only.
    async for result in run_code_graph_pipeline(repo_path, False):
        print(result)
    print("\nPipeline completed!")
await codify(repo_path)

Running this pipeline analyzes the code in the repo and constructs an internal graph representation for quick navigation and searching.
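The top-level await above works in a notebook. If you run the same code as a plain Python script, there is no event loop yet, so wrap the call with asyncio.run instead:

import asyncio

# Equivalent entry point for a regular .py script (notebooks can use the top-level await above).
asyncio.run(codify(repo_path))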

Step 5: Set Up Summarization Prompt
  with open("summarize_search_results.txt", "w") as f:
  f.write(
      "You are a helpful assistant that understands the given user query "
      "and the results returned based on the query. Provide a concise, "
      "short, to-the-point user-friendly explanation based on these."
  )
 

We create a text file containing a system prompt. The language model will use this prompt to summarize search results.
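If you want to double-check exactly what the model will be given, read the file back before moving on:

# Print the prompt back to verify it was written as expected.
with open("summarize_search_results.txt") as f:
    print(f.read())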

Step 6: A Helper Function for Search & Summary
from cognee.modules.search.types import SearchType
from cognee.infrastructure.llm.prompts import read_query_prompt
from cognee.infrastructure.llm.get_llm_client import get_llm_client

async def retrieve_and_generate_answer(query: str) -> str:
    # Query the code graph for relevant code snippets and references.
    search_results = await cognee.search(query_type=SearchType.CODE, query_text=query)

    # Absolute path to the prompt file written in Step 5 ("/content" is the working
    # directory on Google Colab; adjust if you are running elsewhere).
    prompt_path = "/content/summarize_search_results.txt"
    system_prompt = read_query_prompt(prompt_path)
    llm_client = get_llm_client()

    # Ask the LLM to turn the raw search results into a short, user-friendly answer.
    answer = await llm_client.acreate_structured_output(
        text_input=(
            f"Search Results:\n{str(search_results)}\n\n"
            f"User Query:\n{query}\n"
        ),
        system_prompt=system_prompt,
        response_model=str,
    )
    return answer

With this function, cognee retrieves code-based search results for a user query, and the language model converts them into a concise explanation.
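Because the helper is async, you can also fan several questions out concurrently with asyncio.gather; here is a small sketch that reuses the function above (the example queries are placeholders):

import asyncio

# Run multiple queries against the code graph concurrently and print each answer.
queries = ["What does this repo do?", "Where is the main entry point defined?"]
answers = await asyncio.gather(*(retrieve_and_generate_answer(q) for q in queries))
for q, a in zip(queries, answers):
    print(f"Q: {q}\nA: {a}\n")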

Step 7: Run a Sample Query
user_query = "add_your_query" # Replace it with your query
answer = await retrieve_and_generate_answer(user_query)
print("===== ANSWER =====")
print(answer)

Cognee uses its code graph to find relevant references, and the language model produces a short, user-friendly answer.
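If you want to see what the graph search returns before summarization, call cognee.search directly with the same arguments the helper uses:

# Raw, unsummarized results straight from the code graph search.
raw_results = await cognee.search(query_type=SearchType.CODE, query_text=user_query)
print(raw_results)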

We hope this quick walkthrough helps you get started with cognee’s code graph and search capabilities. Experiment with different codebases and queries to see the full power of cognee in action!

Join the Conversation!

Have questions or need more help? Join our community to connect with professionals, share insights, and get your questions answered!